Closed: bboilot-ledger closed this issue 2 months ago
Also, it would make sense to me to be able to store more than one `ResultsDict` inside a file report (for example, for all the tests corresponding to AES GCM with different key lengths). What do you think about adding the possibility of appending to the file, or directly switching to append mode?
Hi @bboilot-ledger,
The expected behaviour is indeed the documented one, so this is a bug. Your fix looks good, so I can commit it with a corresponding test to confirm it works, and add you as a commit co-author if that's alright.
Regarding your second comment, I'm guessing the intended usage is repeated calls to `process_results` in-between tests? Like:
```python
# `enc` and `dec` are the encrypt/decrypt implementations under test.
res = AES.test(enc, dec, AES.Mode.GCM, AES.KeyLength.AES128)
console.process_results(res, "res.txt")

res |= AES.test(enc, dec, AES.Mode.GCM, AES.KeyLength.AES192)
console.process_results(res, "res.txt")
```
If so, my first answer is that you can use `AES.KeyLength.ALL` to test all three key lengths at once. The resulting `ResultsDict` should contain several `Results` as values. This is the recommended approach, since there is no file appending.
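For example (a minimal sketch, with `enc` and `dec` standing in for your implementations as in the snippet above):

```python
# One call covers AES-128, AES-192 and AES-256; `enc` and `dec` are the
# implementations under test, as in the previous snippet.
res = AES.test(enc, dec, AES.Mode.GCM, AES.KeyLength.ALL)
console.process_results(res, "res.txt")
```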
Otherwise, you should be able to test different primitives and parameters thanks to the `|=` operator, which combines two dictionaries. The caveat with this approach is that keys must be unique or results will be overwritten (and I see that this already happens with Wycheproof results for AES, so I'll open an issue). The thing is that the keys are set by the tests themselves, so there has been some work on the `ResultsDict.add` method to create unique keys.
The current approach uses information on the primitive and parameters selected to create the key, so it still does not allow adding tests for two different implementations with the exact same parameters to one `ResultsDict` without overwriting. So if that's what you were referring to, or some other use case for appending the results to the file, we can discuss it further. :)
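To illustrate the overwriting caveat, `|=` keeps the right-hand value when the same key appears on both sides, so a duplicate key silently replaces the earlier results. A minimal sketch with plain dicts standing in for `ResultsDict`:

```python
# Plain dicts, used only to show the merge semantics of |=.
first = {"wycheproof": "results from implementation A"}
second = {"wycheproof": "results from implementation B"}

first |= second
print(first)  # {'wycheproof': 'results from implementation B'}: A's entry is gone
```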
:wave: @JulioLoayzaM
Regarding the fix, it looks good to me :smiley:.
Yes, I was looking for a way to aggregate the results of different primitives for one implementation (not the same primitive for different implementations). For me, it would make sense when testing a cryptographic library to have one report that contains all the tested primitives, the global results (like the number of tests), and then detailed results for each primitive.
I did not see `AES.KeyLength.ALL`, cool feature! :+1:
However, if I try your more generic approach using the `|=` operator (which is exactly what I am looking for), I am not able to get the correct results:
If I use the `!=` operator, I get:
If I store the different primitives' `ResultsDict` in an array and run `console.process_results` on each element, I get:
It may be related to the problem you mentioned regarding AES GCM.
Also, I will have to check again, but I feel like the classic report is not verbose regarding failing tests. You would have to run the test again in debug mode to understand which test failed (and what its inputs were), and in debug mode you also get that information for every passing test. On the other hand, it may be too verbose if many tests are failing. So I don't actually know what to think about that...
Otherwise, the report format is super clear, I love the flag feature and the colorization to quickly see important information.
Hi @bboilot-ledger,
> Regarding the fix, it looks good to me :smiley:.
Fix pushed :) I'll push a new release on Monday.
> For me, it would make sense when testing a cryptographic library to have one report that contains all the tested primitives, the global results (like the number of tests), and then detailed results for each primitive.
Ahh I see, that makes sense indeed. For the Python API, combining results should work with `|=`, modulo the problem of key uniqueness. For the CLI, there currently isn't a way, but I've been testing a simpler interface that would use a single wrapper to test several primitives and/or options.
> However, if I try your more generic approach using the `|=` operator (which is exactly what I am looking for), I am not able to get the correct results:
That looks like a bug :P I've tried to reproduce with AES-ECB and SHA-256 but I do get the combined results. I used the following script:
```python
from hashlib import sha256

from Crypto.Cipher import AES as pyAES

from crypto_condor.primitives import AES, SHA
from crypto_condor.primitives.common import Console


def _sha256(data: bytes) -> bytes:
    return sha256(data).digest()


def _aes_ecb(key: bytes, plaintext: bytes) -> bytes:
    return pyAES.new(key, pyAES.MODE_ECB).encrypt(plaintext)


rd = AES.test(_aes_ecb, None, AES.Mode.ECB, AES.KeyLength.AES128)
print(rd.keys())

rd |= SHA.test(_sha256, SHA.Algorithm.SHA_256)
print(rd.keys())

console = Console()
console.process_results(rd)
```
And I obtain:
```
╭────────────────────── Results summary ──────────────────────╮
│ Primitives tested: AES, SHA                                  │
│ Valid tests:                                                 │
│   Passed: 718                                                │
│   Failed: 0                                                  │
╰─────────── crypto-condor 2024.08.03 by Quarkslab ───────────╯
```
I picked SHA-256 because it was the easiest, but the problem seems to come from the ECDSA module. To try to reproduce: it seems that you are testing signature verification with brainpoolP256r1, correct?
> Also, I will have to check again, but I feel like the classic report is not verbose regarding failing tests. [...] On the other hand, it may be too verbose if many tests are failing.
Yes, by default only the global and per-test results are displayed. The reason was indeed that if many tests are failing, adding the failed tests is very verbose. But that assumed you could easily and rapidly run the test again in debug mode, which is the case for test cases based on Python libraries, but may not be the case with real targets. I'll think about adding failed cases by default.
> Otherwise, the report format is super clear, I love the flag feature and the colorization to quickly see important information.
Glad to hear that :smile:
> Fix pushed :) I'll push a new release on Monday.
Awesome!
> To try to reproduce: it seems that you are testing signature verification with brainpoolP256r1, correct?
This is correct. I tried merging results of ECDSA verify using brainpoolP256r1/sha256 and AES ECB 128.
> Yes, by default only the global and per-test results are displayed. The reason was indeed that if many tests are failing, adding the failed tests is very verbose. But that assumed you could easily and rapidly run the test again in debug mode, which is the case for test cases based on Python libraries, but may not be the case with real targets. I'll think about adding failed cases by default.
Exactly, some tests can take hours when using embedded devices, for example. But I still need to think about it some more. If I remember correctly, in interactive mode you can choose to save the debug results after seeing the number of failed tests, so that seems to be the best approach currently.
New release 2024.08.19 pushed :) As it fixes the issue, I'll close it and answer the other points separately.
> This is correct. I tried merging results of ECDSA verify using brainpoolP256r1/sha256 and AES ECB 128.
I can't reproduce the issue with AES-ECB. I can see that the `ResultsDict` of `ECDSA.test_verify` uses `nist` and `wycheproof` as keys, and the latter would conflict with Wycheproof tests for e.g. AES-GCM, but not with AES-ECB, as there aren't Wycheproof vectors for ECB.
Running:
```python
from crypto_condor.primitives import AES, ECDSA
from crypto_condor.primitives.common import Console


def _verify(public_key: bytes, message: bytes, signature: bytes) -> bool:
    return ECDSA._verify(public_key, ECDSA.Hash.SHA_256, message, signature)


def _enc(key: bytes, plaintext: bytes) -> bytes:
    return AES._encrypt(AES.Mode.ECB, key, plaintext)


def _dec(key: bytes, ciphertext: bytes) -> bytes:
    return AES._decrypt(AES.Mode.ECB, key, ciphertext)


console = Console()

r1 = AES.test(_enc, _dec, AES.Mode.ECB, AES.KeyLength.AES128)
r2 = ECDSA.test_verify(
    _verify, ECDSA.Curve.BRAINPOOLP256R1, ECDSA.Hash.SHA_256, ECDSA.PubKeyEncoding.DER
)

print("\n", f"{r1.keys() = }", "\n\n", f"{r2.keys() = }")

r1 |= r2
console.process_results(r1, None)
```
results in:
```
r1.keys() = dict_keys(['NIST/encrypt/ECB 128 VarTxt', 'NIST/encrypt/ECB 128 GFSBox', 'NIST/encrypt/ECB 128 KeySbox', 'NIST/encrypt/ECB 128 VarKey', 'NIST/encrypt/ECB 128 MMT', 'NIST/decrypt/ECB 128 VarTxt', 'NIST/decrypt/ECB 128 GFSBox', 'NIST/decrypt/ECB 128 KeySbox', 'NIST/decrypt/ECB 128 VarKey', 'NIST/decrypt/ECB 128 MMT'])
r2.keys() = dict_keys(['wycheproof'])

╭─────────────────────────────────── Types of tests ────────────────────────────────────╮
│ Valid tests     : valid inputs that the implementation should use correctly.          │
│ Invalid tests   : invalid inputs that the implementation should reject.               │
│ Acceptable tests: inputs for legacy cases or weak parameters.                         │
╰────────────────────────────────────────────────────────────────────────────────────────╯
╭─────────────────────────────────── Results summary ───────────────────────────────────╮
│ Primitives tested: AES, ECDSA                                                          │
│ Valid tests:                                                                           │
│   Passed: 1326                                                                         │
│   Failed: 0                                                                            │
│ Invalid tests:                                                                         │
│   Passed: 239                                                                          │
│   Failed: 0                                                                            │
╰─────────────────────── crypto-condor 2024.08.19 by Quarkslab ─────────────────────────╯
```
In any case, it is clear that the keys used to store Results have to be changed to ensure uniqueness. I'll use #8 to track that problem.
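Purely as an illustration of the direction (a hypothetical `make_key` helper, not the library's actual scheme), namespacing the keys with the primitive and operation would avoid two modules clashing on a bare `wycheproof` key:

```python
# Hypothetical key builder, for illustration only.
def make_key(primitive: str, operation: str, source: str) -> str:
    """Build a namespaced key such as 'ECDSA/verify/wycheproof'."""
    return f"{primitive}/{operation}/{source}"


print(make_key("ECDSA", "verify", "wycheproof"))     # ECDSA/verify/wycheproof
print(make_key("AES-GCM", "decrypt", "wycheproof"))  # AES-GCM/decrypt/wycheproof
```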
To conclude (I had to leave mid-comment :p)
> Exactly, some tests can take hours when using embedded devices, for example. But I still need to think about it some more. If I remember correctly, in interactive mode you can choose to save the debug results after seeing the number of failed tests, so that seems to be the best approach currently.
Yep, with the CLI you get asked whether to add the debug data to the report, either when using `--save-to` or after being asked whether to save the results to a file.[^1] The difference with using increased verbosity is that the traceback is printed to the terminal, but is not saved to the report. This could be an improvement to pinpoint issues, at the cost of a bigger report; I'll keep it in mind.
Otherwise, with the Python API you can call `process_results` with or without the debug data, as long as the `ResultsDict` instance is around :)
[^1]: I am now realising that there is no CLI option to save a report with debug data, though you do get prompted when using `--save-to`. Tracking this in #10.
Hi @JulioLoayzaM :wave:
I am encountering an issue while trying to use `Console().process_results` with `filename=None`.

**How to reproduce**

Call `Console().process_results` with `filename=None`.

**Expected behavior**

As the documentation says, the user should not be prompted to save the results to a file.

**Current behavior**

The user is still prompted to save the results to a file.
**Possible cause**

In `primitives/common.py:776`, the condition `bool(filename) or Confirm.ask("Save the results to a file?", default=False)` has the same behavior if `filename == ""` or if `filename == None` (both return `False` when cast to bool). If I replace the line with something like `if filename is not None and (bool(filename) or Confirm.ask("Save the results to a file?", default=False)):`, I get the expected behavior. Since I am unsure what the expected behavior is, or the best way to fix it, I did not open a PR. But I can if you want me to :)
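For concreteness, here is a minimal standalone sketch of the truthiness issue (the `should_prompt_*` helpers are hypothetical and only paraphrase the condition quoted above, not the actual code in `common.py`):

```python
# Both None and "" are falsy, so the current check cannot tell
# "explicitly no file" apart from an empty filename.
def should_prompt_current(filename) -> bool:
    # Mirrors `bool(filename) or Confirm.ask(...)`: prompts whenever filename is falsy.
    return not bool(filename)


def should_prompt_proposed(filename) -> bool:
    # Mirrors the proposed `filename is not None and (...)`: never prompts for None.
    return filename is not None and not bool(filename)


for value in (None, "", "res.txt"):
    print(repr(value), should_prompt_current(value), should_prompt_proposed(value))
# Only the proposed check skips the prompt when filename is None.
```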