Open danielmarschall opened 6 months ago
Sorry about those. I know about them. This is an area for which I need to find the time and patience/motivation. And yes, I think I should handle this a bit better so that not so many issues in that area are in the public repository. For the time being, the best you can do is to disable these Keccak tests. The algorithm as such passes the relevant test cases fine; only the Unicode convenience implementation doesn't pass the tests, and that is most likely a failure in the test implementation.
I want to help you as much as I can with improving DEC. I am very happy that DEC has been maintained over all these years so that it works with the latest Delphi versions.
Feel free to give a specific task to me! I will look at the Keccak issues today.
Thanks for offering help! I really appreciate that! If you like we can switch to using e-mail for conversation, in case that's more convenient. One of my mail addresses is the one listed as main contact in notice.txt.
About this Keccak issue: Keccak is the original version of SHA3. When creating the SHA3 standard, NIST added some bits to the unhashed message to identify the variant of the algorithm (SHA3 vs. SHAKE etc.). Some libraries had implemented Keccak, didn't notice the difference to SHA3, and named their implementation SHA3. That's why Keccak got added to DEC after SHA3. The problem now is the Unicode string based test cases. My idea for investigating how to fix them: create a file with a few of the SHA3 test cases for research (I want the Keccak tests to use the unmodified SHA3 test data) and use two IDE sessions. One session runs the Unicode string based test, the other runs one of the working tests, and then one compares where the deviation happens. That is the starting point for fixing the Unicode string test code. Does this sound plausible? Would that be a task? (Of course I can provide a more detailed description if necessary.)
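To make the "added bits" concrete, here is a minimal Python sketch, independent of DEC. It assumes the FIPS 202 convention that the two bits 0, 1 are appended to the message and that bits are stored least significant bit first within each byte. The helper name `append_sha3_suffix` is hypothetical and does not correspond to any DEC routine; DEC's own test helpers may represent the suffix differently.

```python
import hashlib


def append_sha3_suffix(msg: bytes, bit_len: int) -> tuple:
    """Append the SHA-3 domain-separation bits (0, 1) to a bit string.

    msg stores its bits LSB-first within each byte; bit_len is the number
    of valid bits. Returns the extended message and its new bit length.
    (Hypothetical helper for illustration only.)
    """
    out = bytearray(msg[: (bit_len + 7) // 8])
    if bit_len % 8:
        # Clear any unused high bits of the partial last byte.
        out[-1] &= (1 << (bit_len % 8)) - 1
    for bit in (0, 1):
        if bit_len % 8 == 0:
            out.append(0)  # start a new byte when the previous one is full
        out[-1] |= bit << (bit_len % 8)
        bit_len += 1
    return bytes(out), bit_len


# The empty message plus the suffix is the 2-bit string "01", stored as 0x02.
suffixed, length = append_sha3_suffix(b"", 0)
print(suffixed.hex(), length)

# Reference digest of the empty message from Python's built-in SHA3-224.
print(hashlib.sha3_224(b"").hexdigest())
```

Under these assumptions, feeding the 2-bit input 0x02 into plain Keccak should give the same digest as SHA3 of the empty message. Python's `hashlib` offers no raw Keccak, so only the SHA3 side is printed here.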
Currently, there are a few things that I am confused about. I hope I don't annoy you with my many questions :-)
(1) First, I am a bit confused why you think it is a Unicode problem. Inside all TestTHash_Keccak_* test cases, all TestCalc... test cases are failing: Stream, RawByteString, Buffer, Bytes, ... A test case working with TBytes should have nothing to do with Unicode? I rather think that the hashing is wrong, or the test vector is wrong.
(2) I didn't quite understand what you said about SHA3 and Keccak being different. At least in the test cases TestTHash_Keccak_256.SetUp and TestTHash_SHA3_256.SetUp, they got the same test vector SHA3-256_Msg5.pdf, which is H(0b11001) = 0x7b0047... . If SHA3(0b11001) = 0x7b0047... and Keccak(0b11001) = 0x7b0047... are both expected to be true, aren't they the same hash algorithm then? Are you sure the test vectors are supposed to be the same?
(3) My next confusion is how the test cases are written; maybe you can help me understand.
// Source https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Standards-
// and-Guidelines/documents/examples/SHA3-256_Msg5.pdf
lDataRow := FTestData.AddRow;
lDataRow.ExpectedOutput := '7b0047cf5a456882363cbf0fb05322cf65f4b705' +
'9a46365e830132e3b5d957af';
AddLastByteForCodeTest(lDataRow, #$13, 5);
The PDF says that the test message 0b11001 has the hash 7b0047..., OK. The last byte of the message contains 5 bits, OK. But why is there #$13 and not #$19 (0b00011001) or #$C8 (0b11001000)?
And in AddLastByteForKeccakTest, the #$13 becomes #$13#2, which is even more confusing. I expected the input vector to be 0b11001 ... I tried to force the debugger to set MsgWithFixup to #$19, but it didn't work.
If I remove the first vector, then I see that the second vector also fails:
// Source https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Standards-
// and-Guidelines/documents/examples/SHA3-256_Msg30.pdf
// TODO:
// expected: <c8242fef409e5ae9d1f1c857ae4dc624b92b19809f62aa8c07411c54a078b1d0>
// but was: <ce8c2109c14e5416785a205f34316b50fa11993fac9c7236c643cb5e7b00afbd>
lDataRow := FTestData.AddRow;
lDataRow.ExpectedOutput := 'c8242fef409e5ae9d1f1c857ae4dc624b92b1980' +
'9f62aa8c07411c54a078b1d0';
AddLastByteForCodeTest(lDataRow, #$53#$58#$7B#$19, 30 mod 8);
Here, the test vector 0b 1100 1010 0001 1010 1101 1110 1001 10 should have the hash c824..., OK. The last byte has 6 bits (30 mod 8), OK. But here, too, I don't understand why #$53#$58#$7B#$19 is written. The test vector 0b 1100 1010 0001 1010 1101 1110 1001 10 is 0xCA1ADE9_, not 53587B19??
(4) For the Keccak224 test case, I see that the SHA3 test vector files SHA3_224*.rsp are commented out and replaced by Keccak.rsp. Did you accidentally change this during development to test something and forget to change it back, or do you intentionally want a different test vector in there, which you forgot to upload?
//Source https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Algorithm-
// Validation-Program/documents/sha3/sha-3bittestvectors.zip
FTestFileNames.Add('..\..\Unit Tests\Data\Keccak.rsp');
// FTestFileNames.Add('..\..\Unit Tests\Data\SHA3_224ShortMsg.rsp');
// FTestFileNames.Add('..\..\Unit Tests\Data\SHA3_224LongMsg.rsp');
// SourceEnd
Thank you very much for your help!
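Regarding question (3), one possible explanation is bit ordering. The sketch below assumes that the test data follows the convention of the NIST SHA-3 example files, where the first bit of the message is stored in the least significant bit of the first byte. The function name `bits_to_bytes_lsb_first` is hypothetical and not part of DEC.

```python
def bits_to_bytes_lsb_first(bits: str) -> bytes:
    """Pack a bit string (first message bit first) into bytes, LSB-first.

    Spaces are ignored. Message bit i lands in bit (i mod 8) of byte
    (i div 8); unused high bits of the last byte stay zero.
    """
    bits = bits.replace(" ", "")
    out = bytearray((len(bits) + 7) // 8)
    for i, b in enumerate(bits):
        if b == "1":
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


# 5-bit message from SHA3-256_Msg5.pdf
print(bits_to_bytes_lsb_first("11001").hex())  # 13

# 30-bit message from SHA3-256_Msg30.pdf
print(bits_to_bytes_lsb_first("1100 1010 0001 1010 1101 1110 1001 10").hex())  # 53587b19
```

Under this assumption, the values #$13 and #$53#$58#$7B#$19 in the test code are the same bit strings as in the PDFs, just packed with the bits of each byte in reverse order compared to the usual way of reading a binary literal.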
Regarding e-mail communication: I think we can do that for sure. However, for sharing screenshots and formatted source code snippets, GitHub has better visualization options. E-mail would just make it more comfortable to keep talking in German :-)
OK, we can switch to e-mail if you send me one, and then we can of course switch to German ;-) Until then, a few remarks:
I found this issue!
In TestTHash_SHA3_Base.AddLastByteForCodeTest, the uninitialized (and unnecessary?) variable LastByteLen is used instead of the argument LastByteLength. => Fixed in 2e8d2ec58e1fd94362ded2afb216086f34f271f8
Now, all Keccak test cases except TestCalcUnicodeString work. See reply below for the analysis.
Analysis for TestCalcUnicodeString:
TestTHash_Keccak_Base.AddLastByteForKeccakTest is called in TestTHash_SHA3_Base.LoadTestDataFile in order to add the Keccak padding, which makes the test cases compatible with the official SHA-3 vectors. The TODO entry "Problem: here the method from the base class is called instead the overwritten one from Keccack..." is false. Since the method is virtual, the correct method in TestTHash_Keccak_Base is called.
'' has been changed to #$2.
CalcBytes works, with Keccak(#$2) = SHA3('') = 6B4E03423667DBB73B6E15454F0EB1ABD4597F9A1B078E3F5B5A6BC7.
CalcString(string): here we have a big problem! The WideString overload converts the string into a multiple of 2 bytes. For example, #$2 becomes #$2#$0. But this destroys our test case. We WANT #$2 in order to re-use the SHA-3 test vector, but CalcString does not allow this.
TestCalcUnicodeString is IMPOSSIBLE if we want to keep the SHA3 test vectors.
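The two-byte expansion described above can be reproduced outside Delphi. This is a minimal Python sketch, assuming that a Delphi UnicodeString is stored as UTF-16 little-endian and that the string overload hashes those raw bytes; the helper name `utf16le_bytes` is hypothetical.

```python
def utf16le_bytes(text: str) -> bytes:
    """Return the UTF-16 little-endian byte image of a string.

    Assumed here to match the in-memory layout of a Delphi UnicodeString.
    """
    return text.encode("utf-16-le")


# The single character #$2 occupies two bytes, so the hash input changes.
print(utf16le_bytes("\x02").hex())  # 0200

# Every string yields an even number of bytes, so a one-byte input such as
# 0x02 cannot be expressed through a UTF-16 string at all.
for sample in ["", "\x02", "\x13", "abc"]:
    assert len(utf16le_bytes(sample)) % 2 == 0
```

Under these assumptions, any test vector whose byte length is odd, or whose padding byte must stand alone, cannot be fed through a UTF-16 string overload unchanged.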
(as previously mentioned in https://github.com/MHumm/DelphiEncryptionCompendium/issues/62 )
I am using Delphi 11. I compiled DEC60.exe to build the DCU files and then immediately compiled and ran DECUnitTestSuite.exe.
There are the following problems "out of the box":
The vector is from gcmEncryptExtIV128.rsp
I guess this is the same as https://github.com/MHumm/DelphiEncryptionCompendium/issues/52#issuecomment-1777647160