Testcases failure - Githubissues

danielmarschall commented 2 months ago

(as previously mentioned in https://github.com/MHumm/DelphiEncryptionCompendium/issues/62 )

I am using Delphi 11, have compiled DEC60.exe to compile the DCU files and then immediately compiled and ran DECUnitTestSuite.exe

There are the following problems "out of the box":

Data\Keccak.rsp is missing. Did you forget to include this test file in GitHub? Therefore all tests in TestTHash_Keccak_224 fail with EFOpenError.

Other tests in TestTHash_Keccak_256, 384, 512 fail with ETestFailure. (Maybe this will automatically be fixed if the RSP file is added?)

TestTDECGCM : TestEncodeStreamChunked fails with:

TestEncodeStreamChunked: ETestFailure
at  $00344DD3
Authentication tag wrong for Key 9971071059abc009e4f2bd69869db338 IV 07a9a95ea3821e9c13c63251 PT f54bc3501fed4f6f6dfb5ea80106df0bd836e6826225b75c0222f6e859b35983 AAD Exp.:  Act.: , expected: <7870d9117f54811a346970f1de090c41> but was: <12dee761e7034c3fe993d7bfb3389aa9>

The vector is from gcmEncryptExtIV128.rsp

[Keylen = 128]
[IVlen = 96]
[PTlen = 256]
[AADlen = 0]
[Taglen = 128]

Count = 0
Key = 9971071059abc009e4f2bd69869db338
IV = 07a9a95ea3821e9c13c63251
PT = f54bc3501fed4f6f6dfb5ea80106df0bd836e6826225b75c0222f6e859b35983
AAD = 
CT = 0556c159f84ef36cb1602b4526b12009c775611bffb64dc0d9ca9297cd2c6a01
Tag = 7870d9117f54811a346970f1de090c41

I guess this is the same as https://github.com/MHumm/DelphiEncryptionCompendium/issues/52#issuecomment-1777647160

MHumm commented 2 months ago

Sorry about those; I know about them. This is an area for which I still need to find the time and patience/motivation. And yes, I think I should handle this a bit better so that fewer issues of this kind end up in the public repository. For the time being, the best you can do is disable these Keccak tests. The algorithm as such passes the relevant test cases fine; only the Unicode convenience implementation fails, and that is most likely a fault in the test implementation.

danielmarschall commented 2 months ago

I want to help you as much as I can with improving DEC. I am very happy that DEC has been maintained over all these years to work with the latest Delphi versions.

danielmarschall commented 2 months ago

Feel free to give a specific task to me! I will look at the Keccak issues today.

MHumm commented 2 months ago

Thanks for offering help! I really appreciate that! If you like we can switch to using e-mail for conversation, in case that's more convenient. One of my mail addresses is the one listed as main contact in notice.txt.

MHumm commented 2 months ago

About this Keccak issue: Keccak is the original version of SHA3. When creating the SHA3 standard, NIST added some bits to the unhashed message to identify the variant of the SHA3 algorithm (SHA3 vs. SHAKE etc.). Some libraries had implemented Keccak, didn't notice the difference to SHA3, and named their implementation SHA3. That's why Keccak got added to DEC after SHA3. The problem now is the Unicode string based test cases. My idea for investigating how to fix them is to create a file with a few of the SHA3 test cases for research (as I want the Keccak tests to use the unmodified SHA3 test data) and to use two IDE sessions: one runs the Unicode string based test and the other runs one of the working tests. Comparing the two shows where the failure/deviation happens, and that is the starting point for fixing the Unicode string test code. Does this sound plausible? Would that be a task? (Of course I can provide a more detailed description if necessary.)
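The relationship described above can be checked independently of DEC. The following is a minimal sketch in Python (chosen over Delphi so that the standard library's `hashlib` can serve as a reference). It assumes the usual SHA-3 construction, in which the two bits 0,1 are appended to the message before Keccak's own padding. The names `keccak_f1600`, `keccak` and `keccak_256_raw` are made up for this illustration and are not DEC's API; the sponge is unoptimized and only squeezes a single block.

```python
import hashlib

MASK = (1 << 64) - 1


def rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK


def keccak_f1600(state):
    """Keccak-f[1600] permutation on a 200-byte state (compact, unoptimized)."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constant generated by an LFSR)
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak(rate_bytes, message, suffix, out_len):
    """Sponge with pad10*1. `suffix` holds any trailing message bits (LSB first)
    followed by the first padding bit. Assumes out_len <= rate_bytes."""
    state = bytearray(200)
    offset = 0
    while len(message) - offset >= rate_bytes:      # absorb full blocks
        for i in range(rate_bytes):
            state[i] ^= message[offset + i]
        state = keccak_f1600(state)
        offset += rate_bytes
    tail = message[offset:]                         # absorb the partial block
    for i, b in enumerate(tail):
        state[i] ^= b
    state[len(tail)] ^= suffix
    if suffix & 0x80 and len(tail) == rate_bytes - 1:
        state = keccak_f1600(state)
    state[rate_bytes - 1] ^= 0x80                   # final padding bit
    state = keccak_f1600(state)
    return bytes(state[:out_len])


def keccak_256_raw(message, last_byte=0, last_bits=0):
    """Raw Keccak-256 of `message` plus `last_bits` extra bits from `last_byte`."""
    suffix = (last_byte & ((1 << last_bits) - 1)) | (1 << last_bits)
    return keccak(136, message, suffix, 32)


# Raw Keccak over "message plus the two bits 0,1" (byte 0x02, 2 bits valid)
# should equal SHA3-256 of the message itself.
for msg in (b"", b"abc", bytes(200)):
    print(keccak_256_raw(msg, 0x02, 2) == hashlib.sha3_256(msg).digest())

# Without the two extra bits the digests differ.
print(keccak_256_raw(b"") == hashlib.sha3_256(b"").digest())
```

If the sketch is right, the first three comparisons print `True` and the last one prints `False`, which is the behaviour the shared test vectors rely on.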

danielmarschall commented 2 months ago

Currently, there are a few things that I am confused about. I hope I don't annoy you with my many questions :-)

(1) First, I am a bit confused why you think it is a Unicode problem.

Inside all TestTHash_Keccak_* testcases, all TestCalc... testcases are failing. Stream, RawByteString, Buffer, Bytes, .... A test case working with TBytes should have nothing to do with Unicode? I rather think that the hashing is wrong, or the test vector is wrong.

(2) I didn't quite understand what you said about SHA3 and Keccak being different. At least in the test cases TestTHash_Keccak_256.SetUp and TestTHash_SHA3_256.SetUp, they got the same test vector SHA3-256_Msg5.pdf which is H(0b11001) = 0x7b0047....

If SHA3(0b11001) = 0x7b0047... and Keccak(0b11001) = 0x7b0047... are both expected to be true, aren't they the same hash algorithm then? Are you sure the test vectors are supposed to be the same?

(3) The next confusion is how the test cases are written; maybe you can help me understand.

  // Source https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Standards-
  //        and-Guidelines/documents/examples/SHA3-256_Msg5.pdf
  lDataRow := FTestData.AddRow;
  lDataRow.ExpectedOutput           := '7b0047cf5a456882363cbf0fb05322cf65f4b705' +
                                       '9a46365e830132e3b5d957af';
  AddLastByteForCodeTest(lDataRow, #$13, 5);

The PDF writes that the test message 0b11001 has the hash 7b0047..., Ok. Last byte of the message contains 5 bits, Ok. But why is there #$13 and not #$19 (0b00011001) or #$C8 (0b11001000) ?

And in AddLastByteForKeccakTest, the #$13 becomes #$13#2 which is even more confusing. I expected that the input vector is 0b11001... I tried to force the debugger to set MsgWithFixup to #$19, but it didn't work.

If I remove the first vector, then I see that the second vector also fails:

  // Source https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Standards-
  //        and-Guidelines/documents/examples/SHA3-256_Msg30.pdf
  // TODO:
  // expected: <c8242fef409e5ae9d1f1c857ae4dc624b92b19809f62aa8c07411c54a078b1d0>
  // but was:  <ce8c2109c14e5416785a205f34316b50fa11993fac9c7236c643cb5e7b00afbd>
  lDataRow := FTestData.AddRow;
  lDataRow.ExpectedOutput           := 'c8242fef409e5ae9d1f1c857ae4dc624b92b1980' +
                                       '9f62aa8c07411c54a078b1d0';
  AddLastByteForCodeTest(lDataRow, #$53#$58#$7B#$19, 30 mod 8);

Here, the test vector 0b 1100 1010 0001 1010 1101 1110 1001 10 should have the hash c824..., Ok. The last byte has 6 bits (30 mod 8), Ok. But here too, I don't understand why the message is written as #$53#$58#$7B#$19. Read left to right, the test vector 0b 1100 1010 0001 1010 1101 1110 1001 10 is 0xCA1ADE9_, not 53587B19??
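One possible explanation for the byte values questioned above is bit ordering. As far as I understand the FIPS 202 convention for converting bit strings to bytes, the first message bit goes into the least significant bit of each byte. The sketch below packs the two bit strings from the NIST example files both ways; the function names are made up for this illustration.

```python
def pack_bits_lsb_first(bits):
    """Pack '0'/'1' characters into bytes, first bit -> least significant bit.
    Returns (data, number_of_valid_bits_in_last_byte)."""
    bits = bits.replace(" ", "")
    out = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        out.append(sum(1 << i for i, b in enumerate(chunk) if b == "1"))
    return bytes(out), len(bits) % 8


def pack_bits_msb_first(bits):
    """Same input, but first bit -> most significant bit (left-aligned)."""
    bits = bits.replace(" ", "")
    out = bytearray()
    for start in range(0, len(bits), 8):
        out.append(int(bits[start:start + 8].ljust(8, "0"), 2))
    return bytes(out), len(bits) % 8


msg5 = "11001"
msg30 = "1100 1010 0001 1010 1101 1110 1001 10"

for name, bits in (("Msg5", msg5), ("Msg30", msg30)):
    lsb, n = pack_bits_lsb_first(bits)
    msb, _ = pack_bits_msb_first(bits)
    print(name, "LSB-first:", lsb.hex(), "MSB-first:", msb.hex(), "last bits:", n)
# Msg5 LSB-first: 13 MSB-first: c8 last bits: 5
# Msg30 LSB-first: 53587b19 MSB-first: ca1ade98 last bits: 6
```

Under that assumption, #$13 and #$53#$58#$7B#$19 are the LSB-first encodings of the same bit strings that read as 0xC8 and 0xCA1ADE9_ when taken MSB-first.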

(4) For the Keccak224 testcase, I see that the SHA3 test vector files SHA3_224*.rsp are commented out and replaced by Keccak.rsp. Did you accidentally change this during development to test something and forgot to change it back, or do you intentionally want a different test vector in there, which you forgot to upload?

  //Source https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Algorithm-
  //       Validation-Program/documents/sha3/sha-3bittestvectors.zip
  FTestFileNames.Add('..\..\Unit Tests\Data\Keccak.rsp');
//  FTestFileNames.Add('..\..\Unit Tests\Data\SHA3_224ShortMsg.rsp');
//  FTestFileNames.Add('..\..\Unit Tests\Data\SHA3_224LongMsg.rsp');
  // SourceEnd

Thank you very much for your help!

In re email communication: I think we can do that for sure. However, to share screenshots and formatted source code snippets, GitHub has better options in the visualization. It would just be more comfortable to keep talking in German :-)

MHumm commented 2 months ago

Ok, we can switch to e-mail if you send me one, and then we can of course switch to German ;-) Until then, a few remarks:

Don't worry about questions. I'll try to answer them as far as I can.
I'm not the original developer of DEC, I have taken it over once so I don't know everything.
I'd like to focus on one topic at a time, in this case Keccak.
I'm still convinced that most Keccak tests worked in the past. If they currently don't, this is a consequence of the attempt to fix the Unicode string ones.
The difference between Keccak and SHA3 is that the latter adds two bits to the message to be hashed before hashing.
So my idea for the tests was to formulate the test algorithm in such a way that the same test data can be used for Keccak as well: the test has a virtual method which the Keccak variant overrides with a version that adds those 2 bits, while the SHA3 one simply contains an empty implementation or one that just returns the unaltered message.
My guess is that this method needs to look a bit different for Unicode strings, but how exactly has not been worked out yet. :-(
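The override idea from the remarks above can be sketched as follows. This is a hypothetical Python illustration, not DEC's test code: the class and method names are invented, and the way DEC's AddLastByteForKeccakTest actually represents the fixup may differ. The sketch assumes messages are given as bytes plus the number of valid bits in the last byte (0 meaning all bytes are complete), with bits stored LSB-first.

```python
class Sha3VectorTest:
    """Hypothetical stand-in for a SHA-3 test class (not DEC's real code)."""

    def fixup(self, data, last_bits):
        # SHA-3: the hash implementation adds the two bits itself,
        # so the test vector is passed through unchanged.
        return data, last_bits

    def inputs(self, vectors):
        return [self.fixup(data, last_bits) for data, last_bits in vectors]


class KeccakVectorTest(Sha3VectorTest):
    """Reuses the same vectors, but appends the bits 0,1 for raw Keccak."""

    def fixup(self, data, last_bits):
        if last_bits == 0:
            # Byte-aligned message: the two bits start a new byte (0b10 = 0x02).
            return data + b"\x02", 2
        value = (data[-1] & ((1 << last_bits) - 1)) | (0b10 << last_bits)
        total = last_bits + 2
        out = bytearray(data[:-1])
        out.append(value & 0xFF)
        if total > 8:
            out.append(value >> 8)   # second bit spilled into a new byte
        return bytes(out), total % 8


vectors = [(b"", 0), (b"\x13", 5), (bytes.fromhex("53587b19"), 6)]
print(Sha3VectorTest().inputs(vectors))
print(KeccakVectorTest().inputs(vectors))
```

The point of the design is that the vector list exists once, and only the `fixup` hook differs between the two test classes.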

danielmarschall commented 2 months ago

I found this issue! In TestTHash_SHA3_Base.AddLastByteForCodeTest, the uninitialized (and unnecessary?) variable LastByteLen is used instead of the argument LastByteLength. => Fixed in 2e8d2ec58e1fd94362ded2afb216086f34f271f8

Now, all Keccak testcases except TestCalcUnicodeString work. See reply below for the analysis.

danielmarschall commented 2 months ago

Analysis for TestCalcUnicodeString :

TestTHash_Keccak_Base.AddLastByteForKeccakTest is called in TestTHash_SHA3_Base.LoadTestDataFile in order to add the Keccak padding, which makes the test cases compatible with the official SHA-3 vectors. The TODO entry "Problem: here the method from the base class is called instead the overwritten one from Keccack..." is false: since the method is virtual, the correct method in TestTHash_Keccak_Base is called.
So, for example, '' has been changed to #$2
Everything works; for example, CalcBytes works with Keccak(#$2) = SHA3('') = 6B4E03423667DBB73B6E15454F0EB1ABD4597F9A1B078E3F5B5A6BC7
So far so good...
But for CalcString(string) we have a big problem! The WideString overload converts the string into a multiple of 2 bytes. For example, #$2 becomes #$2#$0. But this destroys our test case: we WANT #$2 in order to reuse the SHA-3 test vector, but CalcString does not allow this.
So, I think the testcase TestCalcUnicodeString is IMPOSSIBLE if we want to keep the SHA3 test vectors.
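The byte-level effect described in this analysis can be reproduced outside Delphi. The sketch below is in Python and assumes that a Delphi UnicodeString is hashed as its UTF-16LE bytes. `hashlib.sha3_224` is used only as a stand-in hash to show that the two inputs cannot produce the same digest; it is not the raw Keccak variant under test.

```python
import hashlib

fixup = "\x02"                       # the single fixup character #$2

raw = fixup.encode("latin-1")        # what the byte-based tests hash
wide = fixup.encode("utf-16-le")     # what a UTF-16 string overload would hash

print(raw.hex())   # 02
print(wide.hex())  # 0200

# Different input bytes, so the digests differ as well.
print(hashlib.sha3_224(raw).hexdigest() == hashlib.sha3_224(wide).hexdigest())  # False
```

Every UTF-16 string has an even number of bytes, so a lone trailing byte with only two valid bits cannot be expressed through a string-based call under this assumption.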

MHumm / DelphiEncryptionCompendium

Testcases failure #65