MHumm / DelphiEncryptionCompendium

Cryptographic library for Embarcadero Delphi and potentially for FPC as well
Apache License 2.0
255 stars 66 forks source link

Migration from old DEC library #14

Closed jaclas closed 3 years ago

jaclas commented 3 years ago

I am trying to migrate from the archaic DEC version to the current one. I have written my own wrapper for the old version of the library and I use this wrapper in my projects. But I see very many changes in the new DEC, I have dozens of compilation errors. Please give me some words how to approach this migration.

I noticed for example that the new DEC is focused on using TBytes type, am I right? What else to pay attention to?

MHumm commented 3 years ago

While it is true that TBytes is the preferred way if you don't use a stream the other string based ways etc. are still available for backwards compatibility. I think most classes provide even different string types like UnicodeString and AnsiString, depending on platform availability. Remember that DEC is cross platform capable now.

To make the API more user friendly and maintainable etc. we did some restructuring of the whole library but all former functionality should still be there and it shouldn't be overly hard to adapt. If using the version from development branch that already has some command line app to add the source directory to Delphi's library path etc. The development branch is already a little bit further then master.

Since I don't know which compilation errors you got I cannot be more helpful than that. Feel free to post further details. Another note: between DEC 6.0 and the current development branch state we redesigned the progress event handling if you should have used that. It's way easier and more flexible now. But I guess that event is only relevant when used with files or similar large contents.

jaclas commented 3 years ago

Thank you for your reply. For now I am converting my wrapper to the new DEC version, I see some places in the DEC that I would improve or slightly expand. Once I finish the migration, I'll make suggestions.

For now, however, I think pointers are more elatic and convenient than TBytes. It is very easy to extract a pointer from a string, stream or TArray. In contrast, making TBytes (for input data) from a stream or pointer to an allocated area of memory without copying data is often impossible.

jaclas commented 3 years ago

I'm almost done migrating my wrapper to the new DEC version. There are still errors left with the Sapphire algorithm, I don't know what this is due to, I'll keep looking.

On the screenshot unit tests of my wrapper obraz

jaclas commented 3 years ago

I think I may have found a bug in the TCipher_Sapphire class. The strange thing is that the class in its current form can encode and decode data correctly, but in my tests I also have ready-made cryptograms stored (created with the old version of the library) and trying to decrypt them fails. After tweaking the code, the decryption works correctly.

The method currently look like this:

procedure TCipher_Sapphire.DoEncode(Source, Dest: Pointer; Size: Integer);
var
  T: UInt32;
  I: Integer;
  SKey: TSapphireKey;
begin
  SKey := PSapphireKey(FAdditionalBuffer)^;
  for I := 0 to Size - 1 do
  begin
    SKey.Ratchet := (SKey.Ratchet + SKey.Cards[SKey.Rotor]) and $FF;
    SKey.Rotor := (SKey.Rotor + 1) and $FF;
    T := SKey.Cards[SKey.Cipher];
    SKey.Cards[SKey.Cipher]  := SKey.Cards[SKey.Ratchet];
    SKey.Cards[SKey.Ratchet] := SKey.Cards[SKey.Plain];
    SKey.Cards[SKey.Plain]   := SKey.Cards[SKey.Rotor];
    SKey.Cards[SKey.Rotor]   := T;
    SKey.Avalanche := (SKey.Avalanche + SKey.Cards[T]) and $FF;
    T := (SKey.Cards[SKey.Plain] + SKey.Cards[SKey.Cipher] + SKey.Cards[SKey.Avalanche]) and $FF;
    SKey.Plain := PByteArray(Source)[I];
    SKey.Cipher := SKey.Plain xor SKey.Cards[SKey.Cards[T]] xor
                   SKey.Cards[(SKey.Cards[SKey.Ratchet] +
                   SKey.Cards[SKey.Rotor]) and $FF];
    PByteArray(Dest)[I] := SKey.Cipher;
  end;
end;

In my opinion it is a mistake to use local record TSapphireKey, the beginning of the method should look like this:

procedure TCipher_Sapphire.DoEncode(Source, Dest: Pointer; Size: Integer);
var
  T: UInt32;
  I: Integer;
  SKey : PSapphireKey;  <------------
begin
  SKey := PSapphireKey(FAdditionalBuffer);   <--------------
  [...]

The same problem applies to the DoDecode method:

procedure TCipher_Sapphire.DoDecode(Source, Dest: Pointer; Size: Integer);
var
  T: UInt32;
  I: Integer;
  SKey: TSapphireKey; <------ 
begin
  SKey := PSapphireKey(FAdditionalBuffer)^;  <--------

Using a local variable with a record results in the algorithm not modifying the source data correctly, but processing the local copy and losing the data when it exits these methods. By doing this in the encode and decode method the result seems correct.

MHumm commented 3 years ago

Ok, I included your fix and added you to the notice.txt. If you have different test data you use than the one included in the unit tests I'd like to have that if that's possible. I can expand the unit test then.

jaclas commented 3 years ago

My unit tests are not suitable for you because my wrapper also writes IV and MAC in the result. I am currently working on changing this code because I don't like the way it works.

But I am writing regarding the unit tests. I suggest you use another language like PHP, Python, JS or C# to generate ciphertexts in different encryption algorithms. You can use these ciphertexts for unit testing in DEC.

For example, AES 256/CBC in Python with pyaes library:


import pyaes, binascii
key = b'01234567012345670123456701234567'
plaintext = 'Some short description with looooooooong additional data like polish diacritical chars... łóżźćęół and digits 0123456789'
# IV is set to 16 zeroes
encrypter = pyaes.Encrypter(pyaes.AESModeOfOperationCBC(key))
ciphertext = encrypter.feed(plaintext.encode('utf_8'))
ciphertext += encrypter.feed()
print('Encrypted AES256:', binascii.hexlify(ciphertext).upper())   

results:

Encrypted AES256: B623A479FC657E31F219287CD191075575B2FB56485D0C22E9168A2BF2289C7165CDA67586A486E14115C754ABA158A84A8C3B521E0DF87505D77649A8F1CB52A03D41E205849F28BCA2DE189A9C65CDB648DBC9F7D49AF2F1704B491E9E2DE6FC357ADC8E15733394C3C75B45570AE77A2A6CB6CC4418A558A78313C0C16478A7D61538B88B486BCAE89235D8FCEEB8

ps. Now I try use this library for extended tests:

https://cryptography.io/en/latest/hazmat/primitives/symmetric-encryption.html#algorithms

MHumm commented 3 years ago

Problem is: time and knowledge is limited... ...so I welcome any useful contributions.

jaclas commented 3 years ago

I'm working all day on python program to generate test data, I still get errors, but I get sucess too ;-)

https://pastebin.com/raw/yjDL3LEc

When I finish, I will write a generator for creating *.pas files with data for unit tests, if you have a proposal how such a file should look like, please make a template.

ps. By the way, I'm learning python ;-)

jaclas commented 3 years ago

A few days I was working on generating test data for several cryptographic algorithms. I used two popular and developing libraries from Python: Cryptography and PyCryptodome. Here are the results of the work so far:

https://gist.github.com/jaclas/96a3c89626f84747a31676a65028c59b

(I saved all data as JSON)

I can share the source code in python as well. Unfortunately these are not all the algorithms from DEC, but I will slowly look for more libraries. By the way, I was very surprised that there are so few good and developed cryptographic libraries in Python. Now I will create test data for hashing algorithms.

jaclas commented 3 years ago

According to my tests it seems that the AES class (Rijndael) does not work properly, it returns different results on the same data than two (different) Python libraries

MHumm commented 3 years ago

Thanks for testing in the first place, but are you really sure that all parameters used are the same? e.g. in the code above I see you used strings with non ASCII chars in them. There might be differences in string encoding. Can you provide failing test data and if yes in such a way, that the input is in hex? So there cannot be any string encoding mistake. The unit test implemented for AES in DEC runs fine and uses the test data the original author of the library already used. So either he was wrong from the start or you have discrepancies in the way you set up tests for DEC and for Python.

jaclas commented 3 years ago

I already provided the test data, you have a link above from gist.github.com - the test data is there. And that data contains only ascii in the input (field "plaintext"), check it yourself. On example:

    "AES": {
        "CBC": {
            "128": {
                "iv_hex": "30313233343536373839303132333435",
                "key_hex": "30313233343536373839303132333435",
                "plaintext": "abcdefghijklmnopqrstuv0123456789",
                "result_hex": "8859653CB4C4E4CA3ADD490015AC8860FA59D1E233301563B184FCCA95790C8C"

properly result is: 8859653CB4C4E4CA3ADD490015AC8860FA59D1E233301563B184FCCA95790C8C, but from DEC I get: 8859653CB4C4E4CA3ADD490015AC8860F77706676AFB9C6CEE50F207699BBE63

ps. "hex" in field names mean hexadecimal coding the data

jaclas commented 3 years ago

Try this in command line:

openssl enc -aes-128-cbc -nopad -in test.txt -out test.enc -K $"30313233343536373839303132333435" -iv $"30313233343536373839303132333435"

In test.txt paste ansi string: abcdefghijklmnopqrstuv0123456789

jaclas commented 3 years ago

here is a link that you may find useful https://cryptography.fandom.com/wiki/AES_implementations#Delphi

jaclas commented 3 years ago

And maybe this: http://www.ciuly.com/delphi/basic-application-security/cryptography/aes/

MHumm commented 3 years ago

I found the original AES documentation from NIST meanwhile, which contains the original test vectors. I'll try to add them tho the unit tests as soon as time permits and begin investigating from those. The other issue is the way DEC implemented AES: put AES 128, 192 and 256 into the same class just selecting the parameters by the length of the key given.

I'm toying with the idea to add AES128, AES192 and AES256 as classes inheriting from the AES class itself to make it obvious which variant is meant. But I'd like to keep the original mechanism as well for compatibility. I don't like ruining existing code of existing users.

jaclas commented 3 years ago

In old DEC there was no AES class, only Rijndael. And AES class was only mentioned as a potential candidate:

  TCipher_Twofish      = class;  {AES Round 2 Final Candidate}
  TCipher_IDEA         = class;
  TCipher_Cast256      = class;
  TCipher_Mars         = class;  {AES Round 2 Final Candidate}
  TCipher_RC4          = class;  {Streamcipher in as Block Cipher}
  TCipher_RC6          = class;  {AES Round 2 Final Candidate}
  TCipher_Rijndael     = class;  {AES Round 2 Final Candidate}

I think you can safely leave the TRijndael class, and remove the TAES class and write in the documentation that it was wrong (every user of the current TAES class will fix the code to TRijndael when they get a compilation error). There is no sense in keeping a garbage class in the code, it only creates confusion and chaos. This is my opinion :-)

ps. Of course, adding a corrected and valid AES is needed and reasonable, it could be in the form of AES128, AES192, etc.... as you wanted.

MHumm commented 3 years ago

If I understand you correctly there is a difference between Rijndael and the variant specified as AES by nist? If so I should provide a fixed AES implementation and would ask what the difference between those two are. I thought Rijndael is just the old name of AES...

jaclas commented 3 years ago

No, I don't know what the difference between Rijndael and the NIST AES definition is, but I do know that the current implementation of Rijndael gives different results than a valid AES algorithm should give, so either there's a bug, or it's some early version of Rijndael that was being prepared for an algorithm competition and was later still modified. But this is just my speculation. My point is just to leave this flawed/suspicious version of Rijandael as it is, then those who already use it in their code will keep the compatibility, but the corrected/new implementations should already be named according to the AES name (like AES128)

ps. for some reason the original DEC author didn't publish the Rijndael algorithm as AES, did he?

MHumm commented 3 years ago

Can you please check newest commit of development branch? I added 4 more test cases to the AES unit tests: three of them are from the original NIST FIPS 197 AES standard specification and the forth is the one you provided using CBC block chaining. All of them pass! Small remark: your test gives 8859653CB4C4E4CA3ADD490015AC8860FA59D1E233301563B184FCCA95790C8C as output. In order for the unit tests to function I had to change the upper case letters to lower case.

Can you please check situation again? Given my test results I cannot see any fault in DEC.

MHumm commented 3 years ago

ps. for some reason the original DEC author didn't publish the Rijndael algorithm as AES, did he?

I simply suspect that at the time he implemented it it was still a candidate and then he lost interest in this fine library never updating this. The next maintainer (not me) had other priorities with it: make it Unicode compatible and prepare it for cross platform compatibility. He never released a version with these changes but he found me who finished his work and that's where we are today. With the somewhat limited time and knowledge I finally managed to release it and I'll try to fix bugs reported. Adding further algorithms is the plan as well, but that will take some time...

jaclas commented 3 years ago

At first glance, I see that the NIST tests and yours assume an empty (zero-filled) IV vector. The tests I did using the Python libraries had a fixed, non-zero IV vector. I also checked the first NIST test case with what the openssl library calculates, its also given a vector filled with zeros and the results agree. However, DEC when the IV vector was set to a value other than zeros gave different results than openssl and Python, perhaps there is a problem with the IV vector handling? I'll do the Python tests again tomorrow, but with a zeroes vector and compare the results with DEC.

ps. imho IV should not be in serious use filled with zeros

jaclas commented 3 years ago

I'm doing some deeper testing and it looks like I made a mistake in testing the AES class with DEC. I don't know what I got wrong yet. But currently the results are consistent with what python and openssl produce.

jaclas commented 3 years ago

Again, python data in JSON file:

https://gist.github.com/jaclas/96a3c89626f84747a31676a65028c59b

and delphi console app to test DEC with above JSON input:

https://gist.github.com/jaclas/287562d49b455f39c097bb1e96da9f4d

Look for this please...

jaclas commented 3 years ago

There is something fishy about CFB mode. All the tests written in DEC that I pasted above when running in CFB mode use the cmCFB8 version. Unfortunately IDEA test in CFB mode returns error. But I noticed that when for just this test you change the mode to cmCFBx then the error disappears and the result is correct. This is worth noting, because all the tests from the Cryptography library from Python, whose results I put in JSON, used CFB mode (and not CFB8).

obraz

That is, the results are mostly consistent even though DEC uses CFB8 mode and Cryptography uses CFB mode.

To be clear, this test data comes from two libraries: Cryptography - AES, ARC4, IDEA, TripleDES Cryptodome - DES, CAST, ARC2, Blowfish

jaclas commented 3 years ago

I did some more AES algorithm tests comparing DEC and SynCrypto (from mORMot) results and all the results agree, and I trust SynCrypto a lot because the author is an amazing software madman in a positive sense :-)

MHumm commented 3 years ago

I guess I may close this one now?

MHumm commented 3 years ago

"When I finish, I will write a generator for creating *.pas files with data for unit tests, if you have a proposal how such a file should look like, please make a template."

If you still aim at such a generator, then the out put should be in the same format/structure the DEC DUnit unit tests use. I guess that should be doable so we could mostly copy and paste the output into those and enhance those. I'd value any contribution there as it looks like that some of the test vectors currently used are flawed. See the discussion in #21. But test vectors should stem from "official sources" as far as possible.