bcgov / itvr

Apache License 2.0

ITVR - parsing decrypted file misses records #702

Closed tim738745 closed 1 day ago

tim738745 commented 1 month ago

Describe the Bug: The decrypt() method in DecryptService of the Spring app doesn't parse a decrypted file correctly. The decryption itself is successful, but not all the records are retrieved from the decrypted file.

Expected Behaviour: All records should be retrieved from the decrypted file.

Actual Behaviour: Not all records are retrieved from the decrypted file.

Implications: We will have to continue using the Entrust wizard for decryption, and console users will have to upload the decrypted file into ITVR.

Steps To Reproduce (User/Role: console user):

  1. Get encrypted CRA return file via FTP
  2. Upload said file
  3. See that not all records sent to the CRA have a corresponding return record (this seems to happen only if more than 5 records are sent to the CRA; 5 or fewer may be okay).

tim738745 commented 1 month ago

After decrypting the CRA return file, we see that the content we need is compressed and encoded as a constructed octet string (see https://letsencrypt.org/docs/a-warm-welcome-to-asn1-and-der/). However, Entrust’s Java API (v8.1.20) can’t seem to decode and decompress this structure successfully. The API expects the constructed octet string to be a sequence of consecutive primitive octet strings but, in reality, it is a sequence of primitive octet strings separated by 8-byte separators.
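To make the layout described above easier to inspect, here is a minimal sketch of a BER walker that collects the primitive octet strings (tag 0x04) inside a constructed octet string (tag 0x24) and reports the offset of the first byte that does not fit that pattern. It uses only the JDK, not the Entrust toolkit; the class and method names are hypothetical, and it assumes the buffer starts at the constructed octet string's tag byte. It only locates unexpected bytes such as the separators; it does not explain what they are.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative BER walker; class and method names are hypothetical. */
final class BerOctetStringWalker {
    static final int PRIMITIVE_OCTET_STRING = 0x04;
    static final int CONSTRUCTED_OCTET_STRING = 0x24;

    /** Returns {contentLength, headerSize}; contentLength is -1 for the indefinite form. */
    static int[] readLength(byte[] buf, int pos) {
        if (pos + 1 >= buf.length) {
            throw new IllegalStateException("Truncated header at offset " + pos);
        }
        int first = buf[pos + 1] & 0xFF;
        if (first < 0x80) return new int[] {first, 2}; // short form
        if (first == 0x80) return new int[] {-1, 2};   // indefinite form
        int count = first & 0x7F;                      // long form: number of length bytes
        // Lengths of up to 3 bytes are enough for a sketch and cannot overflow an int.
        if (count > 3 || pos + 2 + count > buf.length) {
            throw new IllegalStateException("Unsupported or truncated length at offset " + pos);
        }
        int length = 0;
        for (int i = 0; i < count; i++) {
            length = (length << 8) | (buf[pos + 2 + i] & 0xFF);
        }
        return new int[] {length, 2 + count};
    }

    /** Collects the contents of each primitive octet string, failing on the first unexpected tag. */
    static List<byte[]> primitiveSegments(byte[] buf) {
        if (buf.length < 2 || (buf[0] & 0xFF) != CONSTRUCTED_OCTET_STRING) {
            throw new IllegalArgumentException("Not a constructed OCTET STRING");
        }
        int[] outer = readLength(buf, 0);
        boolean indefinite = outer[0] < 0;
        int pos = outer[1];
        int end = indefinite ? buf.length : Math.min(buf.length, pos + outer[0]);
        List<byte[]> segments = new ArrayList<>();
        while (pos < end) {
            // An indefinite-length value ends with the end-of-contents marker 00 00.
            if (indefinite && pos + 1 < end && buf[pos] == 0 && buf[pos + 1] == 0) break;
            int tag = buf[pos] & 0xFF;
            if (tag != PRIMITIVE_OCTET_STRING) {
                throw new IllegalStateException(
                        String.format("Unexpected tag 0x%02X at offset %d", tag, pos));
            }
            int[] inner = readLength(buf, pos);
            int start = pos + inner[1];
            if (inner[0] < 0 || start + inner[0] > end) {
                throw new IllegalStateException("Bad segment length at offset " + pos);
            }
            segments.add(Arrays.copyOfRange(buf, start, start + inner[0]));
            pos = start + inner[0];
        }
        return segments;
    }
}
```

Calling `BerOctetStringWalker.primitiveSegments(decryptedBytes)` on well-formed input returns one array per primitive octet string; on input with extra bytes between segments, the exception message gives the offset where those bytes begin, which can then be dumped and compared across files.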

This was not apparent in testing because we tested with data consisting of only a few records, and the decoding + decompression API methods we used successfully decoded and decompressed the first primitive octet string (which was able to contain all of our test data). Of course, in production, there was more data, and the decoding + decompression fail after encountering the 8-byte separator mentioned above.

To address this, I tried ignoring the 8-byte separators and passing all the primitive octet strings to the decompressor, but the result was not correct. I also tried passing the content along with the 8-byte separators, but the result was also not correct.

What is more mysterious to me is that the decompressor only seems to decompress the first part of the first primitive octet string correctly. Since we know that the compression was done using the DEFLATE algorithm (see https://www.ietf.org/rfc/rfc1951.txt), maybe it means only the literal byte Huffman codes were parsed correctly, and the distance codes are missing? Perhaps that’s what those 8-byte separators are?
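One way to narrow this down is to measure how far a standard decompressor gets. The sketch below uses the JDK's `java.util.zip.Inflater` (not the Entrust toolkit) to inflate a byte array and report how many input bytes were consumed, whether the stream reached its end, and any format error. The thread does not say whether the data is raw DEFLATE or zlib-wrapped, so the `nowrap` flag lets both be tried. The class, field, and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

/** Illustrative DEFLATE probe; class, field, and method names are hypothetical. */
final class InflateProbe {
    static final class Result {
        final byte[] output;    // bytes decompressed before stopping
        final long consumed;    // compressed input bytes the inflater used
        final boolean finished; // true if the end of the DEFLATE stream was reached
        final String error;     // format error message, or null if none

        Result(byte[] output, long consumed, boolean finished, String error) {
            this.output = output;
            this.consumed = consumed;
            this.finished = finished;
            this.error = error;
        }
    }

    /** nowrap = true expects raw DEFLATE (RFC 1951); false expects a zlib wrapper (RFC 1950). */
    static Result probe(byte[] data, boolean nowrap) {
        Inflater inflater = new Inflater(nowrap);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String error = null;
        try {
            inflater.setInput(data);
            byte[] chunk = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                // No output and nothing left to read means the input ended mid-stream.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            error = e.getMessage(); // keep whatever was decompressed before the failure
        }
        try {
            return new Result(out.toByteArray(), inflater.getBytesRead(),
                    inflater.finished(), error);
        } finally {
            inflater.end(); // release native resources
        }
    }
}
```

If `finished` is true and `consumed` is smaller than the input length, the bytes after offset `consumed` are not part of that DEFLATE stream. If `finished` is false with an error, the position where decoding broke can be compared against the offsets of the separators.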

I will contact the CRA and see if they can put me in touch with someone who can help me with this part of the API (the decoding + decompressing).

shayjeff commented 1 month ago

CRA has opened a ticket with their IT team, but it can take a few days before they get a chance to reach out to us. This card will remain blocked until then.

tim738745 commented 1 month ago

Response from CRA IT support: "Unfortunately, we do not support Java Toolkit. Thanks for your attention and understanding."

tim738745 commented 1 month ago

Received another message from the CRA offering to help; will follow up!