wbond / asn1crypto

Python ASN.1 library with a focus on performance and a pythonic API
MIT License
335 stars 140 forks source link

Bug in commit 'Handle BER-encoded indefinite length values better' #249

Open vasvlad opened 1 year ago

vasvlad commented 1 year ago

I've found the bug in this commit: link to commit c29117fd57deb80fb345cf76cad9d0d48e8bbf17 Definition of length of asn1 package in this part of code "self._header[-1:] == b'\x80':" is incorrect. According this documentation: https://www.w3.org/Protocols/HTTP-NG/asn1.html For the definite form, if the length is less than 128, you just use a single byte, with the high bit set to zero. Otherwise the high bit is set to one, and the low seven bits set to the length of length. The length is then encoded in that many bytes.

More correct code is:


if self._header is not None and  self._header[1]<128 and  self._header[-1:] == b'\x80':

    force = True
zito commented 1 year ago

Today I discovered exactly the problem reported here. I am not certain the proposed fix is correct. Looking at BER length encoding seems to me that the indefinite length encoding has the only length octet 0x80, but tag can be encoded on more octets, so the length octet can be on offset 1 or more... So the condition self._header[-1] == b'\x80' is basicaly correct, but must be completed by test, that preceding octet is not higher length octet...

wbond commented 1 year ago

Can you provide an example of an encoded ASN.1 value that exposes this bug?

From my understanding the header can’t end in 0x80 for the DER encoding. See the high bit is set, but then the length is set to 0 since all seven lower bits are 0. This switches to indefinite length mode, which is used when streaming a value where the encoder doesn’t know the total length. As a result the decoder has to look for indefinite chunks. This by definition isn’t DER encoding since there can only be one canonical DER encoding. Instead this implied a BER encoding.

This is why I’d like an example so I can see what is actually going on.

zito commented 1 year ago

I mentioned the example in bug #195 already. See my last comments there. The certificate in the attached zip file... The highest bit you mentioned - this is about the short form (length up to 127). There can be a number of length octets in header for a length over 127. The last octet can be 0x80 quite easily.

wbond commented 1 year ago

Posting an example on a different issue isn’t as helpful as posting it on the issue about the bug.

My recollection is fuzzy, but I thought the high bit was not set on any of the length bits? Or are you saying it is only not set on a single-byte-encoded length?

wbond commented 1 year ago

Yeah, so that was most likely the source of the bug in the implementation. https://luca.ntop.org/Teaching/Appunti/asn1.html Confirms the high bit only matters on the first byte.

vasvlad commented 1 year ago

For example I have the problem in this structure. cms.zip The second item have this header: SEQUENCE (5 elem) - 30 82 0A 80 And last byte is 80, but this byte is last byte of length - 0x0A 0x80 of sequence.

zito commented 1 year ago

I am afraid that test can't be easily written in only one if statement. Identification/tag can occupy more than the first octet of the header. So self._header[1] is not necessarily the first length octet...