Closed lilyball closed 3 years ago
You're right, signed integers cannot be decoded for the moment. I'm planning to add this for the next release.
An ASN.1 integer is defined as a signed integer so I’m not sure why this library even allows for parsing it as an unsigned one (and defaults to that everywhere, such as when reading a certificate’s serial)
I encountered this while working on an alternative crate for ASN.1 parsing (asn1-rs, which may be interesting for you). Note: I intend to maintain both crates. Signed integers parsing is done in https://github.com/rusticata/asn1-rs/blob/master/src/asn1_types/integer.rs, and I think I'll copy the relevant parts in this crate.
The "classical" method is to encode it as the memory representation of a signed integer. This is not part of the specifications.
Negative numbers will start by 0xff
, which means an extra 0x00
has to be inserted before (MSB cannot be 1 in the first content byte).
For example, -1
can be encoded as 02 02 00 ff
.
The "classical" method is to encode it as the memory representation of a signed integer. This is not part of the specifications.
I'm not sure what you mean by this.
X.680 (ASN.1 notation) defines IntegerType such that its value is either a SignedNumber
or an identifier from the corresponding IntegerType
definition's named number lists (which are also signed numbers). Or in short, ASN.1 integers are signed. The older X.208 agrees.
X.690 (ASN.1 BER and DER encoding) says that the encoding of an integer value is primitive and the contents octets are the 2's complement encoding of the integer value.
Taken together, ASN.1 integers are always signed, and BER/DER encoding always encodes them as 2's complement. Every value tagged as an integer must be decoded as a signed integer, otherwise the decoding is wrong. The only reason to ever support decoding an unsigned integer is for use with application/private tags. Which is to say, the primitive for "turn this byte slice into an unsigned bignum" should exist, but it should not be used for decoding universal tags, only as a tool for use by implementations of custom tags. And it should also not even be exposed on an integer type as .as_biguint()
as this can only serve to reinterpret a signed integer as an unsigned one, thus producing incorrect results.
Also to note, from that asn1-rs/integer.rs link, the conversion to BigInt
is already wrong in exactly the same way I was complaining about in this ticket:
pub fn as_bigint(&self) -> BigInt {
BigInt::from_bytes_be(Sign::Plus, &self.data)
}
This is producing a BigInt
containing a positive number representation of the integer data interpreted as unsigned. Which is to say, it will produce 255
for 02 01 ff
.
Negative numbers will start by
0xff
, which means an extra0x00
has to be inserted before (MSB cannot be 1 in the first content byte).
That's backwards. 02 02 00 ff
encodes the positive integer 255. This is because without the leading 00
, the ff
is then interpreted as negative. So -1
is encoded as 02 01 ff
. This case already exists in the asn1-rs/integer.rs you linked, I255-BYTES
@lilyball You're right, I was completely confused. I rewrote integer parsing in e84d3a9852727735c60eff51a51a104a193c6b94, and also fixed bigint/biguint parsing in 74950d2bd76e18eb20f396267ac561bfd4ccde24
Examples of test vectors can be found in https://github.com/rusticata/der-parser/blob/74950d2bd76e18eb20f396267ac561bfd4ccde24/tests/ber_parser.rs#L142-L244 (for BER, and in der_parser.rs
for DER), and I believe they are now correct.
Can you confirm this? If this is fixed, I'll also backport for fix for branch 5.x
The as_u*
methods feel like a footgun. The only reason I can think of for them is specifically for as_u64()
, just to cover the case of decoding a positive number that doesn’t fit in i64
but does in u64
. I’d recommend removing the others and documenting specifically that this exists for this one case, though it annoys me that this means BerError::IntegerNegative
will still need to exist. Perhaps there’s another error that can be returned instead from this one particular method? Or maybe it’s better to just delete as_u64()
as well as it feels like a bit of an edge case to need to support numbers that fit in u64 but not i64 without using bignums. And people can always decode the byte slice themselves if they need that. This will cut down the amount of code you need, remove the decode_array_u*
methods, and remove BerError::IntegerNegative
.
I’m also confused as to why the as_i*
methods check for the high bit. Can’t you just zero-pad to the necessary length and use a single code path?
Same question with the bigint decoding. Why does it check high bit and call one of two different BigInt
constructors? I’m also not convinced as_biguint()
needs to exist, it doesn’t simplify decoding and anyone who wants a BigUint
can call as_bigint().to_biguint()
. The only reason I can think of to convert directly is if you wanted to implement the ToBigInt
/ToBigUint
traits but you aren’t doing that.
Thanks for your reply, your suggestions are good. I'll try to explain my reasoning behind the current choices:
The
as_u*
methods feel like a footgun. The only reason I can think of for them is specifically foras_u64()
, just to cover the case of decoding a positive number that doesn’t fit ini64
but does inu64
.
These methods are especially useful to describe the value you expect, not any generic value that could be. And indeed, it would put the burden of handling values close to the limit on the caller (with risks of errors).
This follows the general philosophy of this crate: provide both generic parsing methods (mostly useful when you don't know what the type or value are) and specific methods (great for writing a parser from the ASN.1 grammar: you only have to provide the specific combinators you expect).
IMHO, calling as_u64
is a convenient way to tell "I want an integer, checking that is is 1) positive 2) not greater than 2^64".
I’d recommend removing the others and documenting specifically that this exists for this one case, though it annoys me that this means
BerError::IntegerNegative
will still need to exist. Perhaps there’s another error that can be returned instead from this one particular method? Or maybe it’s better to just deleteas_u64()
as well as it feels like a bit of an edge case to need to support numbers that fit in u64 but not i64 without using bignums. And people can always decode the byte slice themselves if they need that. This will cut down the amount of code you need, remove thedecode_array_u*
methods, and removeBerError::IntegerNegative
.
I'd prefer to provide multiple methods (documenting their limitations), but leave the choice to the user. From my experience, this is required to cover many use cases:
as_bigint()
, but it requires a dependency on num-bigint
and can be much slower than other methods. der-parser
no supports no_std
, and aims at being usable also for low-resources platforms.i64
? i128
?). However, there are platforms where these types are not natively supported, and the compiler has to insert tons of code to handle them. IMHO, this is a small compromise between some code duplication and remaining efficient (without losing safety).Having multiple parsing methods is useful everywhere in crates where you know the ASN.1 grammar: it is way easier to write parse_ber_u32()
when reading a sequence item, rather than parse_ber_integer()
followed by as_bigint()
, then to_u64()
, and adding error checking code everywhere in the calling code.
I’m also confused as to why the
as_i*
methods check for the high bit. Can’t you just zero-pad to the necessary length and use a single code path?
No, because negative values are padded with 0xff
(2-complement of 0). Also, positive values must be checked to remove leading zeroes that could make the bytes longer than the buffer, but would not change the value.
Same question with the bigint decoding. Why does it check high bit and call one of two different
BigInt
constructors? I’m also not convincedas_biguint()
needs to exist, it doesn’t simplify decoding and anyone who wants aBigUint
can callas_bigint().to_biguint()
. The only reason I can think of to convert directly is if you wanted to implement theToBigInt
/ToBigUint
traits but you aren’t doing that.
The two constructors are here because originally, I was using BigInt::from_bytes_be
, but for some reason using Sign::Minus
did not work. This may not be correct, I'll check if from_signed_bytes_be
works for both sign/unsigned cases.
There is no strong reason for providing as_biguint()
. The only think I'm disliking about chaining as_bigint().to_biguint()
is that this may fail for various reasons (for ex. wrong tag) for the former, and to_biguint()
returns an Option
type, so error handling is not really consistent.
5.1.1
has been published and fixes the signed/unsigned problem, so I'm closing this.
Feel free to reopen or open another issue if you would like to discuss further. Thanks
BerObject
doesn't seem to have any way of actually parsing a negative ASN.1 integer as a negative number. It only exposes.as_u32()
and.as_u64()
, both of which simply assume the whole thing is unsigned (even though ASN.1 integers are signed and BER/DER-encoding represents them as two's complement integers). Because of the variable length of an integer I can't even cast the result to a signed value, as e.g.-123
encodes as[0x85]
which is then just133
(and therefore casting to a signed integer preserves the133
representation).The only nod towards signed integers is the existence of
.as_bigint()
in addition to.as_biguint()
. Unfortunately (and completely undocumented), the result is always a positiveBigInt
, as the implementation explicitly provides the positive sign instead of using the constructor that detects the sign from the bytes.Example: