Closed: luavixen closed this issue 4 years ago.
For most uses, overflows are expected to be abnormal; if one occurs, that is an indication that the input data is corrupted and cannot be reliably read anyway. That is, if the corruption caused an overflow, how can you be sure where the LEB128 value really ends? If your TCP sender is deliberately producing values that overflow, then it might be better to change that behaviour instead.
I don't think your DoS concern is a valid reason to change this. If an attacker can cause this situation to occur, then they could just as easily cause a situation where there are more numbers than you expect. You should be designing your application to handle this anyway.
The error handling could be changed to keep reading bytes until the continuation bit is clear, but I'm unsure what the behaviour should be if a read error then occurs while doing that.
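Concretely, the drain might look something like this (a rough sketch; `drain_leb128` is an illustrative name, not part of the crate, and it assumes the byte that triggered the overflow still had its continuation bit set):

```rust
use std::io::{self, Read};

// Sketch only: after detecting an overflow, discard the rest of the
// LEB128 value so the stream stays aligned for the next read.
fn drain_leb128<R: Read>(r: &mut R) -> io::Result<()> {
    let mut buf = [0u8; 1];
    loop {
        // The open question: if this read fails, should the caller see
        // the I/O error or the original Overflow error?
        r.read_exact(&mut buf)?;
        if buf[0] & 0x80 == 0 {
            return Ok(());
        }
    }
}
```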
> You should be designing your application to handle this anyway.
You're right! I'm currently trying to un-f**k a webservice that someone wrote, and it's filled with bugs and hacks, so better-designed applications will hopefully fail much more gracefully.
> For most uses, overflows are expected to be abnormal; if one occurs, that is an indication that the input data is corrupted and cannot be reliably read anyway.
My application is receiving values from a Node.js service (which is receiving its values from the client's browser), and on the JavaScript side all the numbers are stored as `BigInt`s, which do not overflow (apparently the limit is a few gigabytes' worth of bits?), and all the LEB128 intercommunication on that side works fine. These numbers may be larger than `u64::MAX`, but they are still valid LEB128 integers and produce correct output when used with other libraries/languages. It would make sense to read the entire LEB128 value instead of cutting it up and potentially causing issues.
> The error handling could be changed to keep reading bytes until the continuation bit is clear, but I'm unsure what the behaviour should be if a read error then occurs while doing that.
I think that a good approach would be to store the "overflow state" in a boolean variable and then to continue processing the entire LEB128 value as if there was no overflow. The overflow boolean can then be checked when the function is about to return.
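As a rough sketch of that idea (illustrative names and error type, not the crate's actual code):

```rust
use std::io::{self, Read};

#[derive(Debug)]
enum ReadError {
    Io(io::Error),
    Overflow,
}

// Sketch: always consume the whole LEB128 value, remember whether it
// overflowed, and only report the overflow once the value has ended.
fn read_unsigned<R: Read>(r: &mut R) -> Result<u64, ReadError> {
    let mut result: u64 = 0;
    let mut shift: u32 = 0;
    let mut overflow = false;
    loop {
        let mut buf = [0u8; 1];
        r.read_exact(&mut buf).map_err(ReadError::Io)?;
        let low = u64::from(buf[0] & 0x7F);
        if shift < 64 && (low << shift) >> shift == low {
            result |= low << shift;
        } else if low != 0 {
            // Payload bits that do not fit in a u64: note it, keep reading.
            overflow = true;
        }
        if buf[0] & 0x80 == 0 {
            // The value has been fully consumed, so failing now cannot
            // leave stray bytes behind in the stream.
            return if overflow { Err(ReadError::Overflow) } else { Ok(result) };
        }
        shift = shift.saturating_add(7);
    }
}
```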
I've come up with a new solution that involves a simple while loop that reads until the end of the LEB128 value, meaning that valid LEB128-encoded values cannot be accidentally split in half, even if they are too big to fit into a `u64`.
See my pull request here: https://github.com/gimli-rs/leb128/pull/15
I'm currently using `leb128` to read LEB128-encoded numbers from a stream of data (`TcpStream`) and I encountered a serious bug that caused my application to generate corrupt/incorrect output and could possibly allow for a DoS attack (in my case).

To describe this bug, assume that `cursor` is my `TcpStream` and that I am attempting to read TWO (2) LEB128-encoded numbers from it. Without an overflow, this library works fine:
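For example, with a `Cursor` standing in for the real `TcpStream` (the byte values are illustrative: 624485, a standard LEB128 example, followed by 1):

```rust
use std::io::Cursor;

fn main() {
    // Two unsigned LEB128 numbers back to back:
    // 624485 (0xE5 0x8E 0x26) and 1 (0x01).
    let bytes: [u8; 4] = [0xE5, 0x8E, 0x26, 0x01];
    let mut cursor = Cursor::new(bytes);

    println!("{:?}", leb128::read::unsigned(&mut cursor)); // Ok(624485)
    println!("{:?}", leb128::read::unsigned(&mut cursor)); // Ok(1)
    // All four bytes are consumed; the stream ends exactly where expected.
}
```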
However, when an overflow occurs while reading a very long LEB128 value, a phantom 3rd number appears!
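With the early-exit behaviour explained below, the failure looks something like this (again a sketch with illustrative byte values):

```rust
use std::io::Cursor;

fn main() {
    // One valid LEB128 value that is too large for a u64 (11 bytes),
    // followed by the ONE other number we actually meant to send: 42.
    let bytes: [u8; 12] = [
        0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x0F, // oversized value
        0x2A, // second number: 42
    ];
    let mut cursor = Cursor::new(bytes);

    // First read: Err(Overflow), but only the first ten bytes are consumed.
    println!("{:?}", leb128::read::unsigned(&mut cursor));

    // Second read: Ok(15) -- a phantom number built from the leftover
    // tail byte (0x0F) of the first value.
    println!("{:?}", leb128::read::unsigned(&mut cursor));

    // Third read: Ok(42) -- the number that should have been the second.
    println!("{:?}", leb128::read::unsigned(&mut cursor));
}
```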
This happens because both `leb128::read::signed` and `leb128::read::unsigned` exit early if an overflow occurs. The condition that causes `return Err(Error::Overflow);` to execute can evaluate to `true` before the entire LEB128 value has been read, leaving behind extra bytes that can cause serious issues.
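In simplified form, the problematic read loop has roughly this shape (an illustrative sketch of the unsigned reader, not the crate's exact source):

```rust
use std::io::Read;

// Illustrative sketch of the early-exit pattern.
fn unsigned_early_exit<R: Read>(r: &mut R) -> Result<u64, String> {
    let mut result: u64 = 0;
    let mut shift: u32 = 0;
    let mut buf = [0u8; 1];
    loop {
        r.read_exact(&mut buf).map_err(|e| e.to_string())?;
        // Past bit 63 nothing more fits in a u64, so the reader bails out
        // here -- even though bytes with the continuation bit (0x80) set
        // may still be queued up as part of this same value.
        if shift == 63 && buf[0] != 0x00 && buf[0] != 0x01 {
            return Err("overflow".to_string()); // leftover bytes remain unread
        }
        result |= u64::from(buf[0] & 0x7F) << shift;
        if buf[0] & 0x80 == 0 {
            return Ok(result);
        }
        shift += 7;
    }
}
```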