dmendel / bindata

BinData - Reading and Writing Binary Data in Ruby
BSD 2-Clause "Simplified" License

Arrays are fully allocated before checking to see if they can be read #130

Closed byteit101 closed 3 years ago

byteit101 commented 4 years ago
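require 'bindata'
require 'stringio'

class Example < BinData::Record
  endian :little
  # 200 million elements are declared up front, regardless of input size
  array :data, type: :uint8, initial_length: 200_000_000
end

# Empty input: not a single element can actually be read
Example.read(StringIO.new(""))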

I expected this to immediately throw an EOFError, but instead it hangs around allocating the entire array first, taking several minutes, and only then throws the EOFError (provided it doesn't run out of memory first). The uint8_array type gets this right and does throw immediately.

dmendel commented 4 years ago

Yes. BinData is optimised for the common case where the parsing succeeds.

If your use case expects the parsing to fail, use read_until instead of initial_length.

byteit101 commented 4 years ago

My "use case" was a typo. I do expect the parsing to succeed for normal situations though.

While writing my code, I accidentally declared a uint32 length where a uint16 length was expected, via a copy-paste error. When I ran the script, it was taking suspiciously long and I saw my memory disappearing, then it began swapping out before I managed to ctrl-c it. It was the UX and behavior of the failure that concerned me more than the failure itself, which is why I thought of reporting it.
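To illustrate how a uint16/uint32 mix-up can produce an enormous length, here is a plain-Ruby sketch that does not use BinData at all. The checked_length helper is hypothetical, not part of any library, and assumes a seekable IO that responds to size and pos (such as StringIO or File). The header bytes are invented for the example.

```ruby
require 'stringio'

# Hypothetical helper (not part of BinData): rejects a declared element
# count that cannot fit in the bytes remaining in the stream.
def checked_length(io, declared, element_size = 1)
  remaining = io.size - io.pos
  needed = declared * element_size
  if needed > remaining
    raise EOFError, "need #{needed} bytes, only #{remaining} remain"
  end
  declared
end

# A uint16 length of 3 followed by two payload bytes.
header = "\x03\x00\xff\xff".b
as_uint16 = header.unpack1('v')  # intended: little-endian 16-bit
as_uint32 = header.unpack1('V')  # the typo: little-endian 32-bit

puts as_uint16
puts as_uint32

io = StringIO.new("abc")
puts checked_length(io, as_uint16)  # fits: 3 bytes are available

begin
  checked_length(io, as_uint32)     # cannot fit in a 3-byte stream
rescue EOFError => e
  puts "rejected: #{e.message}"
end
```

Read as 32 bits, the same four bytes swallow two payload bytes into the length, which is how a small typo turns into a request for billions of elements. A check like this only works when the stream size is known in advance.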

dmendel commented 3 years ago

There are existing uses that rely on arrays being fully allocated before being read. Thanks for the report, but I cannot make this change while maintaining backward compatibility.