Closed stefan-kolb closed 8 years ago
BinData doesn't provide read-ahead and pushback for streams, which is what you'd need for your use case.
It's a useful feature so I'll consider adding it.
If your stream is seekable, you could do two passes. The first pass would find the offsets of your specific byte sequences, and the second pass would perform the actual parsing.
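A minimal sketch of that two-pass idea, using only Ruby's standard library. The helper names `find_offsets` and `read_fields`, the `MARK` marker, and the 16-bit big-endian field after each marker are all made up for illustration; in practice the second pass would hand the stream to your BinData records instead.

```ruby
require 'stringio'

# Pass 1: collect the absolute offset of every occurrence of `marker`.
# Reads in chunks and keeps a short tail between chunks so that matches
# straddling a chunk boundary are still found.
def find_offsets(io, marker, chunk_size = 4096)
  marker = marker.b
  raise ArgumentError, "marker must not be empty" if marker.empty?
  offsets = []
  io.rewind
  base = 0        # absolute offset of buffer[0]
  buffer = "".b
  while (chunk = io.read(chunk_size))
    buffer << chunk
    from = 0
    while (idx = buffer.index(marker, from))
      offsets << base + idx
      from = idx + 1
    end
    # The tail is shorter than the marker, so no match is counted twice.
    keep = [marker.bytesize - 1, buffer.bytesize].min
    base += buffer.bytesize - keep
    buffer = buffer.byteslice(buffer.bytesize - keep, keep)
  end
  offsets
end

# Pass 2: seek to each offset and parse what follows the marker.
# Here the "parsing" is just a hypothetical 16-bit big-endian integer.
def read_fields(io, offsets, marker_size)
  offsets.map do |offset|
    io.seek(offset + marker_size)
    io.read(2).unpack1("n")
  end
end

data = "junk\x00MARK\x00\x2Amore junkMARK\x01\x00".b
io = StringIO.new(data)
offsets = find_offsets(io, "MARK", 5)   # tiny chunk size to exercise boundaries
values = read_fields(io, offsets, 4)
```

This only works when the stream supports `rewind` and `seek`; for pipes or sockets a single forward scan is needed instead.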
Yeah, that's what I'm looking for. Something like skip :to => byte_sequence,
where we scan forward from the current stream position with a ring buffer the size of the byte_sequence
and set the position to the next byte after the sequence.
Pushback would also be nice :smile:
Btw, thanks a lot for your work on this awesome library :+1:
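The ring-buffer scan described in this comment could look roughly like the sketch below. `skip_past` is a hypothetical helper written in plain Ruby, not a BinData API. It only calls `getbyte`, so it should also work on streams that cannot seek, and it leaves the stream positioned on the byte right after the sequence.

```ruby
require 'stringio'

# Hypothetical helper (not part of BinData): consume bytes from `io` until
# `sequence` has just been read. Returns true with the stream positioned
# immediately after the sequence, or false if EOF is reached first.
def skip_past(io, sequence)
  needle = sequence.bytes
  raise ArgumentError, "sequence must not be empty" if needle.empty?

  size = needle.size
  ring = Array.new(size)  # holds the most recent `size` bytes
  pos = 0                 # next slot to overwrite; also the oldest byte once full
  count = 0               # total bytes consumed so far

  while (byte = io.getbyte)
    ring[pos] = byte
    pos = (pos + 1) % size
    count += 1
    next if count < size  # ring not full yet

    # Compare the ring, read from oldest to newest, against the needle.
    return true if needle.each_index.all? { |i| ring[(pos + i) % size] == needle[i] }
  end
  false
end

io = StringIO.new("abcdef")
found = skip_past(io, "cde")
rest = io.read            # whatever follows the sequence
```

Because the scan stops as soon as the last byte of the sequence is consumed, no pushback is needed for this particular case. Pushback would only matter if the stream should end up positioned at the start of the sequence.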
Added in dmendel@4232239.
You can now skip to any BinData expression, not just a byte sequence. Syntax is:
    class A < BinData::Record
      skip do
        string :read_length => 4, :assert => "abcd"
      end
      # we are now aligned to 'abcd'
    end
Wow, that was fast :smile:! Thank you so much! Just two questions:
- Do we really need a :read_length here? Shouldn't this be determined by the size of the assertion?
- The code does not only search in :read_length chunks?! If so, we would not find chunks like cde in abcdef if we set read_length to 3 and start from zero offset. Maybe an explicit test for this should be added.
> Do we really need a :read_length here? Shouldn't this be determined by the size of the assertion?

Yes, we do need :read_length for clarity when using multibyte characters. This was discussed previously here: https://github.com/dmendel/bindata/issues/40

> The code does not only search in :read_length chunks?!

There's a more detailed example in the wiki. https://github.com/dmendel/bindata/wiki/AdvancedIO#skipping-over-unused-data

> If so, we would not find chunks like cde in abcdef if we set read_length to 3 and start from zero offset.

The search strategy is byte by byte, not chunk by chunk. If you find a case where it doesn't work, please file a bug.
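To illustrate the difference with the cde in abcdef case from the question, here is a small stdlib-only comparison. Both functions are hypothetical and are not BinData's implementation; they only contrast advancing one byte at a time with advancing a whole chunk at a time.

```ruby
# Chunk-by-chunk: only tests offsets 0, n, 2n, ... where n is the needle size.
def chunked_find(data, needle)
  size = needle.bytesize
  (0..data.bytesize - size).step(size).find do |offset|
    data.byteslice(offset, size) == needle
  end
end

# Byte-by-byte: tests every offset, so unaligned matches are found too.
def bytewise_find(data, needle)
  size = needle.bytesize
  (0..data.bytesize - size).find do |offset|
    data.byteslice(offset, size) == needle
  end
end

chunked = chunked_find("abcdef", "cde")    # compares "abc" and "def" only
bytewise = bytewise_find("abcdef", "cde")  # compares "abc", "bcd", "cde", ...
```

The chunked version misses the match because cde starts at offset 2, which is not a multiple of 3. The byte-by-byte version finds it there.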
Great, thanks for the explanation! :+1:
Hi,
I have a stream of bytes containing different data structures, for some of which I don't know the exact structure. However, I know certain byte constants that will occur inside the stream. Is there any way to search for a specific byte sequence inside a stream and skip to it?