dmendel / bindata

BinData - Reading and Writing Binary Data in Ruby
BSD 2-Clause "Simplified" License
577 stars 55 forks source link

Zero-length integer types are allowed, but do not work #84

Closed ktims closed 7 years ago

ktims commented 7 years ago

BinData allows the use of Uint0 types, but (at least) num_bytes is not functional with them due to BinData::Int.define_methods failing (I think).

I'm not a Ruby developer, I came across this issue trying to debug a parsing issue in logstash-netflow, where a dynamic field set as length 0 was crashing the parser, so take the commentary with a grain of salt.

Here is a simple test case:

class TestClass < BinData::Record
  endian :big
  uint0 :len
end

irb(main):016:0> a = TestClass.new(:len => 0)
=> {:len=>0}
irb(main):017:0> a.num_bytes
NotImplementedError: NotImplementedError
        from /usr/share/logstash/vendor/bundle/jruby/1.9/gems/bindata-2.3.4/lib/bindata/base_primitive.rb:229:in `value_to_binary_string'
        from /usr/share/logstash/vendor/bundle/jruby/1.9/gems/bindata-2.3.4/lib/bindata/base_primitive.rb:135:in `do_num_bytes'
        from /usr/share/logstash/vendor/bundle/jruby/1.9/gems/bindata-2.3.4/lib/bindata/struct.rb:257:in `sum_num_bytes_below_index'
...

I believe this is caused by the code around int.rb:129 and similar blocks. In case of 0 length, vals here is [], so the next reference to vals[0] is nil and thus doesn't have the called method. This causes all (or some of? not sure on the semantics of module_eval) the defined methods to not get defined, and BinData::BasePrimitive methods to be called instead, which ultimately fails later in num_bytes (and probably elsewhere).

It looks like adding unless vals.length.zero? after the substitution should solve the problem and generate correct output.

dmendel commented 7 years ago

Uint0 is nonsensical. Thanks for the report.

Fixed in 351c2ba.

ktims commented 7 years ago

In the case that you do not want to support it, it should at least be bounds-checked at definition time and provide a sane exception, not a somewhat difficult to track down one when calling inspect or num_bytes.

I don't necessarily agree that it is nonsensical; it does not carry any information, but it does have a clear and well defined meaning. It's a number that takes no storage and is always zero; an empty placeholder. In dynamic TLV data structures, where one fixed structure is used to define 'templates' for other dynamic structures that are in use, it may be necessary to indicate that a field has zero length. This can be special-cased of course, but it seems sensible to allow it in BinData, to allow constructions like netflow.rb:245:

template.scope_fields.each do |field|
            fields << [uint_field(0, field.field_length), NETFLOW9_SCOPES[field.field_type]]
end

The format cannot be changed, and some consumers may expect the field to be available, so the special casing gets a bit ugly.

I have fixed this for my own use by checking the vals.length as I suggested. If you would rather not support this, I think you should assert it when the type is created. I am not sure what upstream logstash-netflow intends to do.

dmendel commented 7 years ago

The fix I committed above does provide definition time checks

p BinData::Uint8be #=> "BinData::Uint8be"
p BinData::Uint7be #=> NameError
p BinData::Uint0be #=> NameError

An integer of zero bytes doesn't have a well defined meaning. It's value could arguably be 0 or NaN, depending on expectation.

No common programming language defines an integer datatype consuming zero bytes, so it's not a general case for BinData to provide.

If you wish for this functionality then you should explicitly define it in your own code.

module BinData
  class Uint0 < BinData::BasePrimitive
    def do_read(io); end
    def do_write(io); end

    def do_num_bytes
      0
    end

    def sensible_default
      0
    end
  end
end
ktims commented 7 years ago

Thanks, I'll try to implement this fix downstream.