kutsurak / python-bitstring

Automatically exported from code.google.com/p/python-bitstring

Add support for numbered arrays #131

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I am working with a really complex binary spec (LLRP).

Some of its fields are prefixed by a size followed by that number of data 
elements (array).

It would be really useful to have a syntax to describe these elements in 
bitstring.
Consider adding the following syntax:

    SizeSpec[DataSpec]

That is, a SizeSpec followed by a DataSpec enclosed in square brackets to 
indicate that an array follows.

Examples:

    uintbe:16[bytes:8]    # A 16-bit length array of bytes.  Returns a string (e.g. "abcdef")
    uintbe:32[uintbe:32]  # A 32-bit length array of words.  Returns a list of numbers (e.g. [1,2,3,4])
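
To make the two layouts concrete, here is a minimal sketch that decodes them by hand with the standard library's `struct` module rather than with bitstring. The helper names `read_byte_string` and `read_word_array` are hypothetical, and the sketch assumes that the prefix counts elements, that `bytes:8` means 8-bit bytes, and that everything is big-endian.

```python
import struct


def read_byte_string(buf, offset=0):
    """Decode a 16-bit big-endian count followed by that many bytes.

    Hypothetical helper (not part of bitstring).
    Returns (payload, offset_after_payload).
    """
    (count,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + count
    if end > len(buf):
        raise ValueError("buffer too short for the declared length")
    return buf[start:end], end


def read_word_array(buf, offset=0):
    """Decode a 32-bit big-endian count followed by that many 32-bit words.

    Hypothetical helper (not part of bitstring).
    Returns (list_of_ints, offset_after_array).
    """
    (count,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + 4 * count
    if end > len(buf):
        raise ValueError("buffer too short for the declared length")
    words = list(struct.unpack_from(">%dI" % count, buf, start))
    return words, end


# Build a sample buffer holding both layouts back to back.
data = struct.pack(">H6s", 6, b"abcdef") + struct.pack(">I4I", 4, 1, 2, 3, 4)

text, pos = read_byte_string(data)
words, pos = read_word_array(data, pos)
print(text, words)
```

Each helper returns the next offset as well as the value, so the calls can be chained over a larger message.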

It would also be great to support composite arrays.  For example:

    uint:32[uint:32, uint:16, uint:16]

    # A 32-bit length array of elements consisting of 32, 16 and 16 bit numbers.  Returns a list of lists (e.g. [[1,2,3],[4,5,6],[7,8,9]])

Finally, and I know this is a tall order, it would be great to support arrays 
inside arrays.  For example:

    uintbe:16[uintbe:8[bytes:8]]
    # A 16-bit length array of 8-bit length-prefixed strings.  Returns a list of strings (e.g. ['abc', 'def'])

    uintbe:16[uintbe:32, uintbe:8[bytes:8]]
    # A 16-bit length array of elements consisting of a 32-bit number followed by a length-prefixed string.  Returns a list of lists (e.g. [[12,'Buckle my shoe'], [56, 'Pick up sticks']])
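
As a rough illustration of what the last nested example would have to do, the sketch below decodes that layout by hand with the standard library's `struct` module, not with bitstring. The names `read_records` and `pack_record` are hypothetical, and the sketch assumes big-endian fields, an outer prefix that counts records, and an inner prefix that counts 8-bit bytes.

```python
import struct


def read_records(buf, offset=0):
    """Decode a uint16 record count, then for each record a uint32
    followed by a uint8-length-prefixed byte string (all big-endian).

    Hypothetical helper. Returns (records, offset_after_last_record).
    """
    (count,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    records = []
    for _ in range(count):
        # Fixed part of the record: the number and the string length.
        number, length = struct.unpack_from(">IB", buf, offset)
        offset += 5
        text = buf[offset:offset + length]
        if len(text) != length:
            raise ValueError("buffer ends inside a string")
        offset += length
        records.append([number, text])
    return records, offset


def pack_record(number, text):
    """Encode one record in the same layout (used only to build test data)."""
    return struct.pack(">IB", number, len(text)) + text


data = (
    struct.pack(">H", 2)
    + pack_record(12, b"Buckle my shoe")
    + pack_record(56, b"Pick up sticks")
)

records, end = read_records(data)
print(records)
```

The inner loop is where the nesting shows up: the length of each record is only known after its own prefix has been read, so the records cannot be sliced out in one step.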

I recognize that this could complicate things *a lot*, but I figured I could 
always ask anyway.

Thanks much.  Your library has saved me days of work already.

Original issue reported on code.google.com by edward.s...@gmail.com on 6 Dec 2012 at 3:31

GoogleCodeExporter commented 9 years ago
Hi, I'm not sure I understand the format you're suggesting. Is it this?

uintbe:16[bytes:8] : read 16 bits and interpret as an unsigned big-endian 
integer. Then read that many 8 byte bitstrings.

(The 'bytes' token reads in bytes, not bits, so I suspect you may have meant 
'bytes:1')

If so then I think my preferred notation would be:

    s.readlist('uintbe:16 = a, a*bytes:8')

This is a little bit more flexible and needs less explanation at the expense of 
an added variable, and would pretty much be done if I implement issue 123.

Your composite example then becomes

    uint:32=a, a*(uint:32, 2*uint:16)

but this would return a flat list... which isn't really the same thing.
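
If a flat list is what comes back, the nesting can be restored afterwards in plain Python. The sketch below is one way to do that; `group_records` is a hypothetical helper, and it assumes the count has already been taken off the front of the list and that every element has the same number of fields.

```python
def group_records(flat, width):
    """Split a flat list into consecutive rows of `width` items each.

    Hypothetical helper for regrouping flat parse results.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if len(flat) % width != 0:
        raise ValueError("flat list length is not a multiple of width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# Nine values parsed as three (uint:32, uint:16, uint:16) elements.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]
rows = group_records(flat, 3)
print(rows)
```

This only works for fixed-width elements; elements that contain their own length-prefixed fields would need a loop that reads one element at a time.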

Personally I think that the complexity should go to the code rather than to the 
bitstring syntax - it's easier to write a loop in Python than have one implied 
in an obscure markup:

    def readarray(s, size, fmt):
        # Read the element count, then one list per element.
        return [s.readlist(fmt) for _ in range(s.read(size))]

    readarray(s, 'uint:32', 'uint:32, 2*uint:16')

which can be made into a simple function call if it's going to get used a lot.

So after all that I'm coming to the opinion that it's not a general enough 
usage to justify the change, which in any case sounds a bit difficult!

Glad you're finding the library useful. Cheers, Scott.

Original comment by dr.scott...@gmail.com on 6 Dec 2012 at 4:15

GoogleCodeExporter commented 9 years ago

Original comment by dr.scott...@gmail.com on 9 Dec 2012 at 4:06