clojurewerkz / buffy

Buffy The ByteBuffer Slayer, Clojure library for working with binary data.

Proper bit type? #7

Closed Jell closed 10 years ago

Jell commented 10 years ago

Is it possible to have a "bit" field as in "there are 8 bits in a byte and I want only one"?

It is very surprising for me that "bit-type n" creates a field of "8*n" bits. Why not "n" bits? Even "boolean" actually becomes a byte.

My understanding is that fields in buffy must be multiples of 8 bits, but that the only actual restriction should be that the sum of all the fields in a spec is a multiple of 8?

ifesdjeen commented 10 years ago

Is it possible to have a "bit" field as in "there are 8 bits in a byte and I want only one"?

In Java, there's pretty much no way to go down to particular bits. You will get a byte as the smallest granularity, nothing less. But yes, it is totally possible. Just say (bit-type 1), and use a single bit of it.

It is very surprising for me that "bit-type n" creates a field of "8*n" bits. Why not "n" bits? Even "boolean" actually becomes a byte.

This statement is incorrect. (bit-type n) is a constructor. We can only allow you to size your bit field in multiples of 8 bits. By giving n you only specify the size of your bit mask in bytes. After that, you operate on bit masks, and can have your own conversions and composition. So a bit-type of size 8*2 bits gives you the opportunity to use 16 booleans.

My understanding is that fields in buffy must be multiples of 8 bits, but that the only actual restriction should be that the sum of all the fields in a spec is a multiple of 8?

In Java, there's no way to operate on individual bits except through bit shifts and boolean ops. Disallowing anything smaller than a byte as a building block is rather a limitation of the JVM and is generally accepted in computing. Even people doing C operate on bytes; I haven't seen people allocating a memory block of 13 bits, tbh.
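
As an illustration of the bit shifts and boolean ops mentioned above, here is a minimal sketch that uses only clojure.core. The helper names bit-at and with-bit are made up for this example and are not part of buffy, and bit 0 is assumed to be the least significant bit.

```clojure
;; Sketch only: treat one byte-sized number as eight flags.
;; Assumption: bit 0 is the least significant bit.

(defn bit-at
  "Returns 1 or 0 for bit `idx` of `b`, using a shift and a mask."
  [b idx]
  (bit-and 1 (bit-shift-right b idx)))

(defn with-bit
  "Returns `b` with bit `idx` set when `on?` is truthy, cleared otherwise."
  [b idx on?]
  (if on?
    (bit-set b idx)
    (bit-clear b idx)))

;; Set the lowest and the highest bit of an otherwise empty byte.
(def flags
  (-> 0
      (with-bit 0 true)
      (with-bit 7 true)))
```

Here flags ends up as 2r10000001, so (bit-at flags 7) is 1 and (bit-at flags 3) is 0. The other seven bits of the byte still travel with the flag you care about.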

michaelklishin commented 10 years ago

I think the point is that the entire payload should be byte aligned, but you can add, say, eight 1-bit fields instead of a single 8-bit field.

I don't know how much value there is in such a feature, but at least theoretically it is possible. If there is no alignment, an exception should be thrown.

I -1 this as it will both increase complexity and add performance overhead for something a very small % of use cases need.

ifesdjeen commented 10 years ago

Ok, I misunderstood it a bit, edited my response now.

I first started with adjusting size to quotient multiplied by 2, but decided to refrain from that, since when people deal with bytes, they want to see the size.

That said, specifying n in bit-field doesn't contradict anything, and keeps things consistent across the entire library. You just specify the byte size all the time. The bit size is calculated by multiplying by 8. If you want to use a single bit, create a one-byte field and carry 7 wasted bits.

Hope that helps.

Jell commented 10 years ago

Hum I guess I did not express my problem very well. Here is my use case: I'm trying to communicate with a device over a serial port, and the data I need to send looks like this (with number of bits):

8 message type | 2 offset flags | 8 speed | 1 forward/backward | 1 on/off | 8 speed | 1 forward/backward | 1 on/off | 2 padding

So the whole message is 32 bits long, but the fields are not 8-bit aligned. I do not want to use a (bit-type 4) because then I have to do all the bit-wise manipulation myself (which kind of defeats the point of using this library).
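
For reference, the manual bit-wise manipulation in question might look roughly like the sketch below. It uses only clojure.core; layout, pack-message and word->bytes are hypothetical names, not buffy functions, and it assumes that the first field sits in the most significant bits of the 32-bit word.

```clojure
;; Sketch only: pack the 32-bit message by hand with shifts and masks.
;; Assumption: fields are laid out in order, most significant bits first.

(def layout
  [[:type 8] [:offset 2]
   [:m1-speed 8] [:m1-reverse? 1] [:m1-on? 1]
   [:m2-speed 8] [:m2-reverse? 1] [:m2-on? 1]
   [:padding 2]])

(defn- ->int
  "Maps booleans to 1/0 and leaves numbers untouched."
  [v]
  (cond (true? v) 1
        (false? v) 0
        :else v))

(defn pack-message
  "Packs a map of field values into a single 32-bit number.
  Missing fields default to 0; values are masked to the field width."
  [msg]
  (reduce (fn [acc [field width]]
            (let [v    (->int (get msg field 0))
                  mask (dec (bit-shift-left 1 width))]
              (bit-or (bit-shift-left acc width)
                      (bit-and v mask))))
          0
          layout))

(defn word->bytes
  "Splits a 32-bit number into four byte values, most significant first."
  [w]
  (mapv #(bit-and 0xFF (bit-shift-right w %)) [24 16 8 0]))

(def example
  {:type 0x1 :offset 2r10
   :m1-speed 255 :m1-reverse? false :m1-on? true
   :m2-speed 0   :m2-reverse? true  :m2-on? false})
```

With these assumptions, (word->bytes (pack-message example)) gives [0x01 0xBF 0xD0 0x08]. Note how :m1-speed straddles the second and third bytes, which is exactly the bookkeeping a bit-level field type would hide.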

What I would love to write is this:

(spec :type (enum-type (byte-type) {0x1 :TYPE1, 0x2 :TYPE2})
      :offset (n-bits-type 2)
      :m1-speed (byte-type)
      :m1-reverse? (n-bits-type 1)
      :m1-on? (n-bits-type 1)
      :m2-speed (byte-type)
      :m2-reverse? (n-bits-type 1)
      :m2-on? (n-bits-type 1)
      :padding (n-bits-type 2))

Instead of:

(spec :type (enum-type (byte-type) {0x1 :TYPE1, 0x2 :TYPE2})
      :payload (bit-type 3))
;; bit-manipulation ensues

ifesdjeen commented 10 years ago

I would say I see just a single reasonable solution: allocate a 3-byte-long bit field (since the enum is quite fine where it is) and implement your own data type, which will use the bit field internally.

The data type will just convert all the required data types (for example, integers) to sequences of 1s and 0s, that is, to bitmasks. This way you can also employ single-bit fields internally.

So for example:

[bit-on bit-off 1 2]

Will expand to something like

[1 0 0000 0000 0000 0001  0000 0000 0000 0010]

You can pass these straight to the bit-type as a value, and write it. In order to convert a single byte to a bitmask, you can use the same code we use internally: https://github.com/clojurewerkz/buffy/blob/master/src/clojurewerkz/buffy/types.clj#L96-L100
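
A rough sketch of that expansion in plain Clojure follows. int->bits and bits->int are hypothetical helpers written for this example, not the internal buffy code linked above, and the most-significant-bit-first ordering is an assumption that may differ from what bit-type expects.

```clojure
;; Sketch only: expand integers into sequences of 1s and 0s and back.
;; Assumption: each value is written most significant bit first.

(defn int->bits
  "Returns the lowest `width` bits of `n` as a vector of 0/1."
  [n width]
  (mapv #(if (bit-test n %) 1 0)
        (range (dec width) -1 -1)))

(defn bits->int
  "Folds a sequence of 0/1 (most significant first) back into a number."
  [bits]
  (reduce (fn [acc b] (bit-or (bit-shift-left acc 1) b)) 0 bits))

;; Two single-bit flags followed by the integers 1 and 2, 16 bits each,
;; mirroring the [bit-on bit-off 1 2] example.
(def expanded
  (into [1 0] (mapcat #(int->bits % 16)) [1 2]))
```

The result has 34 entries, so a real payload would still need padding up to a whole number of bytes before it could be written to a byte-sized bit field.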

Despite the fact it's not implemented natively, it's quite easy to implement on the side. After all, all the functionality required is already there.

I'll think a bit more about whether it's possible to generalize the solution without involving conversion to the bitmask.

Jell commented 10 years ago

I see. Would it be possible then to provide a "bit-map-type" that would work as follows:

(spec :type (enum-type (byte-type) {0x1 :TYPE1, 0x2 :TYPE2})
      :payload (bit-map-type :offset 2 :m1-speed 8 ...))

Then:

(set-field buffer :payload {:offset 2r00, :m1-speed 255, ...})

ifesdjeen commented 10 years ago

I've added to-bit-map and from-bit-map converters for you for all the available data types. You can use them.

If you want, you can work on a patch for bit-map-type based on a similar solution.
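
One possible starting point for such a patch is sketched below in plain Clojure. Everything here is hypothetical: encode-bit-map, decode-bit-map and check-aligned! are illustrative names, nothing is wired into buffy's type system, values are assumed to be non-negative integers (0 or 1 for flags), and each field is assumed to be written most significant bit first. The alignment check follows the suggestion earlier in the thread that an unaligned layout should throw.

```clojure
;; Sketch only: a bit map described as ordered [field width] pairs.

(def payload-fields
  [[:offset 2]
   [:m1-speed 8] [:m1-reverse? 1] [:m1-on? 1]
   [:m2-speed 8] [:m2-reverse? 1] [:m2-on? 1]
   [:padding 2]])

(defn check-aligned!
  "Returns the total width in bits, or throws if it is not a multiple of 8."
  [fields]
  (let [total (reduce + (map second fields))]
    (when-not (zero? (mod total 8))
      (throw (IllegalArgumentException.
              (str "Bit map is " total " bits wide, not a multiple of 8"))))
    total))

(defn encode-bit-map
  "Turns a map of integer values into a vector of 0/1 bits.
  Missing fields default to 0."
  [fields values]
  (check-aligned! fields)
  (vec (mapcat (fn [[k width]]
                 (let [v (get values k 0)]
                   (map #(if (bit-test v %) 1 0)
                        (range (dec width) -1 -1))))
               fields)))

(defn decode-bit-map
  "Reads a sequence of 0/1 bits back into a map of integer values."
  [fields bits]
  (check-aligned! fields)
  (first
   (reduce (fn [[out remaining] [k width]]
             (let [[chunk tail] (split-at width remaining)]
               [(assoc out k (reduce #(+ (* 2 %1) %2) 0 chunk)) tail]))
           [{} bits]
           fields)))
```

Under these assumptions, (decode-bit-map payload-fields (encode-bit-map payload-fields {:offset 2 :m1-speed 255})) returns a map with every declared field, where :offset is 2, :m1-speed is 255 and the rest are 0.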

For now, please check: https://github.com/clojurewerkz/buffy#working-with-bytes and the latest released version.

ifesdjeen commented 10 years ago

Correct link: https://github.com/clojurewerkz/buffy#working-with-bits

Jell commented 10 years ago

Thanks for your help! I'll try to work on a bit-map-type when I get the time :)

Cheers!