helins / binf.cljc

Handling binary formats in all shapes and forms
Mozilla Public License 2.0

Read/write clojure data structures from/to a view #6

Open sh54 opened 3 years ago

sh54 commented 3 years ago

This adds functions to helins.binf.cabi that read and write a Clojure data structure according to a layout.

Example

(let [v some-binf-view
      env (binf.cabi/env w32)
      layout ((binf.cabi/struct :my-struct
                                [[:a binf.cabi/i32]
                                 [:b binf.cabi/i32]]) env)
      e {:a 10 :b 20}]
  (binf.cabi/wr-cabi v layout e)
  (binf/seek v 0)
  (let [r (binf.cabi/rr-cabi v layout)]
    (t/is (= e r))))

Notes

I have some code in another project that builds up layouts in a way similar to helins.binf.cabi, so I looked into switching over to it. That project also had functions for reading and writing a Clojure data structure to memory according to a layout, and those were not too tricky to port over.

I have done the necessary work so that it can be used for GPU-friendly layouts. The main blocker there was representing things like a vector, which needs to be aligned according to its size rather than the alignment of the numbers it holds.

So a vector of 3 x f32 has a size of 12 bytes and a desired alignment of 16 bytes.

Thus it should look something like:

{:binf.cabi/align 16,
 :binf.cabi/n-byte 12,
 :binf.cabi/type :array,
 :binf.cabi.array/element #:binf.cabi{:align 4, :n-byte 4, :type :f32},
 :binf.cabi.array/n-element 3}

This does not seem possible to produce with the existing functions in the cabi namespace. Something like (force-align (array f32 3) 16) just results in a :binf.cabi/align of 4 on the resulting layout.

I added the function (vector type description-fn n-element stride) to help with this, for example (vector :vec3 f32 3 16). It just produces an array, but with :binf.cabi/align set to stride; type only provides a touch of metadata. I am happy with it producing an array rather than introducing a new :binf.cabi/type. A matrix is then just a vector of vectors.
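To make the alignment issue concrete, here is a minimal sketch in plain Clojure, independent of binf. `align-up` and `vector-layout` are hypothetical names, not functions from helins.binf.cabi. `vector-layout` takes an already-resolved element map rather than a description function, and keeping `type` as Clojure metadata under `:example/vector-type` is only a guess at what "a touch of metadata" could look like.

```clojure
(defn align-up
  "Rounds `offset` up to the next multiple of `align`."
  [offset align]
  (* align (quot (+ offset (dec align)) align)))

(def f32
  ;; Resolved f32 description, copied from the example map above.
  #:binf.cabi{:align 4, :n-byte 4, :type :f32})

(defn vector-layout
  "Hypothetical: describes `n-element` packed elements as an array whose
   alignment is forced to `stride`."
  [type element n-element stride]
  (with-meta
    {:binf.cabi/align           stride
     :binf.cabi/n-byte          (* n-element (:binf.cabi/n-byte element))
     :binf.cabi/type            :array
     :binf.cabi.array/element   element
     :binf.cabi.array/n-element n-element}
    ;; Assumption: the vector's name travels as metadata only.
    {:example/vector-type type}))

(def vec3 (vector-layout :vec3 f32 3 16))

;; Offset of a vec3 placed right after a single f32 (which ends at byte 4):
(align-up 4 (:binf.cabi/align f32))   ; using the element alignment => 4
(align-up 4 (:binf.cabi/align vec3))  ; using the vector alignment  => 16
```

With the element alignment the vector would start at byte 4, whereas the 16-byte alignment pushes it to byte 16, which is the difference a forced alignment on the array has to express.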

Performance has not been examined. Some transients could be used here and there on the read side. For my use I have just needed to read/write small things anyway.

There is no support for reading or writing pointers.

Reading unions needs some more thought.

helins commented 3 years ago

That looks really (really!) neat!

I've been meaning to ultimately do something like that but didn't have the time nor the practical need, so it is really awesome that you have been able to write that and use it for some actual work.

I'll try to find the time to review it more thoroughly, but it looks great. One idea that comes to mind is that it would be quite easy to build on this to produce R/W functions, sort of "compiling" a layout, if you will, as opposed to interpreting it on every operation.

sh54 commented 3 years ago

Thanks for the kind feedback.

Regarding compiling a layout, I see two options: a macro that builds an optimal function from a layout, or a function that generates a set of instructions from a layout plus a "virtual machine" function that executes them. I am not sure either is worth it, though.

I may also have a big blind spot to a totally different way of going about this!

Some structures and their most efficient readers:

(I have not actually tried any of this)

(def my-vertex
  (binf.cabi/struct :my-vertex
                    [[:x binf.cabi/f32]
                     [:y binf.cabi/f32]]))

(def my-struct
  (binf.cabi/struct :my-struct
                    [[:count binf.cabi/i32]
                     [:min binf.cabi/f32]
                     [:max binf.cabi/f32]
                     [:vertices (binf.cabi/array my-vertex 2)]]))

(defn rr-my-vertex
  [view]
  (let [x (binf/rr-f32 view)
        y (binf/rr-f32 view)]
    {:x x
     :y y}))

(defn rr-my-struct
  [view]
  (let [count (binf/rr-i32 view)
        min (binf/rr-f32 view)
        max (binf/rr-f32 view)
        vertices (loop [vertices (transient [])
                        i 0]
                   (if (< i 2)
                     (recur (conj! vertices (rr-my-vertex view)) (inc i))
                     (persistent! vertices)))]
    {:count count
     :min min
     :max max
     :vertices vertices}))

Macro way

It should be possible to generate something like rr-my-struct at macro expansion time. That would not be too hard if you feed the macro a literal map directly, like {:binf.cabi/align 1, :binf.cabi/n-byte 1, :binf.cabi/type :i8}. It is more awkward if you want to feed it a structure built up by the functions in the cabi namespace.

So writing the necessary macro to get the following to work is no big deal:

(def rr-my-struct
  (compile-rr
   {:binf.cabi/align         4,
    :binf.cabi/n-byte        12,
    :binf.cabi/type          :struct,
    :binf.cabi.struct/layout [:a :b :c :d],
    :binf.cabi.struct/member+
    {:a #:binf.cabi{:align 1, :n-byte 1, :type :u8,  :offset 0},
     :b #:binf.cabi{:align 2, :n-byte 2, :type :i16, :offset 2},
     :c #:binf.cabi{:align 4, :n-byte 4, :type :u32, :offset 4},
     :d #:binf.cabi{:align 1, :n-byte 1, :type :i8,  :offset 8}},
    :binf.cabi.struct/type   :foo}))

But compile-rr would need the literal map, which is normally something you generate with the functions in the cabi namespace rather than write by hand.
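A minimal sketch of what such a macro might look like for a literal map. It makes several assumptions that are not part of binf: a `java.nio.ByteBuffer` stands in for a binf view (so this is JVM-only), members are read at their absolute `:binf.cabi/offset`, only flat structs are handled, and `read-form` is a hypothetical helper covering just the four primitive types used in the example.

```clojure
(import 'java.nio.ByteBuffer)

(defn read-form
  "Hypothetical helper: returns the form that reads one primitive of `type`
   at absolute `offset` from the buffer bound to the symbol `buf`."
  [buf type offset]
  (case type
    :i8  `(.get ~buf (int ~offset))
    :u8  `(bit-and (.get ~buf (int ~offset)) 0xff)
    :i16 `(.getShort ~buf (int ~offset))
    :u32 `(bit-and (.getInt ~buf (int ~offset)) 0xffffffff)))

(defmacro compile-rr
  "Expands a literal, flat struct layout into a reader fn of one argument.
   All layout inspection happens at macro expansion time."
  [layout]
  (let [buf     (with-meta (gensym "buf") {:tag 'java.nio.ByteBuffer})
        member+ (:binf.cabi.struct/member+ layout)]
    `(fn [~buf]
       ;; The emitted body is a map literal: member name -> read form.
       ~(into {}
              (map (fn [k]
                     (let [m (get member+ k)]
                       [k (read-form buf
                                     (:binf.cabi/type m)
                                     (:binf.cabi/offset m))])))
              (:binf.cabi.struct/layout layout)))))

(def rr-foo
  (compile-rr
   {:binf.cabi/align         4,
    :binf.cabi/n-byte        12,
    :binf.cabi/type          :struct,
    :binf.cabi.struct/layout [:a :b :c :d],
    :binf.cabi.struct/member+
    {:a #:binf.cabi{:align 1, :n-byte 1, :type :u8,  :offset 0},
     :b #:binf.cabi{:align 2, :n-byte 2, :type :i16, :offset 2},
     :c #:binf.cabi{:align 4, :n-byte 4, :type :u32, :offset 4},
     :d #:binf.cabi{:align 1, :n-byte 1, :type :i8,  :offset 8}},
    :binf.cabi.struct/type   :foo}))

;; Fill a buffer by hand at the same offsets, then read it back.
(def buf
  (doto (ByteBuffer/allocate 12)
    (.put      (int 0) (byte 7))
    (.putShort (int 2) (short -3))
    (.putInt   (int 4) (int 100))
    (.put      (int 8) (byte -1))))

(rr-foo buf) ; => {:a 7, :b -3, :c 100, :d -1}
```

The expansion contains no trace of the layout map, only the four reads, which is what makes it comparable to a handwritten reader.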

It would be more awkward, I think, to get something like:

(def rr-my-struct
  (compile-rr
   ((binf.cabi/struct :foo
                      [[:a binf.cabi/bool]
                       [:b binf.cabi/u16]
                       [:c binf.cabi/i64]
                       [:d binf.cabi/u8]])
    env32)))

to work, especially if you want to define a struct in one variable then reuse it in other structures.
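The awkwardness comes from a macro receiving the unevaluated form rather than the layout map it would evaluate to. The sketch below illustrates that, plus one possible workaround; `seen-by-macro`, `layout-type` and `i8-layout` are hypothetical names. The workaround dereferences a var at expansion time, which works on JVM Clojure only when the var is already defined at that point, and would likely not carry over to ClojureScript as-is.

```clojure
(defmacro seen-by-macro
  "Returns, as data, the form the macro received (not its value)."
  [form]
  `(quote ~form))

(def i8-layout
  {:binf.cabi/align 1, :binf.cabi/n-byte 1, :binf.cabi/type :i8})

;; A literal map arrives as a map the macro can inspect...
(seen-by-macro {:binf.cabi/type :i8}) ; => {:binf.cabi/type :i8}
;; ...but a reference to a layout arrives as a bare symbol.
(seen-by-macro i8-layout)             ; => i8-layout

(defmacro layout-type
  "Workaround sketch: look up the var named by `sym` while expanding and
   inspect its current value."
  [sym]
  (:binf.cabi/type @(resolve sym)))

(layout-type i8-layout) ; => :i8
```

A layout produced by calling a description function with an env, as in the example above, is an arbitrary expression rather than a symbol, so even this trick would need the result to be stored in a var first (or an `eval` of the form inside the macro).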

With the appropriate macros it should be possible to write something like this:

(deflayout my-vertex
  [:struct :my-vertex [[:x binf.cabi/f32]
                       [:y binf.cabi/f32]]])

(deflayout my-struct
  [:struct :my-struct [[:count binf.cabi/i32]
                       [:min binf.cabi/f32]
                       [:max binf.cabi/f32]
                       [:vertices [:array my-vertex 2]]]])

(defrr rr-my-struct
  my-struct {:binf.cabi/align 4})

The downside is that this introduces a second, parallel way of defining layouts.

VM way

I don't see a way to generate exactly the optimal rr-my-struct at runtime without eval. You could instead compile the structure to a list of "virtual machine" instructions and have a function that just iterates through those. That should be somewhat slower than the optimal handwritten version and somewhat faster than interpreting the structure each time. I am not sure the speedup would be worth the implementation cost and the additional code.

pseudocode:

(defn compile-cabi
  [cabi]
  [...list-of-instructions...])

(defn rr-instructions
  [view instructions]
  (loop [instructions instructions]
    (case (xyz (first instructions))
      :i32 (binf/rr-i32 view)
      ...)
    ...))

(defn compile-rr
  [cabi]
  (let [instructions (compile-cabi cabi)]
    (fn [view]
      (rr-instructions view instructions))))

(def rr-my-struct (compile-rr my-struct))
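A runnable sketch of the pseudocode above, under stated assumptions: a `java.nio.ByteBuffer` stands in for a binf view (JVM-only), reads use absolute offsets instead of the relative `binf/rr-*` reads, only structs (possibly nested) of a few primitive types are covered, nested struct members are assumed to carry a `:binf.cabi/offset` like primitive members do, and the instruction shape `[path type offset]` is made up for illustration. The layouts are written by hand with only the keys the sketch needs.

```clojure
(import 'java.nio.ByteBuffer)

(defn compile-cabi
  "Flattens a resolved layout into a vector of [path type offset]
   instructions, one per primitive member."
  ([layout] (compile-cabi layout [] 0))
  ([layout path base]
   (if (= :struct (:binf.cabi/type layout))
     (into []
           (mapcat (fn [k]
                     (let [m (get-in layout [:binf.cabi.struct/member+ k])]
                       (compile-cabi m
                                     (conj path k)
                                     (+ base (:binf.cabi/offset m))))))
           (:binf.cabi.struct/layout layout))
     [[path (:binf.cabi/type layout) base]])))

(defn rr-instructions
  "Executes the instructions against `buf`, rebuilding the nested map."
  [^ByteBuffer buf instructions]
  (reduce (fn [acc [path type offset]]
            (assoc-in acc path
                      (case type
                        :i32 (.getInt buf (int offset))
                        :f32 (.getFloat buf (int offset))
                        :i16 (.getShort buf (int offset))
                        :u8  (bit-and (.get buf (int offset)) 0xff))))
          {}
          instructions))

(defn compile-rr
  "Compiles the layout once and returns a reader fn that reuses the result."
  [cabi]
  (let [instructions (compile-cabi cabi)]
    (fn [buf]
      (rr-instructions buf instructions))))

(def my-vertex
  {:binf.cabi/type           :struct
   :binf.cabi.struct/layout  [:x :y]
   :binf.cabi.struct/member+ {:x #:binf.cabi{:type :f32, :offset 0}
                              :y #:binf.cabi{:type :f32, :offset 4}}})

(def my-struct
  {:binf.cabi/type           :struct
   :binf.cabi.struct/layout  [:count :origin]
   :binf.cabi.struct/member+ {:count  #:binf.cabi{:type :i32, :offset 0}
                              :origin (assoc my-vertex :binf.cabi/offset 4)}})

(def rr-my-struct (compile-rr my-struct))

(def buf
  (doto (ByteBuffer/allocate 12)
    (.putInt   (int 0) (int 2))
    (.putFloat (int 4) (float 1.5))
    (.putFloat (int 8) (float -0.5))))

(compile-cabi my-struct)
;; => [[[:count] :i32 0] [[:origin :x] :f32 4] [[:origin :y] :f32 8]]

(rr-my-struct buf)
;; => {:count 2, :origin {:x 1.5, :y -0.5}}
```

The layout is walked once in `compile-cabi`; each read then costs a `case` dispatch and an `assoc-in`, which is the overhead this approach keeps relative to a handwritten reader.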