Closed v1gnesh closed 1 year ago
Hey,
Thanks for the kind words!
The crates you mention are really cool indeed. At first glance they do not seem to offer the split between data model and format that serde does, so I do not see an obvious way to convert from deku / binrw data to arrow directly.
If you're talking about deku / binrw -> Rust -> arrow, then sure: you can use serde_arrow as is, you just need to specify the schema of your objects, either by tracing a couple of examples using serialize_into_fields or by building the schema yourself. Then you use deku / binrw to construct the Rust objects and use serde_arrow to build the corresponding arrow arrays from these objects.
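To make the deku / binrw -> Rust -> arrow flow concrete, here is a minimal, dependency-free sketch. It does not use serde_arrow's API. `LogRecord`, `parse_record`, `parse_all`, `LogColumns` and `to_columns` are hypothetical names, and the 9-byte little-endian record layout is an assumption. The hand-written parser stands in for what deku / binrw would derive, and the plain `Vec` columns stand in for the arrow arrays that serde_arrow would build.

```rust
// Hypothetical record type. In a real project deku / binrw would derive the
// parser, and serde's derive macros would provide Serialize / Deserialize.
#[derive(Debug, Clone, PartialEq)]
struct LogRecord {
    timestamp: u32,
    level: u8,
    value: f32,
}

// Stand-in for the deku / binrw step. Assumed layout: 9 little-endian bytes
// per record (u32 timestamp, u8 level, f32 value).
fn parse_record(bytes: &[u8]) -> Option<LogRecord> {
    if bytes.len() < 9 {
        return None;
    }
    Some(LogRecord {
        timestamp: u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        level: bytes[4],
        value: f32::from_le_bytes([bytes[5], bytes[6], bytes[7], bytes[8]]),
    })
}

// Parse a buffer of back-to-back records; trailing partial records are ignored.
fn parse_all(data: &[u8]) -> Vec<LogRecord> {
    data.chunks_exact(9).filter_map(parse_record).collect()
}

// Stand-in for the arrow step: one column (Vec) per struct field.
#[derive(Debug, Default, PartialEq)]
struct LogColumns {
    timestamp: Vec<u32>,
    level: Vec<u8>,
    value: Vec<f32>,
}

// Convert row-oriented records into column-oriented vectors.
fn to_columns(records: &[LogRecord]) -> LogColumns {
    let mut cols = LogColumns::default();
    for r in records {
        cols.timestamp.push(r.timestamp);
        cols.level.push(r.level);
        cols.value.push(r.value);
    }
    cols
}
```

Calling `to_columns(&parse_all(&bytes))` turns a byte buffer into one vector per field, which is the row-to-column reshaping that the serde_arrow step performs on the parsed Rust objects.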
Thank you, yeah I mean this option -- deku / binrw -> Rust -> arrow. If you have time, could you share an example of how I'd go about doing this? I'm pretty noob-ish with programming in general. My use case has a whole bunch of nested struct types for binary log data.
Will post about the first method in those 2 projects and see what they think..
With serde_arrow, you have to ensure all your types implement Serialize / Deserialize, e.g., by using serde's derive macros. Then you can simply follow the example in the readme:
let fields = serialize_into_fields(&items, TracingOptions::default())?;
let arrays = serialize_into_arrays(&fields, &items)?;
Important for step 1: if you have enums and lists, you must make sure all lists have at least a single entry and all relevant enum variants are encountered.
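The reason for the list caveat can be illustrated with a toy tracer. This is only a sketch and not serde_arrow's implementation: `Value`, `Traced`, `trace_one`, `merge` and `trace_samples` are hypothetical names. It shows that a schema inferred purely from samples has nothing to learn the element type from if every sample list is empty. The same reasoning applies to enum variants that never show up in the samples.

```rust
// Toy value model (hypothetical, not serde_arrow's internal representation).
#[derive(Debug, Clone)]
enum Value {
    Int(i64),
    Text(String),
    List(Vec<Value>),
}

// The type information a tracer can recover from samples.
#[derive(Debug, Clone, PartialEq)]
enum Traced {
    Unknown, // nothing observed yet
    Int,
    Text,
    List(Box<Traced>),
}

// Infer the type of a single sample.
fn trace_one(v: &Value) -> Traced {
    match v {
        Value::Int(_) => Traced::Int,
        Value::Text(_) => Traced::Text,
        Value::List(items) => {
            // An empty list contributes no element samples, so the element
            // type stays Unknown.
            let elem = items.iter().map(trace_one).fold(Traced::Unknown, merge);
            Traced::List(Box::new(elem))
        }
    }
}

// Combine what two samples revealed about the same field.
fn merge(a: Traced, b: Traced) -> Traced {
    match (a, b) {
        (Traced::Unknown, t) | (t, Traced::Unknown) => t,
        (Traced::List(x), Traced::List(y)) => Traced::List(Box::new(merge(*x, *y))),
        // Equal types, or a conflict that a real tracer would reject;
        // this sketch simply keeps the first type seen.
        (x, _) => x,
    }
}

// Infer one type from all samples of a field.
fn trace_samples(samples: &[Value]) -> Traced {
    samples.iter().map(trace_one).fold(Traced::Unknown, merge)
}
```

With only `Value::List(vec![])` as input, `trace_samples` yields a list whose element type is still `Traced::Unknown`. Adding one sample with a non-empty list resolves it.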
If you control the whole code base, maybe also arrow2-convert would be an option. You can easily convert from arrow2 to arrow.
Closing this issue, as there is no change necessary in serde_arrow as far as I can tell.
Hi,
Firstly, thank you for building this in the open & sharing!
I see that this can be used to convert serde-derivable structures to the arrow layout.
There are a few ways to parse binary content into Rust data types, such as deku and binrw. Additionally, there is https://github.com/simd-lite/simd-json-derive for deriving JSON from Rust data types.
Would I be able to convert a bunch of structs "created" by them, and then use serde_arrow's derive on top of that, to convert it finally to the arrow layout?