pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Create a `DataFrame` from a `Vec` of structs with automatically derived schema #6168

Open oersted opened 1 year ago

oersted commented 1 year ago

Problem description

Say we have a trait like this (there's probably a better name for it):

trait GetSchema {
    fn get_schema() -> Schema;
}

And we have its corresponding derive macro for structs.
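To make the idea concrete, here is a minimal sketch of what such a derive macro might expand to for one struct. It is only an illustration: `DataType` and `Schema` below are simplified stand-ins defined in the snippet (not the Polars types), and the `Record` fields are hypothetical.

```rust
// Stand-in types for illustration only: these are NOT the Polars
// `DataType` / `Schema`, just enough structure to show the idea.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Float64,
    Utf8,
}

/// Ordered list of (column name, data type) pairs.
pub type Schema = Vec<(String, DataType)>;

pub trait GetSchema {
    fn get_schema() -> Schema;
}

// Hypothetical record type; the field names and types are assumptions.
pub struct Record {
    pub id: i64,
    pub score: f64,
    pub name: String,
}

// Hand-written version of what a `#[derive(GetSchema)]` macro could
// generate: one schema entry per field, in declaration order.
impl GetSchema for Record {
    fn get_schema() -> Schema {
        vec![
            ("id".to_string(), DataType::Int64),
            ("score".to_string(), DataType::Float64),
            ("name".to_string(), DataType::Utf8),
        ]
    }
}
```

Because `get_schema` is an associated function rather than a method, the schema is available from the type alone (`Record::get_schema()`), even when the `Vec` is empty.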

It would be nice for a user to create a DataFrame like this:

#[derive(GetSchema)]
struct Record {
    ...
}

let records: Vec<Record> = vec![...];
let df: DataFrame = records.into();

As opposed to having to parse the structs into Rows and manually specifying the schema.
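For reference, the conversion itself amounts to transposing a row-oriented `Vec` of structs into one typed vector per field. The sketch below shows that step with stand-in types only: `Column`, `Frame`, and the `Record` fields are hypothetical and defined in the snippet, not taken from the Polars API.

```rust
// Stand-in for a typed column; NOT the Polars `Series`.
#[derive(Debug, PartialEq)]
pub enum Column {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
}

// Stand-in for a data frame: named columns, kept in field order.
#[derive(Debug, PartialEq)]
pub struct Frame {
    pub columns: Vec<(String, Column)>,
}

// Hypothetical record type; the field names and types are assumptions.
pub struct Record {
    pub id: i64,
    pub name: String,
}

// The kind of impl a derive macro could emit: walk the rows once and
// push each field into its own column buffer.
impl From<Vec<Record>> for Frame {
    fn from(records: Vec<Record>) -> Self {
        let mut ids = Vec::with_capacity(records.len());
        let mut names = Vec::with_capacity(records.len());
        for r in records {
            ids.push(r.id);
            names.push(r.name);
        }
        Frame {
            columns: vec![
                ("id".to_string(), Column::Int64(ids)),
                ("name".to_string(), Column::Utf8(names)),
            ],
        }
    }
}
```

With this in place, `let frame: Frame = records.into();` works as in the example above, and the column names and types come from the struct definition instead of a separately maintained schema.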

Let me know if I'm missing an existing API that does something similar already.

Perhaps it is possible to leverage the existing interop with serde? I'm not sure, since the serde and serde-lazy features are not documented at the moment.

ritchie46 commented 1 year ago

You can try this: https://github.com/DataEngineeringLabs/arrow2-convert

mdekstrand commented 11 months ago

arrow2-convert works reasonably well, but it is built on arrow2 directly, so it is unclear how to use it now that Polars has vendored arrow2/parquet2.