Open liuzicheng1987 opened 8 months ago
It would be very valuable to support reading Apache Arrow in-memory tables using reflect-cpp. Could you share an idea of what kind of work this would require?
As the Arrow library already has high-quality Parquet and CSV readers, we could get those for free, too.
Hi @SChakravorti21, I'm sorry, I didn't see your comment until now.
Yes, I am familiar with Apache Arrow. Basically, you would have to check the Arrow Table schema against the schema of our C++ structs. Then you would have to go through the columns and read the fields into the structs.
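To make the schema check concrete, here is a minimal sketch in plain standard C++. It does not use the Arrow or reflect-cpp APIs: `ColumnInfo` and `schema_matches` are hypothetical names invented for illustration, and the type is compared as a plain string. A real implementation would presumably derive the expected fields from the struct via reflection and the actual fields from the Arrow schema.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one field of a schema: a column name plus a type name.
// This is NOT an Arrow or reflect-cpp type, just an illustration.
struct ColumnInfo {
    std::string name;
    std::string type;
};

// Returns true if every field the struct expects is present in the table
// schema with the same type. Column order and extra columns are ignored.
inline bool schema_matches(const std::vector<ColumnInfo>& expected,
                           const std::vector<ColumnInfo>& actual) {
    return std::all_of(expected.begin(), expected.end(), [&](const ColumnInfo& e) {
        return std::any_of(actual.begin(), actual.end(), [&](const ColumnInfo& a) {
            return a.name == e.name && a.type == e.type;
        });
    });
}
```

Whether extra columns in the table should be tolerated or rejected is a design choice; this sketch tolerates them.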
The challenge here is that the way reflect-cpp works on structs implies "row-major order", in other words a vector of structs. But we want column-major order.
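The difference between the two layouts can be sketched as follows. This is a hand-written illustration in plain standard C++ for a single example struct. `Person`, `PersonColumns`, `to_rows` and `to_columns` are hypothetical names, not part of reflect-cpp or Arrow; the idea would be to generate this kind of conversion generically from the struct definition.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Row-major: one struct per row, stored as std::vector<Person>.
struct Person {
    std::string name;
    std::int64_t age;
};

// Column-major: one vector per field, all of the same length.
struct PersonColumns {
    std::vector<std::string> name;
    std::vector<std::int64_t> age;
};

// Reads the columns row by row into a vector of structs.
inline std::vector<Person> to_rows(const PersonColumns& cols) {
    if (cols.name.size() != cols.age.size()) {
        throw std::invalid_argument("columns differ in length");
    }
    std::vector<Person> rows;
    rows.reserve(cols.name.size());
    for (std::size_t i = 0; i < cols.name.size(); ++i) {
        rows.push_back(Person{cols.name[i], cols.age[i]});
    }
    return rows;
}

// Splits a vector of structs into one vector per field.
inline PersonColumns to_columns(const std::vector<Person>& rows) {
    PersonColumns cols;
    cols.name.reserve(rows.size());
    cols.age.reserve(rows.size());
    for (const auto& row : rows) {
        cols.name.push_back(row.name);
        cols.age.push_back(row.age);
    }
    return cols;
}
```

Both functions copy every value, which is simple but gives up the zero-copy access that a columnar format allows.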
In addition to the more complex formats, we would also like to support tabular formats like Parquet or CSV. But currently, we don't even have an interface and concepts for that. These will have to be a simplified version of our current parsing module.