hosseinmoein / DataFrame

C++ DataFrame for statistical, Financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
https://hosseinmoein.github.io/DataFrame/
BSD 3-Clause "New" or "Revised" License
2.53k stars, 313 forks

Support HDF5 reading/writing #326

Closed (sungho-cho closed this issue 2 months ago)

sungho-cho commented 2 months ago

Input/output support for HDF5 (similar to pandas' read_hdf and to_hdf) would be immensely helpful to my team's project.

hosseinmoein commented 2 months ago

I know; this is on my to-do list. I don't want to bring a third-party library into DataFrame, so I have to implement it myself, and I need to find the time. Do you know of a good HDF5 read/write library I can look at to get some pointers?

sungho-cho commented 2 months ago

Yeah sure, I know of these two open-source HDF5 libraries, although I just started trying them out.

hosseinmoein commented 2 months ago

Implementing HDF5 from scratch on my side is not practical within the timeframe I have. Also, I don't want to include a third-party library as a dependency in C++ DataFrame. But if you want to implement C++ DataFrame read/write in HDF5 using one of the above libraries, I can show you how. It is not that involved.

sungho-cho commented 2 months ago

That's understandable. Looking at load_data, I feel like it won't be too much of a hassle for me to write my own HDF5 reader and load the data into a DataFrame. I will reach out if there are any issues I need help with. Thanks!