jabacat / jml

JABACAT-created machine learning library from scratch.

Data loading and handling #11

Open Sophon96 opened 1 year ago

Sophon96 commented 1 year ago

We need some way to load common data formats such as CSV, Apache Parquet, and HDF. There are C++ libraries for these, but I'm not sure how well they integrate with each other. I don't know if there's a ubiquitous data science library for C++ like pandas, which might be preferable to use.
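For a sense of scale, the simplest of these formats (CSV) can be read with the standard library alone. The following is only a rough sketch, not anything that exists in jml: `Table`, `split_line`, and `parse_csv` are hypothetical names, and it assumes a header row, purely numeric cells, and no quoted fields. Parquet and HDF are binary formats and would realistically need a dedicated library.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container (not part of jml): column names plus row-major numeric data.
struct Table {
    std::vector<std::string> header;
    std::vector<std::vector<double>> rows;
};

// Split one line on commas. Assumes no quoted fields or embedded commas.
std::vector<std::string> split_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream ss(line);
    while (std::getline(ss, field, ',')) {
        fields.push_back(field);
    }
    // std::getline drops a trailing empty field ("a,b," yields a, b); restore it.
    if (!line.empty() && line.back() == ',') {
        fields.emplace_back();
    }
    return fields;
}

// Parse a header row followed by numeric rows.
// Throws std::runtime_error on a row whose width differs from the header,
// and std::invalid_argument (from std::stod) on a non-numeric cell.
Table parse_csv(std::istream& in) {
    Table table;
    std::string line;
    if (!std::getline(in, line)) {
        return table;  // empty input: empty table
    }
    if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
    table.header = split_line(line);

    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;  // skip blank lines
        std::vector<std::string> fields = split_line(line);
        if (fields.size() != table.header.size()) {
            throw std::runtime_error("row width does not match header");
        }
        std::vector<double> row;
        row.reserve(fields.size());
        for (const std::string& f : fields) {
            row.push_back(std::stod(f));
        }
        table.rows.push_back(std::move(row));
    }
    return table;
}
```

Calling `parse_csv` on a `std::ifstream` or `std::istringstream` gives the column names in `header` and one `std::vector<double>` per data row. Quoting, missing values, and type inference are exactly the parts this sketch skips, and they are where an existing library would save the most work.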

adamhutchings commented 1 year ago

Even if we write our own, I think all functions of this sort that are not directly related to machine learning should go in a top-level folder separate from core/. Any good name ideas or thoughts?

Sophon96 commented 1 year ago

I don't think we should roll our own. Apache Arrow looks like a pretty cool library. There's also this repo I found: https://github.com/hosseinmoein/DataFrame/