Wittline / csv-schema-inference

A tool to automatically infer column data types in .csv files
https://wittline.github.io/csv-schema-inference/
MIT License
33 stars, 4 forks

Any benchmarks for large files? #28

Closed: PandaWhoCodes closed this issue 2 years ago

PandaWhoCodes commented 2 years ago

Great project! I was wondering if you could provide some benchmarks in the README for large CSV files (2, 3, 5 GB or more), so that the community can work towards making things faster.

Wittline commented 2 years ago

Hi @PandaWhoCodes, I will. Let me try to find some organic datasets first, and I will get back to you later.

Wittline commented 2 years ago

Hi @PandaWhoCodes, the benchmarks are too slow. I will be working on a new version of the code in order to improve computation time and memory usage.

PandaWhoCodes commented 2 years ago

Can you share the current benchmarks?

Wittline commented 2 years ago

Please check the README again; benchmarks were added. There is also a separate benchmark folder with a Jupyter notebook.
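
For readers who want to take their own measurements of the two quantities discussed in this thread (computation time and memory usage), here is a minimal sketch using only the Python standard library. It does not use this project's API: `placeholder_infer` is a hypothetical stand-in that should be replaced with the actual call under test, and `write_synthetic_csv` and `benchmark` are likewise illustrative names. The synthetic file here is small; the row count would need to be scaled up to approach multi-GB inputs.

```python
import csv
import os
import tempfile
import time
import tracemalloc


def write_synthetic_csv(path, n_rows):
    """Write a synthetic CSV with an integer, a float, and a string column."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "price", "name"])
        for i in range(n_rows):
            writer.writerow([i, i * 0.5, f"item{i}"])


def placeholder_infer(path):
    """Hypothetical stand-in for the tool under test (NOT the project's API)."""
    rank = ["INTEGER", "FLOAT", "STRING"]  # narrowest to widest type

    def kind(value):
        # Try the narrowest type first, fall back to STRING.
        for cast, label in ((int, "INTEGER"), (float, "FLOAT")):
            try:
                cast(value)
                return label
            except ValueError:
                pass
        return "STRING"

    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        seen = [set() for _ in header]
        for row in reader:
            for types, value in zip(seen, row):
                types.add(kind(value))
    # The widest type observed in a column wins.
    return {
        name: max(types, key=rank.index, default="STRING")
        for name, types in zip(header, seen)
    }


def benchmark(func, path):
    """Return (result, elapsed seconds, peak bytes traced by tracemalloc)."""
    tracemalloc.start()  # traces Python-level allocations only
    start = time.perf_counter()
    result = func(path)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        write_synthetic_csv(path, 10_000)
        size_mb = os.path.getsize(path) / 1e6
        schema, seconds, peak = benchmark(placeholder_infer, path)
        print(schema)
        print(f"{size_mb:.2f} MB in {seconds:.3f} s, peak {peak / 1e6:.2f} MB traced")
```

Note that `tracemalloc` only sees allocations made through Python's allocator and adds overhead of its own, so timings taken with tracing enabled are pessimistic; running the timing and memory measurements in separate passes would give cleaner numbers.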