alan-turing-institute / CleverCSV

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the built-in csv module with improved dialect detection, and comes with a handy command-line application for working with CSV files.
https://clevercsv.readthedocs.io
MIT License

Built-in support for cChardet #48

Closed cstork closed 3 years ago

cstork commented 3 years ago

I noticed that in my applications most of CleverCSV's runtime is spent in chardet. Since cChardet is much faster, it might be worth supporting it by default. (I know that it's possible to determine the encoding first with cChardet and then pass it to CleverCSV. I'm just suggesting this as a possible enhancement, maybe also for people not aware of cChardet.)
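
For readers unfamiliar with the workaround mentioned in parentheses, here is a minimal sketch of it. It assumes that both `cchardet` and `clevercsv` are installed and that `clevercsv.read_table` accepts an `encoding` keyword. The helper name `read_csv_with_cchardet` and the `sample_size` parameter are made up for illustration and are not part of either library.

```python
from pathlib import Path


def read_csv_with_cchardet(path, sample_size=1_000_000):
    """Detect the encoding with cChardet, then hand it to CleverCSV.

    Hypothetical helper for illustration only.
    """
    # Third-party imports are local so this module still imports
    # when the optional packages are missing.
    import cchardet
    import clevercsv

    # Only a prefix of the file is sampled to keep detection cheap.
    raw = Path(path).read_bytes()[:sample_size]

    # cchardet.detect returns a dict; "encoding" can be None if no guess was made.
    encoding = cchardet.detect(raw)["encoding"]

    # Passing an explicit encoding should let CleverCSV skip its own detection.
    return clevercsv.read_table(path, encoding=encoding)
```

Whether this pays off depends on how much of the runtime actually goes into encoding detection for a given set of files.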

GjjvdBurg commented 3 years ago

Thanks for suggesting this @cstork, I'll take a look!

GjjvdBurg commented 3 years ago

I've added cChardet as an optional dependency, so it'll be used when it's available on the system. Hope this helps!
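
As a rough illustration of what "used when it's available" can look like, here is a generic optional-dependency sketch. This is not CleverCSV's actual code: the name `detect_encoding` and the `"utf-8"` default are assumptions made for the example.

```python
# Prefer the faster cChardet, fall back to chardet, and cope with neither.
try:
    import cchardet as _detector  # optional, C-based
except ImportError:
    try:
        import chardet as _detector  # pure-Python fallback
    except ImportError:
        _detector = None


def detect_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a best-guess encoding name for `data` (hypothetical helper)."""
    if _detector is None:
        # No detector installed: fall back to the assumed default.
        return default
    # Both libraries expose detect(bytes) returning a dict with an "encoding" key.
    guess = _detector.detect(data)["encoding"]
    # The guess can be None (for example on empty input), so use the default then.
    return guess or default
```

Callers get the same interface whichever package is installed, so the faster detector is picked up without any code changes on their side.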