CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Is there a way to use the sniffer with a confidence score threshold? I am noticing that while the library works well for many types of CSV, I have a couple of control cases that aren't CSV at all (they are actually fixed-width files) where the sniffer still returns a dialect. I'd like to have access to the sniffer's confidence score so that I can decide whether to use the returned delimiter.
As a matter of fact, I have run quite a few files through the sniffer and I haven't gotten a None response yet, which makes me believe the logic is a little too eager to produce a dialect, even at low confidence.
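To make the request concrete, here is a rough sketch of the kind of gate I have in mind. It uses only the standard library `csv` module and is not CleverCSV's internal scoring. The names `delimiter_confidence` and `accept_delimiter`, the scoring formula, and the 0.9 threshold are all hypothetical choices for illustration.

```python
import csv
import io
from collections import Counter


def delimiter_confidence(sample: str, delimiter: str) -> float:
    """Hypothetical score in [0, 1] for how plausible `delimiter` is.

    Assumption: a good delimiter yields rows with a consistent field
    count and few empty cells. This is an illustrative heuristic only.
    """
    reader = csv.reader(io.StringIO(sample), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return 0.0

    # Share of rows that have the most common number of fields.
    field_counts = Counter(len(row) for row in rows)
    modal_fields, modal_rows = field_counts.most_common(1)[0]
    if modal_fields < 2:
        # The delimiter never actually splits anything.
        return 0.0
    consistency = modal_rows / len(rows)

    # Share of cells that are non-empty. Space-padded fixed-width files
    # split on " " produce mostly empty cells, which lowers the score.
    cells = [cell for row in rows for cell in row]
    non_empty = sum(1 for cell in cells if cell.strip()) / len(cells)

    return consistency * non_empty


def accept_delimiter(sample: str, delimiter: str, threshold: float = 0.9) -> bool:
    """Accept the sniffed delimiter only if the score clears the threshold."""
    return delimiter_confidence(sample, delimiter) >= threshold
```

With something like this, I could pass in the delimiter the sniffer returns and fall back to fixed-width parsing whenever `accept_delimiter` is false. Having the sniffer expose its own score would make such a workaround unnecessary.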
Below I show the file on the left alongside the detected delimiter on the right.