alan-turing-institute / CleverCSV

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
https://clevercsv.readthedocs.io
MIT License
1.25k stars, 72 forks

delimiter error for json type data #37

Closed: hcheng2002cn closed this issue 3 years ago

hcheng2002cn commented 3 years ago

For the attached file, CleverCSV guesses the delimiter ':' instead of ','. The three column types are JSON, time without time zone, and time with time zone.

"{""fake"": ""json"", ""fake2"":""json2""}",13:31:38,06:00:04+01:00
"{""fake"": ""json"", ""fake2"":""json2""}",22:13:29,14:20:11+02:00
"{""fake"": ""json"", ""fake2"":""json2""}",04:37:27,22:04:28+03:00
"{""fake"": ""json"", ""fake2"":""json2""}",04:25:28,23:12:53+01:00
"{""fake"": ""json"", ""fake2"":""json2""}",21:04:15,08:23:58+02:00
"{""fake"": ""json"", ""fake2"":""json2""}",10:37:03,11:06:42+05:30
"{""fake"": ""json"", ""fake2"":""json2""}",10:17:24,23:38:47+06:00
"{""fake"": ""json"", ""fake2"":""json2""}",00:02:51,20:04:45-06:00

Could you please take a look?

Thanks, Hong
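To see why ':' can look plausible for this file, it helps to count how many fields each row splits into under both candidate delimiters. The following is a minimal sketch using only Python's standard `csv` module, not CleverCSV's own detection code; `field_counts` is a hypothetical helper name, and only two of the rows above are used.

```python
import csv
import io

# Two rows copied from the sample in this issue.
SAMPLE = (
    '"{""fake"": ""json"", ""fake2"":""json2""}",13:31:38,06:00:04+01:00\n'
    '"{""fake"": ""json"", ""fake2"":""json2""}",00:02:51,20:04:45-06:00\n'
)


def field_counts(text, delimiter):
    """Return the set of distinct row lengths when parsing with `delimiter`."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter, quotechar='"')
    return {len(row) for row in reader}


print(field_counts(SAMPLE, ","))  # {3}
print(field_counts(SAMPLE, ":"))  # {6}
```

Both delimiters give a constant number of fields per row (3 for ',' and 6 for ':'), so row-length consistency alone cannot separate them. The cell contents have to decide.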

hcheng2002cn commented 3 years ago

@Giovanni1085 below are the data types we think might occur in CSV files:

Number:
Text: 'ScBiSTZZXkjuRWQUnTze'
Char: 'a'
Bool: False, True
Date: '2014-08-15'
time_no_zone: '19:38:12'
time_with_zone: '05:14:35+01:00'
timestamp_no_zone: '2007-04-14T06:00:53'
timestamp_with_zone: '1973-11-12T05:14:35+01:00'
json: "{""fake"": ""json"", ""fake2"":""json2""}"

hcheng2002cn commented 3 years ago

For recognized types, can we differentiate the score? Basically everything can be detected as text, whereas time/timestamp ... is more precise.

Thanks
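The idea of weighting specific types above plain text can be sketched as follows. This is only an illustration of the suggestion, not how CleverCSV actually scores dialects: the weights (1.0 and 0.1), the regular expressions, and the names `cell_weight` and `mean_type_weight` are all assumptions made for this example.

```python
import csv
import io
import json
import re

# Hypothetical patterns: a time with optional UTC offset, and a plain number.
TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}([+-]\d{2}:\d{2})?$")
NUMBER_RE = re.compile(r"^[+-]?\d+(\.\d+)?$")


def cell_weight(cell):
    """Assumed weights: a specific type scores 1.0, fallback text 0.1."""
    if TIME_RE.match(cell) or NUMBER_RE.match(cell):
        return 1.0
    try:
        if isinstance(json.loads(cell), (dict, list)):
            return 1.0
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    return 0.1


def mean_type_weight(text, delimiter):
    """Average cell weight over all cells when parsing with `delimiter`."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    cells = [cell for row in reader for cell in row]
    return sum(cell_weight(c) for c in cells) / len(cells)


# One row copied from the sample in this issue.
ROW = '"{""fake"": ""json"", ""fake2"":""json2""}",13:31:38,06:00:04+01:00\n'

print(mean_type_weight(ROW, ","))
print(mean_type_weight(ROW, ":"))
```

With ',' every cell is a JSON object or a time, so the mean weight is at its maximum. With ':' half the cells are fragments such as `38,06` that only qualify as text, so the mean drops and the comma wins.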

GjjvdBurg commented 3 years ago

Apologies for the slow response on this @hcheng2002cn! This has now been fixed in CleverCSV version 0.7.0.

hcheng2002cn commented 3 years ago

Thanks for the fix! Will try it out.