okfn / messytables

Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py
http://messytables.readthedocs.io/

Patches from HealthData.gov #17

Closed by JoshData 12 years ago

JoshData commented 12 years ago

Over at hub.healthdata.gov we're trying out the CKAN DataStore. The CSV resources we currently list in the data catalog are extremely messy, a little too messy for messytables.

Here are various patches that I needed in order to make any headway at all with the live data I am trying this on. I will probably have more patches coming.

pudo commented 12 years ago

Hi Josh, thanks for the massive improvements. My only question would be about the changes to the type-guessing weights. Can you explain them? Shouldn't this at least be higher than string?
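To make the weights question concrete, here is a minimal sketch of weighted type guessing. It is not the messytables implementation or the patch under discussion: the names `CANDIDATES` and `guess_column_type` and all of the weight values are hypothetical, and the scoring rule (each type earns its weight for every cell it can parse) is a simplifying assumption. It is only meant to show why a specific type probably needs a weight above string's: string accepts every cell, so a type weighted below it can never win.

```python
from collections import defaultdict


def _is_int(value):
    """Return True if the cell text parses as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def _is_float(value):
    """Return True if the cell text parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# (type name, test function, weight). Hypothetical values for illustration only.
CANDIDATES = [
    ("string", lambda v: True, 1),  # string accepts every cell
    ("integer", _is_int, 3),
    ("float", _is_float, 2),
]


def guess_column_type(cells, candidates=CANDIDATES):
    """Score each candidate type by its weight per matching cell; highest total wins."""
    scores = defaultdict(float)
    for cell in cells:
        text = cell.strip()
        for name, test, weight in candidates:
            if test(text):
                scores[name] += weight
    # Fall back to string when there are no cells to score.
    return max(scores, key=scores.get, default="string")


# A specific type weighted below string loses even on a purely numeric column.
LOW_WEIGHT = [("string", lambda v: True, 1), ("integer", _is_int, 0.5)]
```

With the default weights above, `guess_column_type(["1", "2", "3"])` picks `integer`, while the same kind of column scored with `LOW_WEIGHT` falls back to `string`, which is the situation the question is getting at.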