okfn / messytables

Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py
http://messytables.readthedocs.io/

CKAN Datastore - type autodetection issue #110

Open jjelosua opened 10 years ago

jjelosua commented 10 years ago

Hi,

In a CKAN (2.2) instance I was trying to push a resource into the DataStore through the DataPusher, but I got a formatting error about an invalid timestamp. Looking into it, I traced the problem to the messytables type_guess function, which does not work correctly on my data.

It tries to parse "0/ 0" as a DateUtilType and crashes:

"Error: CKAN DataStore bad response. Status code: 409 Conflict. At: http://130.206.83.32/api/3/action/datastore_create. Response: {u'error': {u'__type': u'Validation Error', u'data': u'(DataError) invalid input syntax for type timestamp: "0/ 0"\nLINE 2: ... VALUES (\'V20020275501\', 2002, \'2755/ 1\', \'0/ 0\', \'4...\n ^\n', u'info': {u'orig': [u'invalid input syntax for type timestamp: "0/ 0"\nLINE 2: ... VALUES (\'V20020275501\', 2002, \'2755/ 1\', \'0/ 0\', \'4...\n ^\n']}}"

I think the type detection may be misled by the forward slash in the data, or the date detection itself may be too loose.
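To illustrate what a stricter date check could look like, here is a minimal sketch using only the Python standard library. It is not messytables' actual implementation: the names `STRICT_FORMATS`, `looks_like_date` and `guess_column_type` are hypothetical, and the list of accepted formats is an assumption chosen for this example. The idea is to treat a column as a timestamp only when every non-empty value matches one of a small set of explicit formats, so values such as "0/ 0" or "2755/ 1" fall back to text.

```python
from datetime import datetime

# Hypothetical whitelist of accepted formats (an assumption for this sketch,
# not the formats messytables actually uses).
STRICT_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S")


def looks_like_date(value):
    """Return True only if value fully matches one of STRICT_FORMATS."""
    for fmt in STRICT_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            # strptime rejects partial or malformed matches; try the next format.
            continue
    return False


def guess_column_type(values):
    """Guess 'timestamp' only if every non-empty value is a strict date."""
    non_empty = [v for v in values if v.strip()]
    if non_empty and all(looks_like_date(v) for v in non_empty):
        return "timestamp"
    return "text"


if __name__ == "__main__":
    # Sample values taken from the error message in this issue.
    print(guess_column_type(["2755/ 1", "0/ 0"]))
    print(guess_column_type(["2002-03-15", "15/03/2002", ""]))
```

Requiring all values in the column to match is deliberately conservative: a single value like "0/ 0" is enough to keep the column as text, which avoids the invalid timestamp error at insert time at the cost of missing columns that mix dates with other content.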

I do not know whether this issue should be handled in messytables itself or in CKAN's use of messytables. If you think it belongs in the CKAN repo, just tell me.

Cheers

Juan

domoritz commented 10 years ago

I believe it's a messytables issue.