ckan / datapusher

A standalone web service that pushes data files from a CKAN site's resources into its DataStore
GNU Affero General Public License v3.0

Day of week (& other date fields) recast as timestamps #75

Open richplane opened 9 years ago

richplane commented 9 years ago

Our client is uploading CSV data to CKAN which includes a column containing a day of the week.

Datapusher (via messytables) appears to decide that this is a date column and converts it to a timestamp. "Wednesday" appears to be interpreted as "this Wednesday" and is replaced by a timestamp of the form 2015-06-03T00:00:00, which is useless to anyone reading the data and frequently misleading.
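To illustrate why the resulting timestamp is misleading, here is a minimal standard-library sketch that mimics the behaviour described above. It is not the messytables or dateutil code; `resolve_weekday` and `WEEKDAYS` are hypothetical names, and the rule it applies (the next matching weekday on or after the day of the upload) is an assumption based on the report.

```python
from datetime import date, datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(name, today):
    """Map a weekday name to the next such day on or after `today`.

    Assumption: this imitates a lenient date parser that fills in the
    missing year, month and day from the current date.
    """
    target = WEEKDAYS.index(name.strip().lower())
    days_ahead = (target - today.weekday()) % 7
    resolved = today + timedelta(days=days_ahead)
    return datetime(resolved.year, resolved.month, resolved.day).isoformat()


# The same cell value yields a different timestamp depending on upload day.
print(resolve_weekday("Wednesday", date(2015, 6, 1)))  # 2015-06-03T00:00:00
print(resolve_weekday("Wednesday", date(2015, 6, 8)))  # 2015-06-10T00:00:00
```

Under that assumption, the stored value depends on when the file was pushed rather than on anything in the data.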

Since the messytables code seemed to say that DateType includes format detection, I tried altering the configuration to use that instead of the DateUtilType that Datapusher uses normally, by adding:

TYPES = [messytables.StringType, messytables.DecimalType, messytables.IntegerType, messytables.DateType, messytables.BoolType]
TYPE_MAPPING = {
    'String': 'text',
    'Integer': 'numeric',
    'Decimal': 'numeric',
    'Date': 'timestamp'
}

into datapusher_settings.py. But it made no difference.

Is there any way of ensuring that dates get recognised as dates (and don't get given a spurious 00:00:00 timestamp), and days get left as days (strings)?
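One possible direction, sketched here with the standard library only, is to treat a column as a date only when every non-empty cell matches an explicit whitelist of formats, and to fall back to text otherwise. This is not a Datapusher or messytables setting; `guess_column_type`, `matches_any` and the two format lists are hypothetical and would need adapting to the data actually being uploaded.

```python
from datetime import datetime

# Hypothetical whitelists: only values matching one of these count as dates.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y"]
DATETIME_FORMATS = ["%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S"]


def matches_any(value, formats):
    """Return True if `value` parses with at least one explicit format."""
    for fmt in formats:
        try:
            datetime.strptime(value.strip(), fmt)
            return True
        except ValueError:
            continue
    return False


def guess_column_type(values):
    """Return 'date', 'timestamp' or 'text' for a column of strings."""
    cells = [v for v in values if v.strip()]
    if cells and all(matches_any(v, DATE_FORMATS) for v in cells):
        return "date"  # no spurious 00:00:00 time part
    if cells and all(matches_any(v, DATE_FORMATS + DATETIME_FORMATS)
                     for v in cells):
        return "timestamp"
    return "text"  # weekday names and other free text stay as strings


print(guess_column_type(["Monday", "Wednesday"]))          # text
print(guess_column_type(["2015-06-03", "2015-06-04"]))     # date
print(guess_column_type(["2015-06-03T10:00:00"]))          # timestamp
```

Because nothing is inferred from partial input, a weekday name can never be promoted to a timestamp; the cost is that any date format missing from the whitelist is stored as text.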

gjlawran commented 8 years ago

Confirmed this issue. A letter 'T' in a cell, as might be used in a 'true'/'false' column, is also interpreted as a timestamp datatype.

fxsman commented 5 years ago

We still experience this problem on version 2.7.3. For example, datapusher changes the data in a CSV file from 11.5.1998 to the timestamp 1998-11-05T00:00:00. Is this fixed in newer versions?
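The value in this example is ambiguous between day-first and month-first order, which a short standard-library sketch makes visible. It assumes the intended reading was day-first (11 May 1998); that intent is a guess and is not stated in the comment.

```python
from datetime import datetime

raw = "11.5.1998"

# Same string, two explicit format readings.
day_first = datetime.strptime(raw, "%d.%m.%Y")    # 11 May 1998
month_first = datetime.strptime(raw, "%m.%d.%Y")  # 5 November 1998

print(day_first.isoformat())    # 1998-05-11T00:00:00
print(month_first.isoformat())  # 1998-11-05T00:00:00
```

The timestamp reported above matches the month-first reading, so a parser that guesses the order without being told can silently swap day and month whenever both numbers are 12 or less.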