ckan / datapusher

A standalone web service that pushes data files from a CKAN site resources into its DataStore
GNU Affero General Public License v3.0
77 stars 155 forks source link

DataPusher should not deliberately maintain uppercase of field names #78

Closed timwis closed 9 years ago

timwis commented 9 years ago

I've noticed that if you upload a dataset with uppercase field names (which is almost always the case when pulling data from Esri/Arc databases), they get added to the postgres database with uppercase names. By default, postgres converts all field names to lowercase unless you explicitly double quote the field names. If you do double quote them, then you always need to use double quotes to interact with them.

As a result, you cannot query or use the DataStore API as the documentation suggests:

?sql=SELECT * from "2c0d8231-e6b8-4598-8089-d1bcaf1bdaa6" WHERE ZIP_CODE = '19146' (example) will fail with the error column "zip_code" does not exist (note the error includes a lowercase field name - postgres converts field names to lowercase unless they're double quoted)

You must wrap the field name in double quotes to query it, ie. ?sql=SELECT * from "2c0d8231-e6b8-4598-8089-d1bcaf1bdaa6" WHERE "ZIP_CODE" = '19146' (example)

My hypothesis is that somewhere in the DataPusher extension, field names are being wrapped in double quotes, and my proposal is that they not be wrapped in double quotes. If there's some other reason they need to be (ie. to support special characters) then the field names should be converted to lowercase first to provide the expected functionality. Thoughts?

amercader commented 9 years ago

This is not handled on the DataPusher but at the DataStore level, so moving to https://github.com/ckan/ckan/issues/2577