ckan / datapusher

A standalone web service that pushes data files from a CKAN site's resources into its DataStore
GNU Affero General Public License v3.0

error while adding csv-files with duplicate column names #58

Open m1h opened 9 years ago

m1h commented 9 years ago

Currently, it is not possible to add CSV files with duplicate column names, since the column names are also used as table columns in PostgreSQL. In the web interface, the DataPusher returns an "internal server error 500" at /api/3/action/datastore_create.

rossjones commented 9 years ago

We discussed this at the meeting, and the general consensus was that although it might be possible to append a number to duplicate columns, it would introduce problems later in the process.
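For illustration only, here is a minimal Python 3 sketch of what "appending a number to duplicate columns" could look like. The function name `dedupe_headers` and the `_N` suffix scheme are assumptions made for this example, not anything DataPusher implements.

```python
def dedupe_headers(headers):
    """Return a copy of headers in which every name is unique.

    Hypothetical helper: a repeated name gets "_1", "_2", ... appended
    until it no longer collides with a name that was already emitted.
    """
    seen = set()
    result = []
    for name in headers:
        candidate = name
        suffix = 1
        # Keep bumping the suffix until the candidate is unused.
        while candidate in seen:
            candidate = "{}_{}".format(name, suffix)
            suffix += 1
        seen.add(candidate)
        result.append(candidate)
    return result


print(dedupe_headers(["id", "name", "name"]))
```

With this scheme the stored column names no longer match the file's header row, and a column that was literally called `name_1` in the source can itself end up renamed if it follows a generated `name_1`.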

The best solution currently is to make sure there is a usable error message so that the user can correct the mistake and re-upload the file. This isn't to say it'll never handle it, but in the foreseeable future this is the safest approach (unless a PR appears from somewhere).
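A rough sketch of the "usable error message" approach, assuming the header row is checked before anything is sent to the DataStore. `DuplicateColumnError`, `find_duplicate_columns` and `check_csv_headers` are hypothetical names for this example (Python 3, standard library only) and are not part of DataPusher or CKAN.

```python
import csv
import io
from collections import Counter


class DuplicateColumnError(ValueError):
    """Hypothetical error raised when a header row repeats a column name."""


def find_duplicate_columns(headers):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(headers)
    duplicates = []
    for name in headers:
        if counts[name] > 1 and name not in duplicates:
            duplicates.append(name)
    return duplicates


def check_csv_headers(text):
    """Parse the first row of CSV text and reject duplicate column names."""
    reader = csv.reader(io.StringIO(text))
    headers = next(reader, [])  # an empty file yields an empty header list
    duplicates = find_duplicate_columns(headers)
    if duplicates:
        raise DuplicateColumnError(
            "Duplicate column names are not supported: " + ", ".join(duplicates)
        )
    return headers


try:
    check_csv_headers("id,name,name\n1,a,b\n")
except DuplicateColumnError as exc:
    print(exc)
```

The point is only that the message names the offending columns, so the user can fix the file and upload it again instead of seeing a bare 500.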

arieljlira commented 8 years ago

Same issue with CKAN 2.4.1. It would be great to get at least an error message pointing to the real cause. Right now I only get:

[Mon Nov 09 11:21:44 2015] [error] CKAN DataStore bad response. Status code: 500 Internal Server Error. At: http://localhost/api/3/action/datastore_create.
[Mon Nov 09 11:21:44 2015] [error] Job "push_to_datastore (trigger: RunTriggerNow, run = True, next run at: None)" raised an exception
[Mon Nov 09 11:21:44 2015] [error] Traceback (most recent call last):
[Mon Nov 09 11:21:44 2015] [error]   File "/usr/lib/ckan/datapusher/lib/python2.7/site-packages/apscheduler/scheduler.py", line 512, in _run_job
[Mon Nov 09 11:21:44 2015] [error]     retval = job.func(*job.args, **job.kwargs)
[Mon Nov 09 11:21:44 2015] [error]   File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 389, in push_to_datastore
[Mon Nov 09 11:21:44 2015] [error]     records, api_key, ckan_url)
[Mon Nov 09 11:21:44 2015] [error]   File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 205, in send_resource_to_datastore
[Mon Nov 09 11:21:44 2015] [error]     check_response(r, url, 'CKAN DataStore')
[Mon Nov 09 11:21:44 2015] [error]   File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 146, in check_response
[Mon Nov 09 11:21:44 2015] [error]     response=response.text)
[Mon Nov 09 11:21:44 2015] [error] HTTPError

Thanks

erdnuesse commented 4 years ago

Current CKAN (2.8) returns the right error: Response: {"help": "http://h2893531.stratoserver.net/api/3/action/help_show?name=datastore_create", "success": false, "error": {"field": ["Duplicate column names are not supported"], "__type": "Validation Error...

Just encountered it. This may be safe to close: a consensus to keep the current behavior plus a correct error message is generally enough to close an issue. Just my 2ct.