ckan / ckanext-xloader

Express Loader - quickly load data into DataStore. A replacement for DataPusher.
GNU Affero General Public License v3.0

Getting KeyError: 'user' on all resource push to datastore #126

Closed rhabbachi closed 8 months ago

rhabbachi commented 3 years ago

Hello,

I'm triggering a KeyError exception during any resource update. The data is pushed to the datastore correctly but updating the resource status is halted with the exception.

CKAN: 2.9.2 Python: 2.7 ckanext-xloader: 0.7

Express Load starting: /dataset/test-dataset/resource/72e64a98-10b8-492f-bea0-ba8fbe4e1ff9
2020-12-13 12:42:51,485 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Express Load starting: /dataset/test-dataset/resource/72e64a98-10b8-492f-bea0-ba8fbe4e1ff9
Fetching from: http://pno.docksal/dataset/8a15861d-6742-42cb-adfd-cc3e6dce75f4/resource/72e64a98-10b8-492f-bea0-ba8fbe4e1ff9/download/test-1.csv
2020-12-13 12:42:51,488 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Fetching from: http://pno.docksal/dataset/8a15861d-6742-42cb-adfd-cc3e6dce75f4/resource/72e64a98-10b8-492f-bea0-ba8fbe4e1ff9/download/test-1.csv
Downloaded ok - 161.0 bytes
2020-12-13 12:42:51,553 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Downloaded ok - 161.0 bytes
File hash: 51f51011dec4b853ea0493c037a62df5
2020-12-13 12:42:51,556 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] File hash: 51f51011dec4b853ea0493c037a62df5
Loading CSV
2020-12-13 12:42:51,558 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Loading CSV
'Just load with messytables' mode is: False
2020-12-13 12:42:51,561 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] 'Just load with messytables' mode is: False
Ensuring character coding is UTF8
2020-12-13 12:42:51,581 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Ensuring character coding is UTF8
Fields: [{'type': 'text', 'id': 'indicateur'}, {'type': 'text', 'id': 'annA(c)e'}, {'type': 'text', 'id': 'valeur'}]
2020-12-13 12:42:51,602 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Fields: [{'type': 'text', 'id': 'indicateur'}, {'type': 'text', 'id': 'annA(c)e'}, {'type': 'text', 'id': 'valeur'}]
Copying to database...
2020-12-13 12:42:51,653 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Copying to database...
...copying done
2020-12-13 12:42:51,664 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] ...copying done
Creating search index...
2020-12-13 12:42:51,667 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Creating search index...
...search index created
2020-12-13 12:42:51,673 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] ...search index created
Calculating record count (running ANALYZE on the table)
2020-12-13 12:42:51,676 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Calculating record count (running ANALYZE on the table)
Setting resource.datastore_active = True
2020-12-13 12:42:51,682 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Setting resource.datastore_active = True
Setting resource.datastore_contains_all_records_of_source_file = True
2020-12-13 12:42:51,685 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Setting resource.datastore_contains_all_records_of_source_file = True
Data now available to users: /dataset/test-dataset/resource/72e64a98-10b8-492f-bea0-ba8fbe4e1ff9
2020-12-13 12:42:51,950 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Data now available to users: /dataset/test-dataset/resource/72e64a98-10b8-492f-bea0-ba8fbe4e1ff9
Creating column indexes (a speed optimization for queries)...
2020-12-13 12:42:51,952 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] Creating column indexes (a speed optimization for queries)...
...column indexes created.
2020-12-13 12:42:51,969 INFO  [bd78d87f-66aa-441c-ba17-c90eb41f89c3] ...column indexes created.
2020-12-13 12:42:51,978 ERROR [ckanext.xloader.jobs] xloader error: 'user', Traceback (most recent call last):
  File "/usr/src/app/venv/src/ckanext-xloader/ckanext/xloader/jobs.py", line 79, in xloader_data_into_datastore
    xloader_data_into_datastore_(input, job_dict)
  File "/usr/src/app/venv/src/ckanext-xloader/ckanext/xloader/jobs.py", line 220, in xloader_data_into_datastore_
    direct_load()
  File "/usr/src/app/venv/src/ckanext-xloader/ckanext/xloader/jobs.py", line 189, in direct_load
    patch_only=True)
  File "/usr/src/app/venv/src/ckanext-xloader/ckanext/xloader/jobs.py", line 485, in update_resource
    get_action(action)(context, resource)
  File "/usr/src/app/venv/src/ckan/ckan/logic/__init__.py", line 473, in wrapped
    result = _action(context, data_dict, **kw)
  File "/usr/src/app/venv/src/ckan/ckan/logic/action/patch.py", line 70, in resource_patch
    'user': context['user'],
KeyError: 'user'

Is this a common error folks are getting or did I configure something incorrectly? Thank you for your help!

f-osorio commented 3 years ago

I've run into the same issue and have the same questions. The only difference in my setup is Python 3.

The context that xloader's update_resource function in jobs.py sends to CKAN's resource_patch is missing the "user" and "auth_user_obj" keys, both of which resource_patch reads when it constructs a new context.

I can force those keys into the context created in update_resource, and then the submission completes, but that doesn't seem like a very sound way to go about it.
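
For anyone following along, the failure can be reproduced without CKAN. The sketch below is only an illustration: `resource_patch_stand_in` is a hypothetical, simplified stand-in for the context copy that the traceback points at in ckan/logic/action/patch.py, and the placeholder objects stand in for the real model and session.

```python
def resource_patch_stand_in(context, data_dict):
    """Hypothetical stand-in: copies keys out of the caller's context
    the way the traceback suggests resource_patch does."""
    show_context = {
        'model': context['model'],
        'session': context['session'],
        'user': context['user'],                    # raises KeyError if absent
        'auth_user_obj': context['auth_user_obj'],  # would be next to fail
    }
    return show_context


# Context shaped like the one built in update_resource; object() is a
# placeholder for the real model and session.
xloader_context = {'model': object(), 'session': object(), 'ignore_auth': True}

try:
    resource_patch_stand_in(xloader_context, {'id': 'example-id'})
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(missing_key)
```

Plain subscript access fails on the first missing key, which is why the log only ever mentions 'user' even though "auth_user_obj" is missing too.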

rjruizes commented 3 years ago

Same here. This is the fix I'm using in my fork: https://github.com/rjruizes/ckanext-xloader/commit/f56e451ac1b87eb3353285c4368b888d368d441d

-    context = {'model': model, 'session': model.Session, 'ignore_auth': True}
+    context = {'model': model, 'session': model.Session, 'ignore_auth': True,
+               'user': None, 'auth_user_obj': None}

rjruizes commented 3 years ago

Correction: The user to use for background jobs is the site_user https://github.com/rjruizes/ckanext-xloader/commit/360f56a0b2a3847a525a9b33417346e39dc9bd04
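
A rough sketch of the site_user approach, which may differ from what the linked commit actually does. It assumes CKAN's `get_site_user` action, which returns a dict containing the site user's `name`. The helper `build_job_context` and the fake objects are hypothetical and exist only so the snippet runs without a CKAN install; in jobs.py you would pass the real `get_action` and `model`.

```python
def build_job_context(get_action, model):
    """Build an action context for a background job, run as the site user.

    get_action and model are passed in so the sketch can be exercised with
    fakes; in a real extension they would come from CKAN itself.
    """
    site_user = get_action('get_site_user')(
        {'model': model, 'ignore_auth': True}, {})
    return {
        'model': model,
        'session': model.Session,
        'ignore_auth': True,
        'user': site_user['name'],
        'auth_user_obj': None,
    }


# Fakes for illustration only; they are not part of CKAN.
class FakeModel(object):
    Session = object()


def fake_get_action(name):
    if name != 'get_site_user':
        raise KeyError(name)
    return lambda context, data_dict: {'name': 'default'}


context = build_job_context(fake_get_action, FakeModel)
print(context['user'])
```

Passing the site user's name rather than None means anything downstream that records which user made the change gets a real account name.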

f-osorio commented 3 years ago

This fix works for me. Thanks.

ThrawnCA commented 8 months ago

Any objections to closing this?