The docs seem to suggest that setting lazy-apps=true is only needed for high availability. In practice, without it we saw no movement on jobs at all except when reloading status pages, and sometimes we also saw the error below.
Chunks appeared to upload at roughly the moments we reloaded the status page or submitted new jobs, suggesting that HTTP requests to the pusher released some lock, but only long enough for one chunk to progress.
Setting lazy-apps=true fixed it.
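For reference, this is roughly the shape of uWSGI config that worked for us; treat it as a sketch, since the socket, paths, and worker counts are assumptions for a typical package install rather than our exact file. Our working theory is that without lazy-apps the app loads once in the master before forking, so background threads started at import time (e.g. the APScheduler scheduler) never run in the workers:

```ini
[uwsgi]
http           = 127.0.0.1:8800
wsgi-file      = /etc/ckan/datapusher/wsgi.py
virtualenv     = /usr/lib/ckan/datapusher
master         = true
enable-threads = true
processes      = 1
threads        = 1
# Load the app in each worker instead of once in the master before fork,
# so import-time background threads actually run per worker.
lazy-apps      = true
```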
The error we sometimes got:
```
[pid: 239964|app: 0|req: 2/2] 127.0.0.1 () {36 vars in 435 bytes} [Thu Aug 19 20:16:12 2021] POST /job => generated 759 bytes in 25 msecs (HTTP/1.1 200) 2 headers in 72 bytes (2 switches on core 1)
Fetching from: http://example.org/dataset/17c5b499-d4ac-4551-a106-0a61b6045ac7/resource/3f5e0eaa-0f53-438a-83e2-a1271c66b445/download/finpos_2020q4_acrmun.csv
Error notifying listener
Traceback (most recent call last):
  File "/usr/lib/ckan/datapusher/lib/python3.8/site-packages/apscheduler/scheduler.py", line 512, in _run_job
    retval = job.func(*job.args, **job.kwargs)
  File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 432, in push_to_datastore
    existing = datastore_resource_exists(resource_id, api_key, ckan_url)
  File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 228, in datastore_resource_exists
    raise HTTPError(
datapusher.jobs.HTTPError: <unprintable HTTPError object>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/ckan/datapusher/lib/python3.8/site-packages/apscheduler/scheduler.py", line 239, in _notify_listeners
    cb(event)
  File "/usr/lib/ckan/datapusher/lib/python3.8/site-packages/ckanserviceprovider/web.py", line 189, in job_listener
    db.mark_job_as_errored(job_id, error_object)
  File "/usr/lib/ckan/datapusher/lib/python3.8/site-packages/ckanserviceprovider/db.py", line 413, in mark_job_as_errored
    _update_job(job_id, update_dict)
  File "/usr/lib/ckan/datapusher/lib/python3.8/site-packages/ckanserviceprovider/db.py", line 348, in _update_job
    job_dict["error"] = json.dumps(job_dict["error"])
  File "/usr/lib/python3.8/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.8/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.8/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.8/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Response is not JSON serializable
Job "push_to_datastore (trigger: RunTriggerNow, run = True, next run at: None)" raised an exception
Traceback (most recent call last):
  File "/usr/lib/ckan/datapusher/lib/python3.8/site-packages/apscheduler/scheduler.py", line 512, in _run_job
    retval = job.func(*job.args, **job.kwargs)
  File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 432, in push_to_datastore
    existing = datastore_resource_exists(resource_id, api_key, ckan_url)
  File "/usr/lib/ckan/datapusher/src/datapusher/datapusher/jobs.py", line 228, in datastore_resource_exists
    raise HTTPError(
datapusher.jobs.HTTPError: <unprintable HTTPError object>
```
Version: Ubuntu package 2.9.3-py3-focal1
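As an aside, the "<unprintable HTTPError object>" and the secondary TypeError come from ckanserviceprovider calling json.dumps on an error object that carries a raw requests Response, which the default JSON encoder cannot serialize. Here is a minimal runnable sketch of that failure mode and a workaround; the HTTPError fields and helper names are assumptions for illustration, not DataPusher's actual class:

```python
import json


class HTTPError(Exception):
    """Hypothetical stand-in for datapusher.jobs.HTTPError; assumed to carry
    the raw HTTP response object alongside the message."""

    def __init__(self, message, status_code, request_url, response):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.request_url = request_url
        self.response = response  # e.g. a requests.Response: not JSON serializable


def json_safe_error(err: HTTPError) -> dict:
    """Drop the unserializable response so the error can be stored as JSON."""
    return {
        "message": err.message,
        "status_code": err.status_code,
        "request_url": err.request_url,
    }


class FakeResponse:
    """Placeholder for requests.Response, just to keep the sketch self-contained."""


err = HTTPError("datastore_search failed", 504,
                "http://example.org/api/3/action/datastore_search",
                FakeResponse())

try:
    # Roughly what _update_job does with job_dict["error"] in the trace above.
    json.dumps({"error": err.__dict__})
except TypeError as e:
    print(e)  # Object of type FakeResponse is not JSON serializable

print(json.dumps(json_safe_error(err)))  # succeeds
```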