galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Automatically disable `outputs_to_working_directory` when running jobs with pulsar #12123

Open mvdbeek opened 3 years ago

mvdbeek commented 3 years ago

Setting `outputs_to_working_directory` is not compatible with running jobs in Pulsar (and there is probably no point, since the jobs run remotely anyway). With both in effect, the job fails with a missing output file in the working directory:

Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu pulsar[146179]: 2021-06-10 09:08:36,423 DEBUG [pulsar.client.amqp_exchange][[manager=_default_]-[action=postprocess]-[job=770]] [publish:7047c050-c9cb-11eb-bd0a-d770eda86e64] Have producer for publishing to key pulsar__status_update
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu pulsar[146179]: 2021-06-10 09:08:36,423 DEBUG [pulsar.client.amqp_exchange][[manager=_default_]-[action=postprocess]-[job=770]] [publish:7047c050-c9cb-11eb-bd0a-d770eda86e64] Published to key pulsar__status_update
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu uwsgi[442169]: pulsar.client.manager DEBUG 2021-06-10 09:08:36,424 [p:442169,w:0,m:2] [pulsar_client__default__status_update_callback] Handling asynchronous status update from remote Pulsar.
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu uwsgi[442169]: galaxy.jobs WARNING 2021-06-10 09:08:36,429 [p:442169,w:0,m:2] [pulsar_client__default__status_update_callback] (770) Job runner URLs are deprecated, use destinations instead.
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu uwsgi[442169]: galaxy.jobs.runners.pulsar DEBUG 2021-06-10 09:08:36,429 [p:442169,w:0,m:2] [pulsar_client__default__status_update_callback] (770) Received status update: <class 'str'> complete
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu uwsgi[442169]: pulsar.client.staging.down DEBUG 2021-06-10 09:08:36,484 [p:442169,w:0,m:2] [PulsarJobRunner.work_thread-1] Cleaning up job (failed [False], cleanup_job [onerror])
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu uwsgi[442169]: galaxy.jobs.runners DEBUG 2021-06-10 09:08:36,504 [p:442169,w:0,m:2] [PulsarJobRunner.work_thread-1] executing external set_meta script for job 770: GALAXY_LIB="/srv/galaxy/server/lib"; if [ "$GALAXY_LIB" != "None" ]; then if [ -n "$PYTHONPATH" ]; then PYTHONPATH="$GALAXY_LIB:$PYTHONPATH"; else PYTHONPATH="$GALAXY_LIB"; fi; export PYTHONPATH; fi; GALAXY_VIRTUAL_ENV="/srv/galaxy/venv"; if [ "$GALAXY_VIRTUAL_ENV" != "None" -a -z "$VIRTUAL_ENV" -a -f "$GALAXY_VIRTUAL_ENV/bin/activate" ]; then . "$GALAXY_VIRTUAL_ENV/bin/activate"; fi; python "metadata/set.py"
Jun 10 09:08:36 gat-16.be.training.galaxyproject.eu uwsgi[442164]: [pid: 442164|app: 0|req: 392/392] 85.149.66.255 () {50 vars in 1277 bytes} [Thu Jun 10 09:08:36 2021] GET /api/histories/7b1543154e1ea7b2/contents?details=198bc22c2b022bb6%2C5112448398a6eac3%2C827604fb34b01056&order=hid&v=dev&q=update_time-ge&q=deleted&q=purged&qv=2021-06-10T09%3A08%3A32.000Z&qv=False&qv=False => generated 1590 bytes in 19 msecs (HTTP/1.1 200) 3 headers in 139 bytes (1 switches on core 3)
Jun 10 09:08:38 gat-16.be.training.galaxyproject.eu uwsgi[442169]: galaxy.jobs.runners DEBUG 2021-06-10 09:08:38,209 [p:442169,w:0,m:2] [PulsarJobRunner.work_thread-1] execution of external set_meta for job 770 finished
Jun 10 09:08:38 gat-16.be.training.galaxyproject.eu uwsgi[442169]: galaxy.jobs ERROR 2021-06-10 09:08:38,229 [p:442169,w:0,m:2] [PulsarJobRunner.work_thread-1] fail(): Missing output file in working directory: [Errno 2] No such file or directory: '/srv/galaxy/jobs/000/770/outputs/galaxy_dataset_a9a21294-19c4-4425-a951-2d95293c73b3.dat'

We should just disable `outputs_to_working_directory` automatically when a job runs in Pulsar. It is not obvious from the logs that you need to disable `outputs_to_working_directory` in your Pulsar destinations, so doing it automatically should help admins.
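As a rough illustration of the proposal (not Galaxy's actual code), the check could look something like the sketch below. The function name `effective_outputs_to_working_directory` and the assumption that Pulsar runners can be recognized by a runner name starting with `pulsar` are both hypothetical; the real job wrapper may identify Pulsar destinations differently.

```python
def effective_outputs_to_working_directory(runner_name: str, global_setting: bool) -> bool:
    """Return the value of outputs_to_working_directory to use for one job.

    Hypothetical helper: assumes a Pulsar runner can be recognized by a
    runner name that starts with "pulsar".
    """
    if runner_name.startswith("pulsar"):
        # Jobs run remotely, so outputs are staged back by Pulsar and
        # never appear in the local working directory.
        return False
    # Any other runner keeps whatever the admin configured globally.
    return global_setting


# A Pulsar job ignores the global setting; a local job keeps it.
print(effective_outputs_to_working_directory("pulsar_runner", True))
print(effective_outputs_to_working_directory("local", True))
```

With something like this in place, an admin who enables the option globally would no longer hit the "Missing output file in working directory" failure on Pulsar destinations.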

bgruening commented 3 years ago

That is still relevant with your latest remote work, isn't it?

natefoo commented 2 years ago

I should maybe create a separate issue for this, but we have so many already and it's related: we should automatically enable `outputs_to_working_directory` for any containerized non-Pulsar destinations, unless the admin explicitly configures it otherwise for those destinations.
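One way to picture the combined behavior is a small resolution function with an explicit precedence order. This is only a sketch under assumptions: the function name, its parameters, and the use of `None` to mean "the admin did not set the option on this destination" are all hypothetical and not taken from Galaxy's code.

```python
from typing import Optional


def resolve_outputs_to_working_directory(
    is_pulsar: bool,
    is_containerized: bool,
    destination_setting: Optional[bool],
    global_setting: bool,
) -> bool:
    """Hypothetical precedence for outputs_to_working_directory.

    destination_setting is None when the admin did not set the option
    on the destination (an assumption made for this sketch).
    """
    if is_pulsar:
        # Incompatible with Pulsar, so always disabled there.
        return False
    if destination_setting is not None:
        # An explicit per-destination value wins over any default.
        return destination_setting
    if is_containerized:
        # Containerized, non-Pulsar destinations default to enabled.
        return True
    # Otherwise fall back to the global configuration value.
    return global_setting


# Containerized destination with no explicit setting: enabled by default.
print(resolve_outputs_to_working_directory(False, True, None, False))
```

Putting the Pulsar check first means a stray per-destination `True` cannot reintroduce the failure reported above; whether that override should be silent or logged as a warning is a separate design decision.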

natefoo commented 2 years ago

Worse, we don't discuss the importance of this option anywhere, nor mention that it can be set on destinations. In general, I don't think we document that many global options can be set per destination, or which ones.

martenson commented 2 years ago

I agree. Setting up Pulsar and destinations is almost always non-standard, and an exhaustive overview of the options would go a long way.