Closed hexylena closed 3 years ago
Fix for Pulsar to transfer extra files in the DESeq2 wrapper - https://github.com/galaxyproject/tools-iuc/pull/3420/files
Pulsar seems to be broken with globs in from_work_dir outputs: galaxyproject/pulsar#239
Blast wrappers don't work on custom dbs: the db file is transferred but the indices for the db are not. I've found that unless files are explicitly named on the command line, they are not transferred.
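To make the missing-index problem concrete, here is a minimal sketch of how the companion files of a database could be collected next to the database file itself. It is only an illustration, not how Pulsar decides what to stage: the function name `files_to_stage` is hypothetical, and it assumes the index files sit in the same directory and share the database file name as a prefix (for example `mydb.fasta.nhr`, `mydb.fasta.nin`).

```python
from pathlib import Path


def files_to_stage(db_path):
    """Return db_path plus every sibling file named '<db name>.<something>'.

    Hypothetical helper for illustration only; Pulsar's real staging
    logic is not shown in this thread.
    """
    db = Path(db_path)
    prefix = db.name + "."
    # Sibling files that extend the database file name, e.g. mydb.fasta.nhr
    siblings = sorted(
        p for p in db.parent.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
    # Include the database file itself only if it actually exists on disk
    return ([db] if db.is_file() else []) + siblings
```

With a list like this, every file the tool needs is named explicitly instead of only the one path that appears on the command line.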
There is an infrequent uncaught error transferring files from galaxy to pulsar:
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: 2021-02-24 02:19:36,847 ERROR [pulsar.managers.stateful][[manager=_default_]-[action=preprocess]-[job=2101382]] Failed job preprocessing for job 2101382:
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: Traceback (most recent call last):
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/managers/stateful.py", line 120, in _handling_of_preprocessing_state
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: yield
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/managers/stateful.py", line 111, in do_preprocess
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: preprocess(job_directory, setup_config, self.__preprocess_action_executor, object_store=self.object_store)
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/managers/staging/pre.py", line 19, in preprocess
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: action_executor.execute(lambda: action.write_to_path(path), "action[%s]" % description)
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/managers/util/retry.py", line 42, in execute
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: return _retry_over_time(
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/managers/util/retry.py", line 93, in _retry_over_time
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: return fun(*args, **kwargs)
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/managers/staging/pre.py", line 19, in <lambda>
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: action_executor.execute(lambda: action.write_to_path(path), "action[%s]" % description)
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/client/action_mapper.py", line 465, in write_to_path
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: get_file(self.url, path)
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: File "/mnt/pulsar/venv/lib/python3.8/site-packages/pulsar/client/transport/curl.py", line 93, in get_file
Feb 24 02:19:36 pulsar-mel3 pulsar[2763306]: c.perform()
Feb 24 02:19:37 pulsar-mel3 pulsar[2763306]: pycurl.error: (18, 'transfer closed with 1355529374 bytes remaining to read')
This is not predictable from the inputs: often the user will submit a new job with the same inputs and the error does not occur.
@cat-bro You can tell Pulsar to retry interrupted transfers like this. If you aren't using that yet, give it a shot and see if it helps.
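For anyone unfamiliar with the idea, retrying an interrupted transfer amounts to something like the sketch below: call the transfer again after a growing pause until it succeeds or the retry budget runs out. This is a generic illustration, not Pulsar's code (the traceback above goes through Pulsar's own `_retry_over_time`); the function name and all parameter names here are assumptions made for the example.

```python
import time


def retry_over_time(fun, catch, max_retries=5, interval_start=2.0,
                    interval_step=2.0, interval_max=30.0, sleep=time.sleep):
    """Call fun() until it succeeds or max_retries retries have been used.

    Illustrative sketch only; names and defaults are assumptions,
    not Pulsar's configuration options.
    """
    interval = interval_start
    attempt = 0
    while True:
        try:
            return fun()
        except catch:
            if attempt >= max_retries:
                # Retry budget exhausted: let the last error propagate
                raise
            attempt += 1
            # Wait a little longer before each retry, capped at interval_max
            sleep(min(interval, interval_max))
            interval += interval_step
```

A transient failure such as the `pycurl.error` above would then only fail the job if it kept happening on every retry.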
Another issue I've come across a number of times: when Galaxy runs a job locally, it creates all of the expected output files (empty) at the beginning of the job. Pulsar doesn't do this, so when a tool fails for whatever reason, Pulsar sometimes can't find the output files to transfer back. This is especially the case where users can select the outputs they want: if the tool doesn't provide one, then... In Galaxy it's OK as we have an empty file.
Adding to the above issue - where does output filtering occur in Galaxy? After the job has completed? Does Galaxy create all the output files and then filter them at the end, or does it create only the ones expected by the filters? Pulsar seems to want to send back to Galaxy files that don't exist because they would normally be filtered out (using output filters).
from_work_dir: Glob/wildcard in from_work_dir outputs doesn't work (#239), and outputs_to_job_dir doesn't work: Pulsar does not find collection files (#212) - FIXED with PR #257
No stderr/stdout viewable, so users cannot see logs when tools crash: Missing stderr/stdout on pulsar jobs (#211) - FIXED with PR #258
Hey @Slugger70 did you (or maybe @cat-bro) verify that collection outputs are valid and not just green and empty? I'm not sure how #257 or #258 fixes #212, but I can imagine how #258 might cause green empty datasets. Although I don't know why collection outputs work so who knows.
Another issue I've come across a number of times: when Galaxy runs a job locally, it creates all of the expected output files (empty) at the beginning of the job. Pulsar doesn't do this, so when a tool fails for whatever reason, Pulsar sometimes can't find the output files to transfer back. This is especially the case where users can select the outputs they want: if the tool doesn't provide one, then... In Galaxy it's OK as we have an empty file.
For the record, this should be fixed by #257 for from_work_dir outputs. I just did a quick check and, without any of the new PRs, Pulsar doesn't precreate defined, non-from_work_dir outputs, but it also doesn't force a job failure if the tool fails to create any of the defined outputs; that only happened with from_work_dir ones.
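For comparison, the "precreate the expected outputs" behaviour described for local Galaxy jobs boils down to something like the sketch below. It is a rough illustration under assumptions, not Galaxy's or Pulsar's actual code: the function name `precreate_outputs` is made up, and it simply touches every expected output path that does not exist yet.

```python
from pathlib import Path


def precreate_outputs(output_paths):
    """Create an empty file for each expected output that is missing.

    Hypothetical helper for illustration; returns the paths it created.
    """
    created = []
    for raw in output_paths:
        path = Path(raw)
        # Make sure the containing directory exists before touching the file
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

With empty placeholders in place, a tool that skips one of its declared outputs still leaves a file behind to collect, which matches the local behaviour described above.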
Remaining issues are on the Admin project board so I don't think we need to keep this growing/mutating issue open indefinitely as that is what the project board is for.
Globs in from_work_dir don't work #239 FIXED IN #257
from_work_dir outputs combined with the outputs_to_job_dir Galaxy setting doesn't work #193
MetadataFiles (.bam.bai, blastdb) are not transferred/rewritten. From @slugger70: Blast wrappers don't work on custom dbs. db file is transferred but the indices for the db are not. https://github.com/galaxyproject/pulsar/issues/169