galaxyproject / pulsar

Distributed job execution application built for Galaxy
https://pulsar.readthedocs.io
Apache License 2.0

Fail jobs when output files exist but stage out fails #339

Open natefoo opened 10 months ago

natefoo commented 10 months ago

Since #258, we never fail a job when workdir outputs cannot be staged out, regardless of the cause. This is good for getting back stdout/stderr when the tool fails for valid reasons and does not produce workdir outputs. However, it also masks stage-out failures: a successful job that fails to stage out its data shows up as green/ok in Galaxy but with zero-length outputs.

This change still allows the job to succeed if workdir outputs are missing (more correctly, error handling for that case still falls to Galaxy), but if staging out an existing output fails due to a transport error, the job is failed.
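To illustrate the distinction, here is a minimal sketch of the decision described above. It is not Pulsar's actual code: `TransportError`, `stage_out`, `job_should_fail`, and the `transfer` callable are all hypothetical names chosen for this example, and the sketch assumes a missing output can be detected with a plain existence check.

```python
import os


class TransportError(Exception):
    """Hypothetical stand-in for a failure while transferring an existing file."""


def stage_out(outputs, transfer):
    """Try to stage out each path; return (missing, failed) lists.

    A path that does not exist is recorded as missing and skipped, so
    handling it is left to the caller (Galaxy, in the terms used above).
    A path that exists but cannot be transferred is recorded as failed.
    """
    missing, failed = [], []
    for path in outputs:
        if not os.path.exists(path):
            missing.append(path)
            continue
        try:
            transfer(path)  # hypothetical transport action (HTTP, rsync, ...)
        except TransportError:
            failed.append(path)
    return missing, failed


def job_should_fail(outputs, transfer):
    """Fail the job only when an existing output could not be transferred."""
    _missing, failed = stage_out(outputs, transfer)
    return bool(failed)
```

In this sketch, a job whose tool produced no outputs is not failed by stage out, while a job whose output exists but hits a transport error is.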

In my testing you still get back both job and tool stdout and stderr.

Draft because I'd like to have a custom client message on the Galaxy side for this if possible. Right now you get our old friend "Remote job server indicated a problem running or monitoring this job."