Open neoformit opened 1 year ago
Rebased against master
Looks good to me, thanks! I suspect this would not result in a "green-but-actually-failed" job since Galaxy will still expect the output to exist in its own finish method, did you happen to include that in testing?
Haven't tested that Nate, @cat-bro is the only one who got this running on an actual Pulsar server.
Unfortunately, I can't figure out why the tests are failing. Apparently this change causes the output file transfer test to fail, which I guess must mean that the source file path doesn't exist in the test case, which doesn't make any sense to me... I spent some time looking at the tests and couldn't figure it out or debug it.
@mvdbeek I don't suppose you'd be able to take a look at this? I assume the explanation lies in https://github.com/galaxyproject/pulsar/blob/master/pulsar/client/test/check.py. Maybe there is a test that checks whether a transfer will retry when the source file is missing on the first attempt...?
I haven't forgotten about this, but I didn't think this was quite the right layer to fix this at. You'd probably want to parameterize this on the job state.
Sorry @mvdbeek I'm not sure what you mean. Only skip retrying if the job has failed and the output file doesn't exist? Isn't job state decided by Galaxy and not Pulsar anyway? :thinking:
Isn't job state decided by Galaxy and not Pulsar anyway?
Pulsar knows the exit code, but you're right, we don't really know when that's a failure or not. I'll try getting back to it.
Thanks! I'm interested to see what you find.
Potentially resolves https://github.com/galaxyproject/pulsar/issues/298. Tested manually on one of our dev pulsar nodes by @cat-bro.