Open gpetretto opened 4 hours ago
If I can't see anything obvious causing the issues in #160, I'll try to hack this feature in as a first draft -- though I'd be surprised if its absence is the cause of my current woes.
In general, this feature would be useful given the peculiarities of parallel file systems. It's probably better to err on the side of waiting a very long time to download files that we expect to appear even if the job errored, i.e., the jfremote outputs.
From a discussion in qtoolkit (https://github.com/Matgenix/qtoolkit/pull/43) it emerged that it may be convenient to delay the download step after job completion in case of slow NFS on the worker. This should be possible by adding a `delay_download` option for the worker and, in that case, setting the `retry_time_limit` when a job goes into the `TERMINATED` state.
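
To make the proposal concrete, here is a minimal sketch of how the delay could work. It is not taken from the jobflow-remote code base: `WorkerConfig`, `JobRecord`, `mark_terminated` and `ready_for_download` are hypothetical names used only for illustration, and it assumes `delay_download` is a number of seconds and `retry_time_limit` is a timestamp before which the download step is not attempted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class WorkerConfig:
    # Hypothetical option: seconds to wait after the job terminates
    # before attempting the download. None means no delay.
    delay_download: Optional[int] = None


@dataclass
class JobRecord:
    # Simplified stand-in for a job document (assumption, not the real schema).
    state: str
    retry_time_limit: Optional[datetime] = None


def mark_terminated(
    job: JobRecord, worker: WorkerConfig, now: Optional[datetime] = None
) -> JobRecord:
    """Move the job to TERMINATED and, if the worker asks for it, postpone the download."""
    now = now or datetime.now(timezone.utc)
    job.state = "TERMINATED"
    if worker.delay_download:
        job.retry_time_limit = now + timedelta(seconds=worker.delay_download)
    return job


def ready_for_download(job: JobRecord, now: Optional[datetime] = None) -> bool:
    """Return True once a TERMINATED job has passed its retry_time_limit (if any)."""
    now = now or datetime.now(timezone.utc)
    if job.state != "TERMINATED":
        return False
    return job.retry_time_limit is None or now >= job.retry_time_limit
```

With this shape, a worker without `delay_download` behaves as before, while a worker on a slow NFS simply has its jobs skipped by the download step until the time limit has passed.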