gdevenyi closed this issue 8 years ago
Why the special cases? Parallel doesn't add that much overhead ;-)
The main issue is empty log files until completion, since parallel buffers stdout.
Could solve this with the line-buffering options in parallel, I guess.
Yeah, we could just use parallel --ungroup for single-command jobs.
If we want to avoid special casing then we should go the whole way I think.
If we default to "--line-buffer" then we get continuous output in all cases (good, because being able to watch log file progress is good), 1-1 output in single-command cases (perfect), and output mixed line by line in parallel cases (bad).
Thoughts?
I'm not that interested in us providing real-time log progress... It doesn't work that way in all batch systems anyway, and it's easy enough to do yourself if you need it by emitting to a separate log file, not stdout/stderr. It also seems complex to untangle multiple processes' outputs ourselves if we try for real-time output (--line-buffer, if I understand correctly, still seems like it would produce messy output, although perhaps with an intelligent use of --tag/--tagstring this would be doable?).
--line-buffer will indeed mix different outputs together
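To make the trade-off concrete, here is a toy model in Python. It only imitates the ordering of grouped versus line-buffered output; it does not call or reimplement GNU parallel, and the job names and timings are made up for illustration.

```python
# Toy model only: imitates how grouped vs. line-buffered output is ordered.
# It does not invoke GNU parallel; jobs "A" and "B" and their timings are invented.

# Each event is (time, job, line).
events = [
    (0, "A", "A: start"),
    (1, "B", "B: start"),
    (2, "A", "A: done"),
    (3, "B", "B: done"),
]


def grouped(events):
    """Hold back each job's lines and emit them together when the job ends."""
    finish = {}
    for t, job, _ in events:
        finish[job] = max(t, finish.get(job, t))
    out = []
    # Jobs are emitted in order of completion, each as one contiguous block.
    for job in sorted(finish, key=finish.get):
        out.extend(line for _, j, line in events if j == job)
    return out


def line_buffered(events):
    """Emit whole lines as soon as they are produced, so jobs interleave."""
    return [line for _, _, line in sorted(events)]


if __name__ == "__main__":
    print("grouped:      ", grouped(events))
    print("line-buffered:", line_buffered(events))
```

With a single job the two orderings are identical, which is why the single-command case is the easy one.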
My reading of --tag/--tagstring is that it's not applicable to what we're doing, but we could do some testing.
I guess not providing real-time progress makes some sense, since this is a cluster tool anyway.
I special-cased the instance of a single input command, but chunksize == 1 should also be handled without GNU parallel. Fix this.
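A minimal sketch of what the fix could look like, assuming the tool splits a command list into chunks and writes one shell script per chunk. The helper names `chunk` and `wrap_chunk`, the `cores` parameter, and the heredoc marker are hypothetical and not taken from the actual code; the only GNU parallel behaviour relied on is `-j` and reading one command per line from stdin.

```python
def chunk(commands, chunksize):
    """Split the command list into consecutive chunks of at most chunksize."""
    return [commands[i:i + chunksize] for i in range(0, len(commands), chunksize)]


def wrap_chunk(commands, cores=1):
    """Return the shell text for one chunk of commands (hypothetical helper)."""
    commands = [c for c in commands if c.strip()]
    if not commands:
        raise ValueError("chunk contains no commands")
    if len(commands) == 1:
        # Covers both a single input command and chunksize == 1:
        # run the command directly so parallel never buffers its output.
        return commands[0] + "\n"
    # Several commands: hand them to GNU parallel on stdin, one per line.
    body = "\n".join(commands)
    return "parallel -j{} << 'EOF_CMDS'\n{}\nEOF_CMDS\n".format(cores, body)


if __name__ == "__main__":
    cmds = ["echo one", "echo two", "echo three"]
    for script in (wrap_chunk(c, cores=2) for c in chunk(cmds, 1)):
        print(script, end="")
```

Keying the special case on the length of each chunk rather than on the total number of input commands handles both situations with one check.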