Closed jelmervdl closed 4 years ago
Well, that's strange. b64filter
is actually hanging when it tries to put a line count on the queue and the queue is full, as it will be if there's been nothing to consume the input.
At this point the process has exited but has not been waited for, it still exists and is found by os.FindProcess
. The cmd.ProcessState
isn't filled in yet because we haven't waited. We haven't waited for it because we must wait after consuming its output, which of course we haven't done because the process isn't making any.
Oddly, writing to the input of the process when it is like this does not produce an error. I thought it would (see around https://github.com/paracrawl/b64filter/blob/master/b64filter.go#L245). That might be a Go bug.
Need to figure out how to check there that the process isn't a zombie.
My initial idea was to just put the reading & queueing of docs in a goroutine as well and have the main thread only start those reading & writing goroutines and then proc.wait on the process to quit. If it does, and it wasn't happy, cancel/abort the reading and try to flush the writing and then fail appropriately?
But I am not experienced enough with Go to just go on and implement it instead of reporting the bug.
Actually, I think it's simpler than that. Before we used an output buffer, we had to deal with the behaviour of cmd.Wait
which wanted all of the output to be consumed first. That is no longer the case, according to https://golang.org/pkg/os/exec/#Cmd.Wait. So I think we can get rid of the done
signalling channel on the output goroutine entirely, as well as the code that waits for the output queue to drain, and simply Wait
normally.
Try that on for size.
go get -u github.com/paracrawl/b64filter
[cs-vand1@login-e-12 ~]$ gzip -cd ~/hieu/nn/0/1/sentences.gz | b64filter parallel --halt 2 -j 16 --pipe -k -l 1024 false
parallel: This job failed:
false
2020/03/11 10:04:15 b64filter.go:257: error waiting for command: exit status 1
👍
turns out, the silly dance was there for a reason...
I've used the code in #4 to translate about 2/3 of the languages in Hieu's data and it seems to hold up. Any issues I've had with it I can now relate to other causes.
Still, the patch is only half a solution. b64filter will still break if the captured program doesn't fail hard with a non-zero exit code, but quietly produces incorrect output.
I've been able to reproduce it with this command:
parallel
will indicate thatfalse
failed, but b64filter won't return and hang indefinitely.It only happens for small sets of input sentences. If I try this with a sentences.gz from English, it will fail and stop on a pipe error.