ufal / treex

Treex NLP framework

Wrong file name for fatal error logs in parallel execution? #30

Closed by tuetschek 8 years ago

tuetschek commented 8 years ago

Recently, whenever I process something on the cluster and one of the jobs hits a fatal error, I get messages similar to this one:

TREEX-INFO:  4832.295:  Fatal error found in job 60, document 34331
grep: ./003-cluster-run-tMIXZ/error/doc0034438.stderr: No such file or directory
tail: cannot open ‘./003-cluster-run-tMIXZ/error/doc0034438.stderr’ for reading: No such file or directory

I had to track down the error log myself, and found that it is actually located in ./003-cluster-run-tMIXZ/output/doc0034438.stderr. Is it supposed to be there?

I have located the code responsible for finding the error log here, and it looks very strange: what's with the -d test? Shouldn't it be -e or -f?
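
To illustrate the question, here is a minimal sketch of how the three file tests differ. It is written in shell, where -d, -e and -f have the same meaning as in Perl. The helper describe_path, the temporary directory, and the file name doc0000001.stderr are made up for illustration and are not taken from the Treex code.

```shell
#!/bin/sh
# Print which of the -e, -f and -d tests succeed for a given path.
describe_path() {
    result=""
    [ -e "$1" ] && result="$result -e"   # path exists (any type)
    [ -f "$1" ] && result="$result -f"   # path is a regular file
    [ -d "$1" ] && result="$result -d"   # path is a directory
    echo "${1##*/}:${result:- none}"
}

# Hypothetical layout: one directory and one log file inside a temp dir.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/error"
: > "$tmpdir/doc0000001.stderr"

describe_path "$tmpdir/doc0000001.stderr"   # doc0000001.stderr: -e -f
describe_path "$tmpdir/error"               # error: -e -d
describe_path "$tmpdir/missing.stderr"      # missing.stderr: none

rm -rf "$tmpdir"
```

A -d test succeeds only for a directory, so if it were applied to the .stderr path itself it would fail even when the log exists. Whether the linked code tests the file or its parent directory is not something this sketch can tell.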