Closed gilfreund closed 5 years ago
Hi @gilfreund,
I don't think that `treepath = "null"` is interpreted literally. Indeed, in the code, there is an `if` statement that, when `treepath = "null"`, runs `alpha_diversity.py` without the Newick tree.
Could you please share with me your command-line instruction, and the `.command.run` file?
Many thanks, Alessia
From looking at the Nextflow log, I now think that alphaDiversity isn't even starting.
nextflow.log
In the log, at line 382, I see that Nextflow is trying to stage a file:

```
DEBUG nextflow.file.FilePorter - Copying foreign file /efs/emendo/yamp/tests/batch/null to work dir: s3://emendobio/yamp/stage/95/15bf8822c48b6f717360186357c66d/null
```
and then alphaDiversity fails:

```
Sep-13 09:54:35.287 [Actor Thread 41] ERROR nextflow.processor.TaskProcessor - Error executing process > 'alphaDiversity (1)'
Caused by: Can't stage file file:///efs/emendo/yamp/tests/batch/null -- file does not exist
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
```
So I probably have a configuration error that is preventing staging, or something is missing from a previous step. In any case, a `.command.run` file is not even created.
You may have hit a bug. Could you please share your command-line instruction with me? Could you also send me some more information/data so I can replicate your issue?
Nextflow is version 19.07.0.5106
The command line is:

```
nextflow run YAMP.nf --reads1 /efs/emendo/yamp/data/ERR011089_1.fastq.gz --reads2 /efs/emendo/yamp/data/ERR011089_2.fastq.gz --prefix META_ERR011089 --mode complete -bucket-dir s3://emendobio/yamp
```
I am including the YAMP.nf and nextflow.config files (in case I have made a mistake in the configuration, such as paths), as well as the Dockerfile I use (mainly to set the user ID): yamp.zip
I configured an AWS Batch job definition to handle the volumes I thought I might need (such as the scratch space, if needed).
I used the files you pointed to in your documentation, just in case we had some corruption on our side.
The work bucket is an empty folder in one of our buckets.
Let me know if any additional information is required.
Thanks Gil
I think I have found the root cause. When awsbatch is the executor, Nextflow will try to stage files from S3. The S3 cp command then returns an error, as there is no null file.
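The failure mode can be sketched in a few lines of shell. This is an illustrative mimic of what the "Can't stage file" error amounts to, not Nextflow's actual code; the `stage` function name is hypothetical:

```shell
# Mimic of the staging step: before a task runs, each input path is
# copied into the work dir. A path that does not exist on disk cannot
# be staged, and with treepath left at its default, the literal string
# "null" is what gets treated as that path.
cd "$(mktemp -d)"    # empty scratch dir, so "null" definitely does not exist

stage() {
  if [ -e "$1" ]; then
    echo "staged $1"
  else
    echo "Can't stage file $1 -- file does not exist"
  fi
}

stage null   # prints: Can't stage file null -- file does not exist
```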
I used the hello world example from the Nextflow site with some changes:
```
params.treepath = "null"
hello_txt = Channel.fromPath(params.treepath)

process splitLetters {
    input:
    file(hello_file) from hello_txt

    output:
    file 'chunk_*' into letters mode flatten

    """
    cat ${hello_file} | split -b 6 - chunk_
    """
}

process convertToUpper {
    input:
    file x from letters

    output:
    stdout result

    """
    if [ $params.treepath == null ]
    then
        echo params.treepath == null
    else
        echo "using param.treepath $params.treepath"
        cat $x | tr '[a-z]' '[A-Z]'
    fi
    """
}

result.println { it.trim() }
```
If I run it with the local executor and don't provide a treepath on the command line, it fails in the `splitLetters` script, in which I did not provide handling for the null value:
```
N E X T F L O W ~ version 19.07.0
Launching `./hello1.nf` [stoic_austin] - revision: 6f4900cde8
executor > local (1)
[56/83a7bd] process > splitLetters (1) [100%] 1 of 1, failed: 1 ✘
[- ] process > convertToUpper -
Error executing process > 'splitLetters (1)'

Caused by:
  Missing output file(s) `chunk_*` expected by process `splitLetters (1)`

Command executed:
  cat null | split -b 6 - chunk_

Command exit status:
  0

Command output:
  (empty)

Command error:
  cat: null: No such file or directory

Work dir:
  /home/ec2-user/nextflow/work/56/83a7bd87813f3cb35d8f1ad7db10bc

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
```
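Note the exit status of 0 above despite `cat` failing: in a shell pipeline, the reported status is that of the last command, so `split` masks the `cat` error, and the process only fails because the expected `chunk_*` outputs are missing. A quick sketch of that behaviour:

```shell
cd "$(mktemp -d)"                        # empty dir: no file named "null"

# cat fails, but split (reading empty stdin) exits 0 and creates no
# output files; the pipeline's status is split's, not cat's.
cat null 2>/dev/null | split -b 6 - chunk_
status=$?
echo "pipeline exit status: $status"     # prints: pipeline exit status: 0
ls chunk_* 2>/dev/null || echo "no chunk files were produced"
```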
When I do the same with awsbatch as the executor, it fails during staging, the same as the issue I originally encountered:
```
N E X T F L O W ~ version 19.07.0
Launching './hello1.nf' [awesome_turing] - revision: 6f4900cde8
[- ] process > splitLetters -
[- ] process > convertToUpper -
Error executing process > 'splitLetters (1)'

Caused by:
  Can't stage file file:///home/ec2-user/nextflow/null -- file does not exist

Tip: view the complete command output by changing to the process work dir and entering the command 'cat .command.out'
```
I am new to Nextflow, so I am not sure how to handle this scenario, but I will research and update.
I am very new to AWS Batch (I have honestly used it only once, to test YAMP), so I am afraid I cannot be of much help. You could try asking in the Nextflow Gitter; you would surely find some help there.
Thanks for keeping me posted!
There is a gap between the local and awsbatch handling of optional files (see: https://github.com/nextflow-io/nextflow/issues/1233).
I followed a workaround derived from the discussion there (see: https://gitmemory.com/issue/nextflow-io/nextflow/1233/515925532) and made the following change to the YAMP.nf alphaDiversity step:
```
opt_file = params.treepath

process alphaDiversity {
    publishDir workingdir, mode: 'move', pattern: "*.{tsv}"

    input:
    file(infile) from toalphadiversity
    file opt from opt_file

    output:
    file ".log.8" into log8
    file "${params.prefix}_alpha_diversity.tsv"

    when:
    params.mode == "characterisation" || params.mode == "complete"

    script:
    def treepath = opt.name != 'null' ? "--treepath $opt" : ''
```
I pass the `params.treepath` as a variable, which stops the AWS engine from trying to stage it. In a local run there is no staging, so there is no adverse effect. I then pass the variable to the script.
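The effect of the `opt.name != 'null'` ternary above can be sketched in plain shell; the function name here is hypothetical (the real logic lives in the Groovy `script:` block):

```shell
# Build the optional --treepath flag only when a real tree file was
# given; the string "null" acts as the sentinel for "no tree provided".
build_treepath_flag() {
  if [ "$1" != "null" ]; then
    echo "--treepath $1"
  else
    echo ""
  fi
}

build_treepath_flag null       # prints an empty line: flag omitted
build_treepath_flag tree.nwk   # prints: --treepath tree.nwk
```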
Another suggestion I saw was to create an empty file called something like `no_file` and point `treepath` at it (see: https://github.com/nextflow-io/nextflow/issues/1233#issuecomment-513121438).
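A minimal sketch of that placeholder approach (the `no_file` name and its location are illustrative):

```shell
# Create a real, empty placeholder so every executor -- including
# awsbatch -- has an actual file to stage.
scratch="$(mktemp -d)"
touch "$scratch/no_file"

# The pipeline would then be launched with something like (assumed usage):
#   nextflow run YAMP.nf ... --treepath "$scratch/no_file"

[ -s "$scratch/no_file" ] || echo "placeholder exists and is empty"
```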
Note that AWS Batch did report an issue about a missing file, but it was masked from Nextflow, and the run completed successfully, as far as I can see.
I think a comment in the wiki may be in order, but not a code change, as I did not have a chance to check this with other executors, and Nextflow will most likely address this in a later version.
Hi @gilfreund, I have added it to the Troubleshooting.
Thanks a lot!
Hi, running in an AWS Batch environment, the run fails with the message: