Closed IdoBar closed 3 years ago
Seems to work alright in a fresh folder and with nextflow v20.10.0.5430
Hi @IdoBar thanks for the open and quick close.
It would be useful for us to know, however, whether you had already run nextflow pull eager -r 2.2.2 before you first tried running with -r 2.2.2 (when you got the error).
This has come up before. If you had not run pull first, I think we will need to improve our documentation to make it clear that you must use the pull command before running a specific version for the first time. So your feedback would be useful for us to decide whether we need to clarify this or not!
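For the troubleshooting docs, the pull-then-run order could be sketched roughly as below. This is only an illustration: the function name run_pinned_release and the NXF_BIN override variable are made up here (the override only exists so the sketch can be dry-run with echo), while nextflow pull and nextflow run with -r are the standard commands.

```shell
# Sketch: fetch a specific release before running it for the first time.
# NXF_BIN is a hypothetical override (not a Nextflow variable); default is "nextflow".
run_pinned_release() {
    local pipeline="$1" revision="$2"
    shift 2
    local nxf="${NXF_BIN:-nextflow}"

    # Step 1: download or update the local copy of the pipeline at that revision.
    "$nxf" pull "$pipeline" -r "$revision" || return 1

    # Step 2: run the same revision; any remaining arguments are passed through.
    "$nxf" run "$pipeline" -r "$revision" "$@"
}
```

For example, run_pinned_release nf-core/eager 2.2.2 -profile test_tsv would pull and then run the test profile; prefixing the call with NXF_BIN=echo only prints the two commands.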
Hi @jfy133,
I think that I tried to run it without pulling first.
Regardless of this, I still can't run the workflow (running on QRIS Awoonga).
It's submitting the jobs to the cluster, but all the jobs get terminated instantly with exit status 1 and nothing informative in the logs (see example below):
-[nf-core/eager] Pipeline completed with errors-
Error executing process > 'fastqc (D11_L1)'
Caused by:
Process `fastqc (D11_L1)` terminated with an error exit status (1)
Command executed:
fastqc -t 1 -q D11_1.sampled1M.trimmed.fq.gz D11_2.sampled1M.trimmed.fq.gz
rename 's/_fastqc\.zip$/_raw_fastqc.zip/' *_fastqc.zip
rename 's/_fastqc\.html$/_raw_fastqc.html/' *_fastqc.html
Command exit status:
1
Command output:
(empty)
Command wrapper:
########################### Execution Started #############################
JobId:507611.awonmgr2
UserName:ibar
GroupName:qris-gu
ExecutionHost:aw128
###############################################################################
########################### Job Execution History #############################
JobId:507611.awonmgr2
UserName:ibar
GroupName:qris-gu
JobName:nf-fastqc_D11_L
SessionId:72595
ResourcesRequested:mem=4096mb,ncpus=1,place=free,walltime=04:00:00
ResourcesUsed:cpupercent=0,cput=00:00:00,mem=0kb,ncpus=1,vmem=0kb,walltime=00:00:03
QueueUsed:Short
AccountString:qris-gu
ExitStatus:1
###############################################################################
Work dir:
/30days/ibar/data/Dingo/Dingo_aDNA_NF_process_20_12_2020/90/69516d849eaecd9557fed831af6a45
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
When I edit the .command.run file in the work dir to keep only the commands from nxf_stage() and nxf_main() and submit this to the cluster, it runs alright and produces the correct output.
My guess is that the jobs are getting killed somewhere in the process management functions (nxf_tree(), nxf_stat(), nxf_trace(), nxf_mem_watch(), etc.).
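When a task dies this early, it can help to check which of the hidden files Nextflow normally leaves in a task work dir were actually written. The helper below is a rough sketch of that check; the function name list_task_files is made up for illustration, and the file list is the usual set of task files rather than anything specific to this cluster.

```shell
# Sketch: report which of the usual Nextflow task files exist in a work dir.
# list_task_files is a hypothetical helper name, not part of Nextflow.
list_task_files() {
    local workdir="$1" f
    for f in .command.sh .command.run .command.begin \
             .command.out .command.err .command.log .exitcode; do
        if [ -e "$workdir/$f" ]; then
            echo "present $f"
        else
            echo "missing $f"
        fi
    done
}
```

Calling list_task_files with the path shown under "Work dir:" prints one line per file, which makes it easier to see how far the wrapper got before the job ended.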
I can create a new issue if needed, but I'll scan through the archived ones to see if I missed something.
These are the only errors I could find in the log file:
Dec.-21 00:42:19.210 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 507611.awonmgr2; id: 5; name: fastqc (D11_L1); status: COMPLETED; exit: 1; error: -; workDir: /30days/ibar/data/Dingo/Dingo_aDNA_NF_process_20_12_2020/90/69516d849eaecd9557fed831af6a45 started: 1608475334333; exited: 2020-12-20T14:42:15.474194Z; ]
Dec.-21 00:42:19.226 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'fastqc (D11_L1)' -- Cause: java.nio.file.NoSuchFileException: /30days/ibar/data/Dingo/Dingo_aDNA_NF_process_20_12_2020/90/69516d849eaecd9557fed831af6a45/.command.out
Dec.-21 00:42:19.229 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump error of process 'fastqc (D11_L1)' -- Cause: java.nio.file.NoSuchFileException: /30days/ibar/data/Dingo/Dingo_aDNA_NF_process_20_12_2020/90/69516d849eaecd9557fed831af6a45/.command.err
Dec.-21 00:42:19.264 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'fastqc (D11_L1)'
Thanks, Ido
Hey @IdoBar
Ok good to know! Then I think we can add that to the troubleshooting docs.
From what you describe, it seems like you could be using a misconfigured profile. Could you maybe send the whole .nextflow.log file, the command you used, and the custom profile (if you used one)?
Then we can maybe identify the problem. Typically, instant crashes come from things like not being able to find a container or being sent to the wrong partition.
Thanks for your reply @jfy133,
Please see the Dingo_samples.tsv file, the .json parameters file, my custom awoonga.config and the log file (.nextflow.log) in the attached zip file.
The command that I used is:
nextflow run nf-core/eager \
-r 2.2.2 \
-params-file Dingo_aDNA.CanFam3.1.bwaaln.gatkug.json \
-c /home/ibar/.nextflow/awoonga.config
Please note that nextflow is failing the same way when running the test set (-profile test_tsv), so it must be something in the config of singularity/executor.
Many thanks, Ido
Hi @IdoBar thanks for the info. I've edited your post to remove the log files now, as they included a personal token.
It looks like it's not a container issue as I first thought.
I've looked through the log file and noticed the following:
Dec.-21 00:42:19.226 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'fastqc (D11_L1)' -- Cause: java.nio.file.NoSuchFileException: /30days/ibar/data/Dingo/Dingo_aDNA_NF_process_20_12_2020/90/69516d849eaecd9557fed831af6a45/.command.out
Dec.-21 00:42:19.229 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump error of process 'fastqc (D11_L1)' -- Cause: java.nio.file.NoSuchFileException: /30days/ibar/data/Dingo/Dingo_aDNA_NF_process_20_12_2020/90/69516d849eaecd9557fed831af6a45/.command.err
This suggests there may be either a missing directory or a permissions issue. These are Nextflow-specific errors rather than nf-core/eager ones, as those two files are what Nextflow writes for you.
I looked in your profile and I also see that you have the boolean for scratch in quotes ('true' rather than true). I don't know if that would make a difference, but it may be something to try. You could also try temporarily specifying the scratch directory explicitly, as described here, to see if that fixes the issue.
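A quick way to spot quoted booleans like this in a custom config is a grep over the file. The sketch below is only illustrative: the function name find_quoted_booleans is made up, and whether a quoted value actually changes Nextflow's behaviour is not established in this thread.

```shell
# Sketch: list config lines where true/false is assigned as a quoted string,
# e.g.  scratch = 'true'   instead of   scratch = true
# find_quoted_booleans is a hypothetical helper name.
find_quoted_booleans() {
    local config="$1"
    # -n prints line numbers; the pattern matches = followed by 'true', "true",
    # 'false' or "false". grep exits non-zero when nothing matches.
    grep -nE "=[[:space:]]*['\"](true|false)['\"]" "$config"
}
```

For example, find_quoted_booleans /home/ibar/.nextflow/awoonga.config would print each matching line with its line number.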
Thanks, I tried it and it still fails.
Also there seem to be no issues producing the rest of the intermediate files in the same folder (.command.run, .command.sh, .command.log, .command.begin).
Any other ideas?
Just to check whether it's a nextflow issue, I'll look for a quick, easy workflow to test-run.
Thanks, Ido
Not for the moment, it's definitely a Nextflow issue rather than nf-core though (lucky for me :sweat_smile:).
You could also ask on the Nextflow gitter: https://gitter.im/nextflow-io/nextflow.
I'll let you know if I think of anything else!
Thanks @jfy133, I figured this out...
Apparently my .bashrc was loading the system /etc/bashrc, which in turn loaded a series of system-specific scripts from the folder /etc/profile.d/. Seems like one of those scripts was breaking the workflow.
I removed those lines from my .bashrc and now the workflow runs well (so far).
I'll dig in further to see exactly which of those scripts is causing the error and why.
EDIT
This is the offending script (/etc/profile.d/00-modulepath.sh):
[ -z "$MODULEPATH" ] && [ "$(readlink /etc/alternatives/modules.sh)" = "/usr/share/lmod/lmod/init/profile" -o -f /etc/profile.d/z00_lmod.sh ] && export MODULEPATH=/etc/modulefiles:/usr/share/modulefiles
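One way to narrow down which startup script misbehaves is to source each one in a subshell and print its return status. The sketch below does that; the function name check_profile_scripts is made up for illustration. A one-liner of the form [ test ] && export ... returns a non-zero status whenever the test is false, but whether that status is what actually broke the job wrapper here is an assumption, not something confirmed in this thread.

```shell
# Sketch: source every *.sh file in a directory in its own subshell and
# print "<return status> <file name>" for each.
# check_profile_scripts is a hypothetical helper name.
check_profile_scripts() {
    local dir="${1:-/etc/profile.d}" f rc
    for f in "$dir"/*.sh; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # The subshell keeps the script from changing the current shell.
        ( . "$f" ) >/dev/null 2>&1
        rc=$?
        echo "$rc $(basename "$f")"
    done
}
```

Running check_profile_scripts with no argument walks /etc/profile.d; any line that does not start with 0 points at a script worth reading more closely.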
Many thanks for your help, Ido
Glad to hear!
Check Documentation
I have checked the following places for your error:
Description of the bug
The most recent version (v2.2.2) is not identified by nextflow
Steps to reproduce
Steps to reproduce the behaviour:
nextflow run nf-core/eager -r 2.2.2 -profile test_tsv
Also when running nextflow pull nf-core/eager, this is the output:
Expected behaviour
Should pull and run the most recent version
Log files
This is the content of the .nextflow.log file:
Have you provided the following extra information/files:
.nextflow.log file
2.2.2 -- Make sure that it exists in the remote repository https://github.com/nf-core/eager
System
Nextflow Installation