Open paulstretenowich opened 4 years ago
Hi Paul,
apologies for the late reply.
About mergeBam not always working, I suspect it is related to the open issue #48. I am going to fix it soon.
About markDup, it would be useful to get the process logs and Nextflow logs for the pipeline run. Another test you could do would be to manually launch the .command.run script from within the process folder and see if that works. Also, if you run the pipeline again, does the problem still arise? It looks like weird behavior...
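As a rough sketch of the manual test suggested above, the wrapper script can be launched from the process folder under a time limit so that a hang shows up as a distinct result instead of blocking the terminal. The function name `rerun_task`, its arguments, and the `manual.out` / `manual.err` file names are made up for illustration; `timeout` is assumed to be the GNU coreutils one, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# Sketch only: re-run a Nextflow task script by hand with a time limit.
# rerun_task, manual.out and manual.err are hypothetical names.
rerun_task() {
    local task_dir="$1"                 # process folder of the task
    local limit="${2:-600}"             # seconds to wait before giving up
    local script="${3:-.command.run}"   # or .command.sh, to compare the two
    (
        cd "$task_dir" || exit 2
        # GNU timeout exits with 124 if the command is still running at the limit
        timeout "$limit" bash "$script" > manual.out 2> manual.err
    )
    local status=$?
    case "$status" in
        0)   echo "ok" ;;
        124) echo "timeout" ;;
        *)   echo "failed($status)" ;;
    esac
}
```

Calling it once with the default script and once with `.command.sh` as the third argument, on the same process folder, shows whether only the wrapper hangs.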
Best, Emilio
Hi Emilio,
Yes, the mergeBam issue is related to #48, which is why I haven't given you much more information about it.
About markDup, if I run .command.run outside of the pipeline I have the same issue. If I run .command.sh it works well. If I re-run the pipeline I also have the same issue; however, sometimes it works without changing anything.
Here are the logs from a run: nextflow.log command.err.txt command.log command.run.txt command.sh.txt
Thanks, Paul
Hi Paul,
I could not find anything useful in the logs.
Did you run the .command.run and .command.sh scripts locally, or were they submitted via slurm? I am wondering whether the problem is running within a submitted job vs locally, or whether the .command.run script has some incompatibilities with your system.
Best, Emilio
Hi Emilio,
When I use the pipeline I run it with slurm but when I tried manually it was without slurm. In both cases it was with the singularity image.
Running .command.sh either inside or outside the container worked, but when I run .command.run I get the timeout/0% CPU usage issue, with or without slurm.
Thanks, Paul
Hi Paul,
could you please try running the pipeline with the included small test dataset and the markdup profile? E.g.:
nextflow run grape-nf -profile markdup -with-singularity
Does the problem occur also in this case?
Best, Emilio
Hi Emilio,
Testing with the markdup profile on the test dataset worked without issue.
Thanks, Paul
Hi Paul,
thanks, that does not help much.
I just realized you are not using the latest version of Nextflow. Any chance you can make a test using that version?
nextflow -self-update
N E X T F L O W
version 19.10.0 build 5170
created 21-10-2019 15:07 UTC (17:07 CEST)
cite doi:10.1038/nbt.3820
http://nextflow.io
If the problem persists, I would then suggest running the hanging job via .command.run and inspecting the process tree to see what's going on. You could use top or ps to check. In case you need help, please just send me the output of the ps -faux command.
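One way to narrow down the process listing while the job hangs is to keep only the processes of interest that show no CPU usage. The sketch below assumes a procps-style `ps` and a hypothetical helper named `idle_procs`; it reads `ps -eo pid,ppid,stat,pcpu,args` output on standard input and prints the PID and state of matching idle processes. Note that the `pcpu` column reported by `ps` is averaged over the lifetime of the process, so `top` or `htop` remains the better tool for instantaneous usage.

```shell
#!/usr/bin/env bash
# Sketch only: idle_procs is a hypothetical helper, not part of Nextflow.
# Input: output of `ps -eo pid,ppid,stat,pcpu,args` (header line included).
# Output: "PID STAT" for each process whose command name matches the
# pattern and whose %CPU column is zero.
idle_procs() {
    local pattern="$1"
    # Fields: $1=PID $2=PPID $3=STAT $4=%CPU $5=first word of the command
    awk -v pat="$pattern" 'NR > 1 && $4 + 0 == 0 && $5 ~ pat { print $1, $3 }'
}
```

For example, `ps -eo pid,ppid,stat,pcpu,args | idle_procs sambamba` lists sambamba processes at 0% CPU together with their state; a state starting with `D` (uninterruptible sleep) usually means the process is waiting on I/O.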
Another test would be to run the pipeline after adding trace.enabled = false to your local nextflow.config file and see whether the problem comes from that. I am not sure that's the case, as the test dataset runs without issues.
Best, Emilio
Hi Emilio,
Updating Nextflow seems to solve the issue only if I run it locally; when I'm using slurm I still have the same issue. EDIT: The update solved the issue for 2 samples, but for the other 2 the issue remains even locally.
You can find what's going on at the markdup step on the htop screenshot attached if that helps and here is the corresponding ps-faux.txt output.
Changing the value of trace.enabled to false doesn't change anything...
Thanks, Paul
Hi Paul,
it's a weird issue and it's hard to tell what's the cause. Maybe removing some of the complexity would help. Could you try running it without Singularity (e.g. with environment modules or conda)?
The best would be to find a minimal dataset for which we can reproduce the issue.
Best, Emilio
Hi Emilio,
Just to update you, I'm installing all the tools required for the pipeline to run and I will test without using singularity as you suggested. I will tell you if that changes anything with the issue.
Thanks, Paul
Hi Paul,
any news regarding this issue?
Best, Emilio
Hi Emilio,
I moved to another cluster and that specific issue is not happening. It might be related to the infrastructure of the first cluster I tried. I'm waiting for an update of the file system and I will test again hoping it'll work then. I'll keep you posted on that.
On the other cluster where I'm running the pipeline, the only remaining issue is the mergeBam one, which you are fixing.
Thanks, Paul
Hi Paul,
thanks for the update.
I'm closing this for now. Please feel free to reopen it again after the file system update if needed.
Best, Emilio
Hi,
I'm using the pipeline as part of IHEC, with the following versions:
The pipeline itself is running well except for the mergeBam step (not always working). The markdup step takes a very long time to finish (I tried allowing up to 3 days) and ends with TIMEOUT. I noticed that the sambamba command is started but "stuck", using 0% CPU (monitored with htop). I then tried the sambamba command defined inside .command.sh outside the container (it worked) and inside the container (it worked too). I don't know what's happening there. If you need me to add some logs, please tell me.
Thanks, Paul