Closed haraldgrove closed 1 year ago
Hi, thanks for your interest in nextNEOpi. Unfortunately, I cannot comment on the GATK error you got. I have not seen it before, and I couldn't find any useful information online either.
What happens if you retry manually by doing:
cd /gnome/harald/2022/neoantigens/analysis_results/nextneopi_WGS_hg38_nextflow/41/58b884e0a616d7e02cc9dd1a5d0d27
bash .command.run
If it completes, the -resume option should work. But sometimes, for whatever reason, nextflow doesn't resume at the supposedly last successfully finished process.
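The manual retry above can be wrapped in a small helper that also reports how the task ended. This is only a sketch: the function name rerun_task is made up for illustration, and it assumes the task wrapper writes the task's stderr to .command.err in the work directory.

```shell
#!/usr/bin/env bash
# Re-run one task from its nextflow work directory and report the exit status.
# rerun_task is a hypothetical helper name, not part of nextflow or nextNEOpi.
rerun_task() {
    local workdir=$1 status=0
    if [ ! -f "$workdir/.command.run" ]; then
        echo "no .command.run in $workdir" >&2
        return 2
    fi
    # Run in a subshell so the caller's working directory is left untouched.
    ( cd "$workdir" && bash .command.run ) || status=$?
    echo "task exit status: $status"
    # Assumption: stderr of the task is captured in .command.err; show its tail if non-empty.
    if [ -s "$workdir/.command.err" ]; then
        tail -n 20 "$workdir/.command.err"
    fi
    return "$status"
}

# Usage (path taken from the error message of the failed task):
#   rerun_task /path/to/work/41/58b884e0a616d7e02cc9dd1a5d0d27
```

An exit status of 0 here means the task itself can succeed with the staged inputs, which points away from the inputs and towards something transient in the original run.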
Hi
When I manually ran bash .command.run in the indicated folder, it finished without any issue. At this point it doesn't feel like a GATK issue, but rather some weird interaction between my data and the nextflow scripts. (As a side note, I have successfully run a WES data set through this part of the pipeline, so I think the install should be ok.)
Unfortunately, when I tried to resume the workflow, it started with the "ScatteredIntervalListToBed, make_uBAM, Bwa" processes, seemingly not recognizing that the previous run had progressed beyond them. I'll have to wait and see whether it stops at the CNNScore part again.
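When -resume restarts earlier than expected, one way to narrow it down is to compare the task cache hashes of two runs. Nextflow can write them to its log with the -dump-hashes run option. The sketch below assumes each run was logged to its own file (for example with nextflow -log run1.log run ... -dump-hashes) and that the hash lines look like "[task name] cache hash: <hex>"; the function name changed_hashes is only illustrative.

```shell
#!/usr/bin/env bash
# Print the names of tasks whose cache hash differs between two nextflow logs.
# changed_hashes is a hypothetical helper; the log line format is an assumption.
changed_hashes() {
    awk '
        # Pick "[task name] cache hash: <hex>" out of each matching log line.
        match($0, /\[[^][]*\] cache hash: [0-9a-f]+/) {
            entry = substr($0, RSTART, RLENGTH)
            name = entry
            sub(/\] cache hash: .*/, "", name)
            sub(/^\[/, "", name)
            hash = entry
            sub(/.*cache hash: /, "", hash)
            if (FNR == NR) {
                first[name] = hash      # first log: remember the hash per task
            } else if (name in first && first[name] != hash) {
                print name              # second log: report tasks whose hash changed
            }
        }
    ' "$1" "$2"
}

# Usage:
#   changed_hashes run1.log run2.log
```

A task listed here was not taken from the cache in the second run because one of its hashed inputs changed; the entries printed under the corresponding hash line in the log show which one.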
Hmmm... this gets difficult to debug/reproduce here.
Some time ago we hit an issue at the interval list creation reported by another user who was also analyzing WGS data.
We made a hotfix patch that will be included in the next release. I'm not sure whether it will also help with the resume, but it wouldn't hurt:
nextNEOpi_hotfix_20221215.patch.gz
The manual bash .command.run does exactly what nextflow would do when it runs the process, so I still think it might be something (hopefully transient) with GATK. Maybe you still have the .nextflow.log.[x] from that failed run, so I can check whether I can spot something more there. Also, a gz archive of that directory (if it fails again) would help in finding the root of the issue.
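A gz archive of a task directory can be made along these lines. This is a minimal sketch: bundle_task_dir is a made-up helper name, and it assumes the default nextflow behaviour of staging inputs as symlinks, which tar stores as links rather than copying the data, so the archive stays small but does not contain the BAM files themselves.

```shell
#!/usr/bin/env bash
# Pack one nextflow task work directory into a gzip-compressed tar archive.
# bundle_task_dir is a hypothetical helper name used only for this example.
bundle_task_dir() {
    local workdir=$1 out=$2
    if [ ! -d "$workdir" ]; then
        echo "not a directory: $workdir" >&2
        return 2
    fi
    # -C switches to the parent directory so the archive holds only the task
    # directory itself (including hidden files such as .command.run and .command.err).
    tar -czf "$out" -C "$(dirname "$workdir")" "$(basename "$workdir")"
}

# Usage:
#   bundle_task_dir /path/to/work/41/58b884e0a616d7e02cc9dd1a5d0d27 "$PWD/failed_task.tar.gz"
```

The matching .nextflow.log.[x] file lives in the launch directory rather than in the task directory, so it has to be attached separately.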
Thanks
I managed to get the pipeline to finish by setting the scatter_count to 1. However, that didn't help with the resume functionality; it still starts from before the BAM creation, seemingly at the SplitIntervals (SplitIntervals) step. I managed to delete the previous log files, but if I see the error again, I can provide the log file in case you think you can find anything.
3f34b81da8c155e9b0f905a95aaca9c06710bb4b should resolve the resume behavior.
Feel free to open a new ticket in case v1.4.0 of nextNEOpi still fails.
I just tried to run the workflow on a WGS dataset and got an error message from two of the CNNScore tasks (out of 40 total). I tried to verify the problem by running the failed task directly, both with the singularity image from nextNEOpi and with a local docker image of GATK, but both of them finished without any errors.
Any idea what might be happening here? Error message:
Also, when I tried to rerun the whole job (with -resume and without making any changes to any of the inputs), the process started at the beginning of the DNA alignment step. Since the log indicated that the Mutect2 step was finished, I assumed it would be able to use the BAM files from the previous run?
-Harald