"ERROR ~ fromIndex = -1" during samtools sort step

andreagillespie commented 1 year ago

Description of the bug

I have tried running the pipeline with both bwa and bwamem. Either way, everything seems to run fine up until the samtools sort step, which appears to complete on the first sample but not on the second, regardless of which sample is second. Running samtools sort independently on the BAMs created by the pipeline works fine for all of my samples. The error message from the pipeline is: "ERROR ~ fromIndex = -1"

Command used and terminal output

NXF_VER=23.04.1 /config/binaries/nextflow/23.04.1/nextflow run nf-core/nascent -profile singularity \
    --input /scratch/teams/dawson_genomics/Projects/MYC/230427_PROseq/scripts/samplesheet.csv \
    --outdir /scratch/teams/dawson_genomics/Projects/MYC/230427_PROseq/nf-core_nascent \
    --genome GRCh38 \
    --aligner bwamem2 \
    --multiqc_title nf-core_nacent_multiqc \
    --assay_type PROseq \
    -resume

ERROR ~ fromIndex = -1

 -- Check script '/home/agillespie/.nextflow/assets/nf-core/nascent/./workflows/nascent.nf' at line: 222 or see '.nextflow.log' file for more details

Relevant files

nextflow.log samplesheet.csv

System information

Nextflow version: 23.04.1
Hardware: HPC
Executor: slurm
Container: singularity
OS: CentOS 7 Linux
nf-core/nascent: v2.1.1-g9ff33c7
samtools version: 1.17

edmundmiller commented 1 year ago

Thanks for reporting this! Could you drop your samplesheet in this issue as well?

andreagillespie commented 1 year ago

I've added the sample sheet as well. As I have 2 technical replicates for each sample, I also tried merging the fastqs and running the merged fastqs for each of the 12 samples, but I got the same error that way too. Thank you for your help with this. Cheers!

edmundmiller commented 1 year ago

Awesome, thanks for doing that!

I was going to get around to really digging into this, then I had an epiphany. I think the hyphens (-) in the sample names are breaking it. Try changing:

3dD-LATE-R1...
3dD-LATE-R1...

to

3dD_LATE_R1
# or
3dDLATE-R1

I can't remember how the groups split exactly but give that a shot and let me know if it at least runs! Then we can figure out where to go from there.
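For a longer samplesheet, the renaming could be scripted rather than done by hand. Below is a minimal sketch, assuming the sample names live in a column called `sample` (an assumption; check the header of your own samplesheet). The function name `sanitize_sample_names` and the example rows are hypothetical, and the sketch only touches the sample column, not the fastq file names.

```python
import csv
import io


def sanitize_sample_names(csv_text, column="sample"):
    """Return csv_text with '-' replaced by '_' in one column.

    `column` is assumed to hold the sample names; all other
    columns (e.g. fastq paths) are written back unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = row[column].replace("-", "_")
        writer.writerow(row)
    return out.getvalue()


# Hypothetical samplesheet content, for illustration only.
example = "sample,fastq_1,fastq_2\nctrl-LATE-R1,ctrl-LATE-R1.fastq.gz,\n"
print(sanitize_sample_names(example))
```

To apply it to a real file, read the file into a string, pass it through the function, and write the result to a new file so the original samplesheet is kept as a backup.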

edmundmiller commented 1 year ago

Just leaving a note for myself: this can probably be fixed with nf-validation, keeping the guard rails on and providing a better error message instead of trying to support every possible naming separator.
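To illustrate the "guard rails" idea, here is a rough sketch of an up-front check that rejects unsupported sample names with a readable message instead of failing later with "fromIndex = -1". This is not how nf-validation is implemented. The allowed pattern (letters, digits, underscores) and the function name `check_sample_names` are assumptions made for the example only.

```python
import re

# Hypothetical rule: letters, digits, and underscores only.
# The real pipeline may allow a different set of characters.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_sample_names(names):
    """Raise ValueError listing every name that breaks the rule."""
    bad = [name for name in names if not ALLOWED_NAME.fullmatch(name)]
    if bad:
        raise ValueError(
            "Unsupported characters in sample name(s): "
            + ", ".join(bad)
            + ". Use only letters, digits, and underscores."
        )
    return list(names)
```

Failing early like this points the user at the offending names directly, which is the kind of error message the note above is asking for.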

andreagillespie commented 1 year ago

Thank you kindly for your help on this and sorry for the late reply! I changed the dashes to underscores in the fastq and sample names, and that did seem to fix the issue with samtools sort. The pipeline ran for 1 day and 6.5 hours before it failed again, this time for exceeding the 8-hour running time limit during pints caller. I have been trying to troubleshoot that issue as time allows for the last few days. Since the default time limit is already 240h, which is greater than the 3 days I have allocated, I have not changed it, and I would not expect it to affect a single process timing out at 8h anyway. Instead I've increased memory and CPUs by adding the --max_memory 200.GB and --max_cpus 20 options to my command and job submission. I'm currently rerunning the job, but it has not resumed correctly and has started over, writing over the files again. So it will be at least a couple more days before I know if that has completed properly. I will be in touch again if I continue to have issues. Cheers!

edmundmiller commented 1 year ago

Sweet, I'm going to go ahead and close this one then, and follow up with #114 so no one else runs into this accidentally!

On the pints time out, yeah that's an issue with these tools 🙃

However, those flags aren't doing what you think they're doing. Check this documentation on how to customize the resource specs for a process: https://nf-co.re/nascent/2.1.1/usage#advanced-option-on-process-level

    process {
        withName: PINTS_CALLER {
            memory = 200.GB
            cpus = 20
            time = 240.h
        }
    }

That would max out the request for that process (it might not work, though, because there's overhead etc.). Feel free to ping me on Slack or open another issue if you find another bug!
