mouliere-lab / ITSFASTR

A bioinformatic pipeline for ultra-fast analysis of cfDNA using Oxford Nanopore Technologies sequencing.
MIT License

mapping.smk error missing file #5

Open flacchy opened 1 month ago

flacchy commented 1 month ago

Hello @moldovannorbert, really sorry to disturb you again. I am running the demo and managed to run trimmed.smk, but I am encountering various errors when running mapping.smk.

Specifically I am running the following code:

cd /scratch/xxx/yyy/zzz/ITSFASTR/workflow/rules/1_preprocessing

snakemake \
         --printshellcmds --keep-going \
         --executor cluster-generic \
         --cluster-generic-submit-cmd 'sbatch --time=60 --nodes=1 --partition=cpu --cpus-per-task=32 --mem=64000 --output=/scratch/xxx/yyy/zzz/DEMO_ITSFASTR/slurm-%j.out' --jobs 500 --max-jobs-per-second 3 --max-status-checks-per-second 5 -s mapping.smk

NOTE: the --cluster option is deprecated in Snakemake, so I installed the cluster-generic executor plugin and changed the parameters so that it would run.

The error mentioned latency, so I changed the parameter from --max-status-checks-per-second 5 to --max-status-checks-per-second 120.

Still an error:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided remote nodes: 500
Conda environments: ignored
Job stats:
job                count
---------------  -------
NanoPlot              10
all_mapping            1
flagstat              10
map                   10
mark_duplicates       10
multiqc_mapping        1
total                 42

Select jobs to execute...
Execute 10 jobs...

[Tue Jul 30 10:59:18 2024]
rule map:
    input: /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/Test_demo/trimmed/1_trimming/ONT_dummy_sub-100000_34.fastq.gz.fastq.gz
    output: /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.bam
    log: /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/logs/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.log
    jobid: 19
    benchmark: /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/benchmark/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.tsv
    reason: Missing output files: /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.bam
    wildcards: sample=ONT_dummy_sub-100000_34.fastq.gz
    resources: tmpdir=<TBD>

            minimap2 -ax map-ont /scratch/xxx/zzz/Human_genome_reference/                      /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/Test_demo/trimmed/1_trimming/ONT_dummy_sub-100000_34.fastq.gz.fastq.gz                      -t 1                      2> /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/logs/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.log |
                     samtools sort                      -@ 1                      -o /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.bam 2>> /scratch/xxx/yyy/zzz/DEMO_ITSFASTR/logs/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.log

Submitted job 19 with external jobid 'Submitted batch job 19355729'.

more errors....

In my config file, the path to the reference genome is:

# Path to the reference genome fasta file.
RefGenome: /scratch/xxx/yyy/Human_genome_reference/GRCh38.primary_assembly.genome.fa.gz

Any suggestions on why this is happening and how to fix it?

moldovannorbert commented 1 month ago

Dear @flacchy,

No worries, I am here to help. :)

I can't see the error message in your pasted section. But what I see is that in the shell part of the rule only the reference folder is provided for minimap2. This is curious, as you state that the full path to the reference genome is set in the config file. I also noticed that in the shell part the path says /scratch/xxx/zzz/ while in the config /scratch/xxx/yyy/. Is your problem that the zzz and yyy parts are not matching?

The path to the reference genome is taken directly from the config.yaml RefGenome key by the highlighted line in the map rule: RefPath = config["RefPath"]

rule map:
    input:
        MapIn + "{sample}.fastq.gz"
    output:
        tmp_dir + metaPath + "/1_mapping/{sample}.bam"
    params:
        ref = RefPath
    threads: config["ThreadNr"]
    conda: "../../envs/map_ONT_env.yaml"
    benchmark:
        config["OutPath"] + "/benchmark/" + ProjDirName + "/1_mapping/{sample}.tsv"
    log:
        config["OutPath"] + "/logs/" + ProjDirName + "/1_mapping/{sample}.log"
    shell:
        """
        minimap2 -ax map-ont {params.ref} \
        {input} \
        -t {threads} \
        2> {log} |
        samtools sort \
        -@ {threads} \
        -o {output} 2>> {log}
        """

My first suggestion would be to check whether you are using the right config file, in case you created more than one.
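
Since the printed shell command shows only a folder being passed to minimap2, a quick check of the reference path may help. The following is a minimal sketch, not part of ITSFASTR: the helper name describe_reference and the list of FASTA suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Suffixes commonly used for (optionally gzipped) FASTA files.
# This list is an assumption for the sketch, not something ITSFASTR defines.
FASTA_SUFFIXES = (".fa", ".fasta", ".fna", ".fa.gz", ".fasta.gz", ".fna.gz")


def describe_reference(path_str):
    """Return a short diagnosis of the path that would be handed to minimap2."""
    path = Path(path_str)
    if not path.exists():
        return "missing"
    if path.is_dir():
        # A directory here would match the symptom seen in the printed shell command.
        return "directory"
    if not path.name.endswith(FASTA_SUFFIXES):
        return "unexpected suffix"
    return "ok"
```

Calling describe_reference with the value of the RefGenome key from the config in use should return "ok"; a result of "directory" would suggest that a different config, or a different key, is being picked up.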

flacchy commented 1 month ago

Thanks @moldovannorbert. The difference between /scratch/xxx/zzz/ in the shell command and /scratch/xxx/yyy/ in the config is just me changing the paths, as this is a public space. I thought I should change them, but I can see it is confusing. This is my config file:

configcopy.txt (please note: the file is actually config.yaml; I had to save it as .txt to upload it here)

moldovannorbert commented 1 month ago

Dear @flacchy,

The config file looks right. Can you also share the ONT_dummy_sub-100000_34.fastq.gz.log if that contains anything?

flacchy commented 1 month ago

[M::mm_idx_gen::48.889*1.60] collected minimizers
[M::mm_idx_gen::53.803*2.13] sorted minimizers
[M::main::53.803*2.13] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::57.073*2.07] mid_occ = 706; max_occ = 10865
[M::mm_idx_stat] kmer size: 15; skip: 10; is_HPC: 0; #seq: 194
[M::mm_idx_stat::57.844*2.05] distinct minimizers: 100159079 (38.75% are singletons); average occurrences: 5.545; average spacing: 5.581
[M::worker_pipeline::63.129*2.40] mapped 100000 sequences
[M::main] Version: 2.1.1-r341
[M::main] CMD: minimap2 -ax map-ont -t 8 /scratch/prj/hab/Human_genome_reference/GRCh38.primary_assembly.genome.fa.gz /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_trimming/ONT_dummy_sub-100000_34.fastq.gz.fastq.gz
[M::main] Real time: 63.373 sec; CPU: 151.512 sec
[bam_sort_core] merging from 0 files and 8 in-memory blocks...

moldovannorbert commented 1 month ago

Dear @flacchy, I don't see any errors here either. Can you please specify what error you encounter?

flacchy commented 1 month ago

Running mapping.smk doesn't finish the run. It crashes with multiple errors about missing files. As mentioned above, I don't know whether this is because the new version of Snakemake works differently. Not sure if this helps, but here is some more information from the full logs for one of the failing runs:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided remote nodes: 500
Conda environments: ignored
Job stats:
job                count
---------------  -------
NanoPlot              10
all_mapping            1
flagstat              10
map                    7
mark_duplicates       10
multiqc_mapping        1
total                 39

Select jobs to execute...
Execute 16 jobs...

[Tue Jul 30 10:22:58 2024]
rule map:
    input: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_trimming/ONT_dummy_sub-100000_41.fastq.gz.fastq.gz
    output: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.bam
    log: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.log
    jobid: 15
    benchmark: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/benchmark/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.tsv
    reason: Missing output files: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.bam
    wildcards: sample=ONT_dummy_sub-100000_41.fastq.gz
    resources: tmpdir=<TBD>

            minimap2 -ax map-ont /scratch/prj/hab/Human_genome_reference                      /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_trimming/ONT_dummy_sub-100000_41.fastq.gz.fastq.gz                      -t 1                      2> /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.log |
                     samtools sort                      -@ 1                      -o /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.bam 2>> /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_41.fastq.gz.log

[Tue Jul 30 10:22:58 2024]

........ more logs and errors .... showing only end of file now ....

       (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job mark_duplicates since they might be corrupted:
/scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_34.fastq.gz.bam
Removing output files of failed job mark_duplicates since they might be corrupted:
/scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_33.fastq.gz.bam
Removing output files of failed job mark_duplicates since they might be corrupted:
/scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo/trimmed/1_mapping/ONT_dummy_sub-100000_38.fastq.gz.bam
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-07-30T102258.138724.snakemake.log
WorkflowError:
At least one job did not complete successfully.

It seems that Snakemake is submitting all jobs at once, rather than starting the next step (for example samtools or NanoPlot) only after minimap2 has finished. I am trying to downgrade Snakemake to see whether the older version, where the deprecated --cluster option still works, makes a difference.

moldovannorbert commented 1 month ago

One thing that might cause an error (not sure if this is the cause here) is that you are using file names with the .fastq.gz extension as sample names. Can you check this first? In the sample sheet, please use only the sample name (e.g. ONT_dummy_sub-100000_34). If this doesn't solve the issue, can you share the .snakemake/log/2024-07-30T102258.138724.snakemake.log if it's not empty?
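
To illustrate the sample name suggestion, here is a minimal sketch that strips FASTQ extensions from sample sheet entries. The function name clean_sample_name and the suffix list are assumptions for illustration; ITSFASTR itself does not provide this helper.

```python
# FASTQ suffixes to strip from sample sheet entries (an assumed, non-exhaustive list).
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def clean_sample_name(name):
    """Strip trailing FASTQ extensions, repeatedly, so doubled suffixes are removed too."""
    name = name.strip()
    stripped = True
    while stripped:
        stripped = False
        for suffix in FASTQ_SUFFIXES:
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                stripped = True
                break
    return name
```

With this, an entry such as ONT_dummy_sub-100000_34.fastq.gz becomes ONT_dummy_sub-100000_34, so the rule's "{sample}.fastq.gz" input pattern no longer resolves to a doubled .fastq.gz.fastq.gz file name.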

flacchy commented 1 month ago

Thanks again for your help, @moldovannorbert.

OK, I am re-running mapping.smk. Just to double-check, here is the sample sheet: [image]

Trimming output: [image]

Command to run mapping.smk:

snakemake    --printshellcmds --keep-going          --executor cluster-generic          --cluster-generic-submit-cmd 'sbatch --time=60 --nodes=1 --partition=cpu --cpus-per-task=32 --mem=64000 --output=/scratch/prj/hab/Flavia/DEMO_ITSFASTR/slurm-%j.out' --jobs 500 --max-jobs-per-second 3 --max-status-checks-per-second 120 -s mapping.smk

Error (in red) while running: [image]

The text:

Error in rule NanoPlot:
    message: For further error details see the cluster/cloud log and the log files of the involved rule(s).
    jobid: 8
    input: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo_versionAugust/trimmed/1_mapping/ONT_dummy_sub-100000_37.bam
    output: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo_versionAugust/trimmed/1_mapping_quality/ONT_dummy_sub-100000_37
    log: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo_versionAugust/trimmed/1_mapping_quality/ONT_dummy_sub-100000_37.log (check log file(s) for error details)
    conda-env: /scratch/prj/hab/Flavia/ITSFASTR/workflow/rules/1_preprocessing/.snakemake/conda/80b2dd96334bda752de9987b745c4ed7_
    shell:

            NanoPlot --bam /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo_versionAugust/trimmed/1_mapping/ONT_dummy_sub-100000_37.bam                      -o /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo_versionAugust/trimmed/1_mapping_quality/ONT_dummy_sub-100000_37                      --raw                      --alength                      -t 1                      --huge                      2>> /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo_versionAugust/trimmed/1_mapping_quality/ONT_dummy_sub-100000_37.log

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    external_jobid: Submitted batch job 19763414

....more text here ...
[Tue Aug  6 10:49:34 2024]
Finished job 4.
17 of 42 steps (40%) done
[Tue Aug  6 10:49:55 2024]
Finished job 12.
18 of 42 steps (43%) done
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-08-06T104523.456938.snakemake.log
WorkflowError:
At least one job did not complete successfully.

Full log error: attached here 2024-08-06T104523.456938.snakemake.log

Thanks again for your help.

moldovannorbert commented 1 month ago

Ok, this is getting us closer. You are getting errors in the mark_duplicates and flagstat rules. This suggests that there is something wrong with the mapping.

  1. Are there any non-zero-byte BAMs created from the runs?
  2. Can you send this log if it contains information: /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo_versionAugust/trimmed/2_mark_duplicates/ONT_dummy_sub-100000_37.log and the log for the mapping for the same file?
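
The first check can be sketched as follows. This is only an illustration: the helper names bam_sizes and empty_bams are hypothetical, and the directory to pass in is whatever 1_mapping output folder the run wrote to.

```python
from pathlib import Path


def bam_sizes(mapping_dir):
    """Map each *.bam file name directly inside mapping_dir to its size in bytes."""
    return {p.name: p.stat().st_size for p in sorted(Path(mapping_dir).glob("*.bam"))}


def empty_bams(mapping_dir):
    """Return the names of zero-byte BAM files, which point to a failed mapping step."""
    return [name for name, size in bam_sizes(mapping_dir).items() if size == 0]
```

An empty dictionary from bam_sizes means no BAM was written at all, while names returned by empty_bams are files that were created but never filled.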

flacchy commented 1 month ago

The log /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo_versionAugust/trimmed/2_mark_duplicates/ONT_dummy_sub-100000_37.log is empty.

The mapping log at /scratch/prj/hab/Flavia/DEMO_ITSFASTR/logs/Test_demo_versionAugust/trimmed/1_mapping/ONT_dummy_sub-100000_37.log has only these lines:

[M::mm_idx_gen::77.353*0.99] collected minimizers
[M::mm_idx_gen::112.680*0.99] sorted minimizers
[M::main::112.680*0.99] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::115.404*0.99] mid_occ = 706; max_occ = 10865
[M::mm_idx_stat] kmer size: 15; skip: 10; is_HPC: 0; #seq: 194
[M::mm_idx_stat::116.161*0.99] distinct minimizers: 100159079 (38.75% are singletons); average occurrences: 5.545; average spacing: 5.581
[M::worker_pipeline::145.938*0.99] mapped 100000 sequences
[M::main] Version: 2.1.1-r341
[M::main] CMD: minimap2 -ax map-ont -t 1 /scratch/prj/hab/Human_genome_reference/GRCh38.primary_assembly.genome.fa.gz /scratch/prj/hab/Flavia/DEMO_ITSFASTR/Test_demo_versionAugust/trimmed/1_trimming/ONT_dummy_sub-100000_37.fastq.gz
[M::main] Real time: 146.186 sec; CPU: 144.820 sec

It seems that only these were created: [image]

In the benchmarking folder I can see the following for mapping: [image]

And for the dummy sample 37, /scratch/prj/hab/Flavia/DEMO_ITSFASTR/benchmark/Test_demo_versionAugust/trimmed/1_mapping/ONT_dummy_sub-100000_37.tsv: [image]

moldovannorbert commented 2 weeks ago

Dear @flacchy,

Sorry for the late response. I looked through the logs you sent, but unfortunately I could not figure out why some jobs fail. I was also unable to reproduce it on our cluster. Can you please send me what's in one of the failed slurm-%j.out logs?
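
To find which slurm-%j.out files are worth sharing, something like the sketch below may help. It is only an assumption-laden illustration: the function name find_suspicious_lines and the list of error markers are made up for this example, and the log directory is the one given to sbatch via --output.

```python
from pathlib import Path

# Substrings treated as signs of failure. An assumed list for this sketch, not exhaustive.
ERROR_MARKERS = ("error", "killed", "cancelled", "traceback", "out of memory")


def find_suspicious_lines(log_dir, pattern="slurm-*.out"):
    """Return {log file name: [matching lines]} for logs containing any error marker."""
    hits = {}
    for log_path in sorted(Path(log_dir).glob(pattern)):
        text = log_path.read_text(errors="replace")
        matches = [
            line
            for line in text.splitlines()
            if any(marker in line.lower() for marker in ERROR_MARKERS)
        ]
        if matches:
            hits[log_path.name] = matches
    return hits
```

Logs that do not appear in the result contain none of the markers, so the files listed in the returned dictionary are the ones most likely to show why a job failed.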