nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

Problem with intervals file #1690

Open Eduardo-Auer opened 1 month ago

Eduardo-Auer commented 1 month ago

Description of the bug

I think the problem might be the format of the file I pass to the intervals parameter (I have tried both .list and .bed), and I do not know the correct format, even after reading the help on the sarek page. I need these intervals because my data are WES, and I created a bed file from the Illumina Exome Targeted Regions bed file. I will upload the files I used as intervals (exome_intervals.list, exome_intervals.bed) and the Illumina Exome Targeted Regions bed file (Illumina_Exome_TargetedRegions_v1.2.hg38.bed) via my Google Drive link. Could someone help me?
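For reference, a minimal sketch of how the two interval files could be sanity-checked before a run. It assumes the .bed file is a standard tab-separated BED (chrom, start, end; 0-based, half-open) and that the .list file is meant to hold GATK-style `chrom:start-end` lines (1-based, inclusive). The function names `bed_bad_lines` and `bed_to_list` are hypothetical helpers, not part of sarek.

```shell
# Sketch only: bed_bad_lines and bed_to_list are hypothetical helper names.

# Print every BED line that is not "chrom<TAB>start<TAB>end" with integer start < end.
# Header lines (#, track, browser) are skipped.
bed_bad_lines() {
    awk -F'\t' '
        /^(#|track|browser)/ { next }
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
            print FILENAME ":" FNR ": " $0
        }' "$1"
}

# Convert BED (0-based, half-open) to chrom:start-end (1-based, inclusive).
bed_to_list() {
    awk -F'\t' '
        /^(#|track|browser)/ { next }
        NF >= 3 { printf "%s:%d-%d\n", $1, $2 + 1, $3 }' "$1"
}

# Example calls (file names taken from the description above):
#   bed_bad_lines exome_intervals.bed
#   bed_to_list exome_intervals.bed > exome_intervals.list
```

An empty result from `bed_bad_lines` only shows that the columns are well formed; it does not check that the chromosome names match those in the .fai index.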

Command used

nextflow run nf-core/sarek -r 3.4.4 -profile conda --input /home/eduardo/gargantua/samplesheet.csv -work-dir /home/eduardo/gargantua/temp --step mapping --aligner bwa-mem2 --concatenate_vcfs false --joint_germline false --fasta /home/eduardo/sequencing_resources/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --fasta_fai /home/eduardo/sequencing_resources/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai --wes --intervals /home/eduardo/gargantua/exome_intervals.list --dbsnp /home/eduardo/sequencing_resources/GRCh38.dbSNP156_chr_included.vcf.gz --dbsnp_tbi /home/eduardo/sequencing_resources/GRCh38.dbSNP156_chr_included.vcf.gz.tbi --known_indels /home/eduardo/sequencing_resources/Homo_sapiens_assembly38.known_indels.vcf.gz,/home/eduardo/sequencing_resources/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz --known_indels_tbi /home/eduardo/sequencing_resources/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi,/home/eduardo/sequencing_resources/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi --skip_tools bcftools,multiqc,fastqc --genome null --igenomes_ignore --outdir home/eduardo/gargantua/results/

Terminal output

This example is for the .list file only; the same error happens with the .bed file.

N E X T F L O W   ~  version 24.04.4

Launching `https://github.com/nf-core/sarek` [exotic_solvay] DSL2 - revision: 5cc30494a6 [3.4.4]

WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`

------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
      ____
    .´ _  `.
   /  |\`-_ \      __        __   ___     
  |   | \  `-|    |__`  /\  |__) |__  |__/
   \ |   \  /     .__| /¯¯\ |  \ |___ |  \
    `|____\´

  nf-core/sarek v3.4.4-g5cc3049
------------------------------------------------------
Core Nextflow options
  revision             : 3.4.4
  runName              : exotic_solvay
  launchDir            : /home/eduardo/gargantua
  workDir              : /home/eduardo/gargantua/temp
  projectDir           : /home/eduardo/.nextflow/assets/nf-core/sarek
  userName             : eduardo
  profile              : conda
  configFiles          : 

Input/output options
  input                : /home/eduardo/gargantua/samplesheet.csv
  outdir               : home/eduardo/gargantua/results/

Main options
  wes                  : true
  intervals            : /home/eduardo/gargantua/exome_intervals.list
  skip_tools           : bcftools,multiqc,fastqc

Preprocessing
  aligner              : bwa-mem2

Reference genome options
  genome               : null
  dbsnp                : /home/eduardo/sequencing_resources/GRCh38.dbSNP156_chr_included.vcf.gz
  dbsnp_tbi            : /home/eduardo/sequencing_resources/GRCh38.dbSNP156_chr_included.vcf.gz.tbi
  fasta                : /home/eduardo/sequencing_resources/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
  fasta_fai            : /home/eduardo/sequencing_resources/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai
  known_indels         : /home/eduardo/sequencing_resources/Homo_sapiens_assembly38.known_indels.vcf.gz,/home/eduardo/sequencing_resources/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
  known_indels_tbi     : /home/eduardo/sequencing_resources/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi,/home/eduardo/sequencing_resources/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi
  igenomes_ignore      : true

Generic options
  validationLenientMode: true

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/sarek for your analysis please cite:

* The pipeline
  https://doi.org/10.12688/f1000research.16665.2
  https://doi.org/10.1093/nargab/lqae031
  https://doi.org/10.5281/zenodo.3476425

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/sarek/blob/master/CITATIONS.md
------------------------------------------------------
executor >  local (5)
[-        ] NFCORE_SAREK:PREPARE_GENOME:BWAMEM1_INDEX                                                                    -
[-        ] NFCORE_SAREK:PREPARE_GENOME:BWAMEM2_INDEX                                                                    -
[-        ] NFCORE_SAREK:PREPARE_GENOME:DRAGMAP_HASHTABLE                                                                -
[6c/b0ee11] NFCORE_SAREK:PREPARE_GENOME:GATK4_CREATESEQUENCEDICTIONARY (GCA_000001405.15_GRCh38_no_alt_analysis_set.fna) [100%] 1 of 1 ✔
[-        ] NFCORE_SAREK:PREPARE_GENOME:MSISENSORPRO_SCAN                                                                -
[-        ] NFCORE_SAREK:PREPARE_GENOME:SAMTOOLS_FAIDX                                                                   -
[-        ] NFCORE_SAREK:PREPARE_GENOME:TABIX_BCFTOOLS_ANNOTATIONS                                                       -
[-        ] NFCORE_SAREK:PREPARE_GENOME:TABIX_DBSNP                                                                      -
[-        ] NFCORE_SAREK:PREPARE_GENOME:TABIX_GERMLINE_RESOURCE                                                          -
[-        ] NFCORE_SAREK:PREPARE_GENOME:TABIX_KNOWN_SNPS                                                                 -
[-        ] NFCORE_SAREK:PREPARE_GENOME:TABIX_KNOWN_INDELS                                                               -
[-        ] NFCORE_SAREK:PREPARE_GENOME:TABIX_PON                                                                        -
[-        ] NFCORE_SAREK:PREPARE_INTERVALS:CREATE_INTERVALS_BED                                                          -
[-        ] NFCORE_SAREK:PREPARE_INTERVALS:TABIX_BGZIPTABIX_INTERVAL_SPLIT                                               -
[-        ] NFCORE_SAREK:PREPARE_INTERVALS:TABIX_BGZIPTABIX_INTERVAL_COMBINED                                            -
[-        ] NFCORE_SAREK:SAREK:SPRING_DECOMPRESS_TO_FQ_PAIR                                                              -
[-        ] NFCORE_SAREK:SAREK:SPRING_DECOMPRESS_TO_R1_FQ                                                                -
[-        ] NFCORE_SAREK:SAREK:SPRING_DECOMPRESS_TO_R2_FQ                                                                -
[-        ] NFCORE_SAREK:SAREK:CONVERT_FASTQ_INPUT:SAMTOOLS_VIEW_MAP_MAP                                                 -
[-        ] NFCORE_SAREK:SAREK:CONVERT_FASTQ_INPUT:SAMTOOLS_VIEW_UNMAP_UNMAP                                             -
[4f/3e1b79] NFCORE_SAREK:SAREK:FASTP (DAC021-L2)                                                                         [100%] 4 of 4 ✔
Plus 24 more processes waiting for tasks…
ERROR ~ Error executing process > 'NFCORE_SAREK:PREPARE_INTERVALS:CREATE_INTERVALS_BED'

Caused by:
  Failed to create Conda environment
    command: conda env create --prefix /home/eduardo/gargantua/temp/conda/gawk-f9b4f646bf1b67e52890543d27cb2f27 --file /home/eduardo/.nextflow/assets/nf-core/sarek/./subworkflows/local/prepare_intervals/../../../modules/local/create_intervals_bed/environment.yml
    status : 1
    message:
      Channels:
       - conda-forge
       - bioconda
       - defaults
       - anaconda
      Platform: linux-64
      Collecting package metadata (repodata.json): ...working... done
      Solving environment: ...working... failed
      Channels:
       - conda-forge
       - bioconda
       - defaults
       - anaconda
      Platform: linux-64
      Collecting package metadata (repodata.json): ...working... done
      Solving environment: ...working... failed

      LibMambaUnsatisfiableError: Encountered problems while solving:
        - package gawk-5.1.0-h7b6447c_0 is excluded by strict repo priority

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details
-[nf-core/sarek] Pipeline completed with errors-

Relevant files

Google drive files

System information

Ubuntu 24.04.1 LTS, mamba 1.5.8, conda 24.3.0, nextflow 24.04.4, sarek v3.4.4-g5cc3049

FriederikeHanssen commented 3 weeks ago

I don't think this is an issue with the bed file itself but rather an issue with conda. Do you have to run it with conda, or are you able to use a container engine?

Eduardo-Auer commented 3 weeks ago

Thank you for your response! Currently, I am comfortable with mamba/conda because I understand it better and have used it without any issues until now. Earlier, I tried installing Docker on my Ubuntu machine and had so many problems that I gave up on it. Is there any solution that works in a mamba/conda environment?

FriederikeHanssen commented 3 weeks ago

Not sure, there seems to be an issue with resolving the conda environment on your system. You could try creating it manually to better see what is going on:

Failed to create Conda environment
    command: conda env create --prefix /home/eduardo/gargantua/temp/conda/gawk-f9b4f646bf1b67e52890543d27cb2f27 --file /home/eduardo/.nextflow/assets/nf-core/sarek/./subworkflows/local/prepare_intervals/../../../modules/local/create_intervals_bed/environment.yml
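One way to do that manual check, as a rough sketch: print the channel configuration, then rerun the same `conda env create` command against a throwaway prefix so the full solver output is visible. The function name `diagnose_conda_env` and the scratch prefix are hypothetical, and the default path assumes `$HOME` is `/home/eduardo`, as in the log above.

```shell
# Sketch only: diagnose_conda_env is a hypothetical helper, not part of sarek or Nextflow.
diagnose_conda_env() {
    # Default: the environment.yml path reported in the log (assumes $HOME=/home/eduardo).
    env_file="${1:-$HOME/.nextflow/assets/nf-core/sarek/modules/local/create_intervals_bed/environment.yml}"

    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 127
    fi

    # Channel order and priority mode decide which builds the solver may pick.
    conda config --show channels
    conda config --show channel_priority

    # Same command Nextflow ran, but into a scratch prefix outside the work directory.
    scratch="$(mktemp -d)/gawk-test"
    conda env create --prefix "$scratch" --file "$env_file"
}

# Example call:
#   diagnose_conda_env
```

If `channel_priority` is reported as `strict`, the "excluded by strict repo priority" message suggests the requested gawk build sits in a lower-priority channel. Relaxing it with `conda config --set channel_priority flexible` may help, but that is untested for this setup.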