nf-core / hic

Analysis of Chromosome Conformation Capture data (Hi-C)
https://nf-co.re/hic
MIT License

bowtie2_end_to_end failed on 1.3.0 but worked on 1.2.2 #116

Closed Nico-FR closed 1 year ago

Nico-FR commented 2 years ago

Hello, I am trying to run nf-core/hic on cow data. It worked with version 1.2.2 using this command line:

nextflow run nf-core/hic \
-r 1.2.2 \
-profile genotoul \
-name v1.2.2 \
--input '/work/nmary/Bovin/Nf-core/977*_R{1,2}.fastq.gz' \
--fasta '/bank/bowtie2db/ensembl_bos_taurus_genome' \
--bwt2_index '/bank/bowtie2db/ensembl_bos_taurus_genome' \
--restriction_site '^GATC,G^ANTC' \
--ligation_site 'GATCGATC,GANTGATC,GANTANTC,GATCANTC' \
--bin_size '200000,10000' \
--min_insert_size 20 \
--max_insert_size 1000 \
--split_fastq --fastq_chunks_size '10000000'

But it fails with version 1.3.0 using the same command (plus a few downstream analysis options):

nextflow run nf-core/hic \
-r 1.3.0 \
-profile genotoul \
-name test11 \
--input '/work/nmary/Bovin/Nf-core/977*_R{1,2}.fastq.gz' \
--fasta '/bank/bowtie2db/ensembl_bos_taurus_genome' \
--bwt2_index '/bank/bowtie2db/ensembl_bos_taurus_genome' \
--restriction_site '^GATC,G^ANTC' \
--ligation_site 'GATCGATC,GANTGATC,GANTANTC,GATCANTC' \
--bin_size '200000,10000' \
--min_insert_size 20 \
--max_insert_size 1000 \
--split_fastq --fastq_chunks_size '10000000' \
--res_dist_decay '200000,50000,10000,5000' \
--res_compartments '200000,50000,10000' \
--tads_caller 'hicexplorer' \
--res_tads '200000,100000,50000,25000,10000,5000'

[43/2a6685] process > bowtie2_end_to_end (977_TTA... [100%] 1 of 1, failed: 1

Execution cancelled -- Finishing pending tasks before exit
- Ignore this warning: params.schema_ignore_params = "igenomesIgnore" 
- Ignore this warning: params.schema_ignore_params = "igenomesIgnore" 
WARN: Found unexpected parameters:
* --igenomesIgnore: true
WARN: Got an interrupted exception while taking agent result | java.lang.InterruptedException
WARN: Found unexpected parameters:
* --igenomesIgnore: true
Error executing process > 'bowtie2_end_to_end (977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1)'

Caused by:
  Process `bowtie2_end_to_end (977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1)` terminated with an error exit status (255)

Command executed:

  INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'`
    bowtie2 --rg-id BMG --rg SM:977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1 \
  --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder \
  -p 4 \
  -x ${INDEX} \
  --un 977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1_unmap.fastq \
        -U 977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1.fastq.gz | samtools view -F 4 -bS - > 977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1.bam

Command exit status:
  255

Command output:
  (empty)

Command error:
  (ERR): "--passthrough" does not exist or is not a Bowtie 2 index
  Exiting now ...

So I tried building the index in the current directory with bowtie2-build ./Bos_taurus.ARS-UCD1.2.dna_sm.toplevel.fa ./Bos_taurus.ARS-UCD1.2.dna_sm.toplevel.fa, but got the same error:

[e1/a532aa] process > bowtie2_end_to_end (977_TTA... [100%] 1 of 1, failed: 1

Execution cancelled -- Finishing pending tasks before exit
- Ignore this warning: params.schema_ignore_params = "igenomesIgnore" 
- Ignore this warning: params.schema_ignore_params = "igenomesIgnore" 
WARN: Found unexpected parameters:
* --igenomesIgnore: true
WARN: Got an interrupted exception while taking agent result | java.lang.InterruptedException
WARN: Found unexpected parameters:
* --igenomesIgnore: true
Error executing process > 'bowtie2_end_to_end (977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1)'

Caused by:
  Process `bowtie2_end_to_end (977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1)` terminated with an error exit status (255)

Command executed:

  INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'`
    bowtie2 --rg-id BMG --rg SM:977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1 \
  --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder \
  -p 4 \
  -x ${INDEX} \
  --un 977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1_unmap.fastq \
        -U 977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1.fastq.gz | samtools view -F 4 -bS - > 977_TTATAACC-TCGATATC-BHV5JHDSXY_L003_R1.1.bam

Command exit status:
  255

Command output:
  (empty)

Command error:
  (ERR): "--passthrough" does not exist or is not a Bowtie 2 index
  Exiting now ...

nservant commented 2 years ago

Hi, thanks for the issue. The main difference between v1.2 and v1.3 is that, since v1.3, we look for the index prefix inside the index folder using:

INDEX=`find -L ./ -name "*.rev.1.bt2" | sed 's/.rev.1.bt2//'`

So the --bwt2_index parameter now expects a directory and no longer a prefix, whereas in v1.2 the pipeline used the path + prefix of the index files. Best
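
To make the lookup above concrete, here is a minimal sketch of the same find-based prefix resolution with an added guard for the empty case. When nothing matches, INDEX is empty and `-x ${INDEX}` expands to a bare `-x`, which presumably explains why bowtie2 reported an option name ("--passthrough") as the missing index. The function name `find_bt2_prefix` and the demo files are hypothetical, and the guard is not claimed to be part of nf-core/hic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve a Bowtie 2 index prefix from a directory,
# using the same *.rev.1.bt2 lookup as the 1.3.0 command, but failing
# early with a clear message when no index file is found.
find_bt2_prefix() {
  local dir="$1"
  local hit
  # Locate the reverse index file; keep the first match only.
  hit=$(find -L "$dir" -name "*.rev.1.bt2" | head -n 1)
  if [ -z "$hit" ]; then
    echo "No Bowtie 2 index (*.rev.1.bt2) found under: $dir" >&2
    return 1
  fi
  # Strip the suffix to obtain the prefix expected by 'bowtie2 -x'.
  printf '%s\n' "${hit%.rev.1.bt2}"
}

# Demo with empty placeholder files standing in for a real index.
demo_dir=$(mktemp -d)
touch "$demo_dir/genome.1.bt2" "$demo_dir/genome.rev.1.bt2"
find_bt2_prefix "$demo_dir"    # prints the prefix ending in /genome

# A directory without index files triggers the guard instead of an empty prefix.
empty_dir=$(mktemp -d)
find_bt2_prefix "$empty_dir" || echo "lookup failed as expected"
```

Note that the pattern only matches small indexes; large Bowtie 2 indexes use the .bt2l extension and would not be found by this lookup.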

Nico-FR commented 2 years ago

Hi, indeed, it worked with the directory name. Merry Christmas,
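
One caveat with passing a directory: if it holds several indexes (for example a shared bank directory), the find lookup quoted above would return more than one prefix. A possible workaround, sketched here under assumptions, is to stage a dedicated directory containing symlinks to a single index and pass that directory to --bwt2_index. The helper `stage_index_dir` and the demo file names are hypothetical and not part of nf-core/hic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a directory that contains symlinks to the
# files of exactly one Bowtie 2 index, identified by its prefix.
# Pass an absolute prefix so that the symlinks resolve from anywhere.
stage_index_dir() {
  local prefix="$1" dest="$2" f found=0
  mkdir -p "$dest"
  for f in "$prefix".*.bt2; do
    [ -e "$f" ] || continue    # the glob matched nothing
    ln -sf "$f" "$dest/"
    found=1
  done
  [ "$found" -eq 1 ]           # non-zero status if no index file was linked
}

# Demo: a source directory with two indexes, of which only 'genome' is staged.
src=$(mktemp -d)
touch "$src/genome.1.bt2" "$src/genome.rev.1.bt2" "$src/other.rev.1.bt2"
dest="$(mktemp -d)/bt2_index"
stage_index_dir "$src/genome" "$dest"
ls "$dest"                     # lists only the genome.* links
```

With a real index, the staged directory would then be given to the pipeline in place of the prefix.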

Nico-FR commented 2 years ago

Hi, I finally managed to run the pipeline to completion. There seems to be another issue. I launched the pipeline with 6 TAD resolutions:

--res_compartments '800000,400000,200000,100000,50000' \
--tads_caller 'insulation,hicexplorer' \
--res_tads '200000,100000,50000,25000,10000,5000'

And the pipeline launched 6 jobs for each TAD caller:

[be/c8092b] process > tads_hicexplorer (Bovin-365... [100%] 6 of 6 ✔
[22/7ac14e] process > tads_insulation (Bovin-3654... [100%] 6 of 6 ✔

However, I have only one set of output files:

ls ./tads/hicexplorer/
tad_boundaries.bed  tad_boundaries.gff  tad_domains.bed  tad_score.bedgraph

The same happens for insulation. It seems that the outputs are overwritten by each job because they all share the same prefix. Here is the command run by the pipeline:

hicFindTADs --matrix Bovin-3654_CCGCGGTT-CTAGCGCT-AHT2HCDSX2_L004_5000_norm.cool \
--outPrefix tad \
--correctForMultipleTesting fdr \
--numberOfProcessors 4

For clarity, I would recommend keeping the full prefix of the input file in the output names (I was wondering whether it used normalized matrices or not):

hicFindTADs --matrix Bovin-3654_CCGCGGTT-CTAGCGCT-AHT2HCDSX2_L004_5000_norm.cool \
--outPrefix Bovin-3654_CCGCGGTT-CTAGCGCT-AHT2HCDSX2_L004_5000_norm \
--correctForMultipleTesting fdr \
--numberOfProcessors 4
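
The suggestion above amounts to deriving the output prefix from the matrix file name, so that each resolution writes to its own files. A minimal sketch, assuming the matrices follow the naming shown in this thread: the helper `tad_prefix` is hypothetical, the 10000 file name is extrapolated from the 5000 one, and the commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a per-matrix output prefix by dropping the
# directory part and the .cool extension from the matrix path.
tad_prefix() {
  basename "$1" .cool
}

for res in 5000 10000; do
  # Assumed naming pattern, extrapolated from the 5000 bp matrix in this thread.
  matrix="Bovin-3654_CCGCGGTT-CTAGCGCT-AHT2HCDSX2_L004_${res}_norm.cool"
  # Print the command rather than running hicFindTADs.
  echo hicFindTADs --matrix "$matrix" \
    --outPrefix "$(tad_prefix "$matrix")" \
    --correctForMultipleTesting fdr \
    --numberOfProcessors 4
done
```

Because the resolution and the normalization status are part of the matrix name, they carry over into every output file name and the jobs no longer overwrite each other.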

Thank you so much for this great pipeline, it helped me a lot!

nservant commented 2 years ago

Hi Nicolas, thanks for testing! Could you please open a new issue for the last point? I'll check it and fix it for the next version. Thanks

Nico-FR commented 2 years ago

OK, see https://github.com/nf-core/hic/issues/117