nf-core / hic

Analysis of Chromosome Conformation Capture data (Hi-C)
https://nf-co.re/hic
MIT License

reported bugs #62

Open nservant opened 4 years ago

nservant commented 4 years ago

I am running the nf-core/hic v1.1.0 pipeline from Nextflow v20.01.0 and I have a few comments and questions:

1) If a single number is provided as bin_size (e.g. --bin_size '1000000'), an error is raised:

Unknown method tokenize on Integer type -- Check script '/home/sfoissac/.nextflow/assets/nf-core/hic/main.nf' at line: 226 or see '.nextflow.log' file for more details

This does not happen when two numbers are provided (e.g. --bin_size '1000000,500000').
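The error message suggests that a single number reaches the script as an Integer rather than a String, so a string-only method such as tokenize is not available on it. The pipeline itself is written in Nextflow/Groovy; the sketch below only illustrates the normalisation idea in Python, and the function name `parse_bin_sizes` is hypothetical, not part of nf-core/hic.

```python
def parse_bin_sizes(value):
    """Return a list of ints from either an int or a comma-separated string.

    Hypothetical helper: converting to str first means a bare number
    (e.g. 1000000) and a list (e.g. '1000000,500000') take the same path.
    """
    return [int(token) for token in str(value).split(",") if token.strip()]


# A single number and a comma-separated list are both accepted.
single = parse_bin_sizes(1000000)
several = parse_bin_sizes("1000000,500000")
```

In Groovy, the analogous step would presumably be to convert the parameter to a string before splitting it; this is an assumption about a possible fix, not a description of the one applied in the pipeline.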

2) There is a typo in main.nf in the parameter name "max_restriction_framgnet_size" (with "fragmants" in the description). I get a warning:

WARN nextflow.script.ScriptBinding - Access to undefined parameter max_restriction_framgnet_size -- Initialise it to a default value eg. params.max_restriction_framgnet_size = some_value

3) I think that the information about the Arima Hi-C kit is wrong on the documentation page https://github.com/nf-core/hic/blob/master/docs/usage.md. Instead of "ARIMA kit: ^GATC,^GANT", I believe it should be "ARIMA kit: ^GATC,G^ANTC", and "Exemple of the ARIMA kit: GATCGATC,GATCGANT,GANTGATC,GANTGANT" might be "Exemple of the ARIMA kit: GATCGATC,GANTGATC,GANTANTC,GATCANTC".

This is what I understood from https://arimagenomics.com/public/pdf/ArimaGenomics_Genome-Assembly_Datasheet_01-2019.pdf and https://www.bioinformatics.babraham.ac.uk/projects/hicup/read_the_docs/html/index.html. The latter describes the --arima option as:

--arima | Set the --re1 option to that used by the Arima protocol: ^GATC,DpnII:G^ANTC,Arima
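The relationship between the restriction sites and the proposed ligation sites can be checked with a short sketch. It assumes that every enzyme leaves a 5' overhang that is filled in before blunt-end ligation, and that `^` marks the cut position on the top strand of a palindromic site. The function names are hypothetical and this is not the pipeline's own code.

```python
from itertools import product


def filled_ends(site):
    """Return the (left, right) fragment ends after 5' overhang fill-in.

    Assumes a palindromic site with '^' marking the top-strand cut,
    so the bottom strand is cut at the mirrored position.
    """
    cut = site.index("^")
    seq = site.replace("^", "")
    left = seq[: len(seq) - cut]   # end of the upstream fragment
    right = seq[cut:]              # start of the downstream fragment
    return left, right


def ligation_sites(sites):
    """Combine every left end with every right end (all enzyme pairs)."""
    ends = [filled_ends(s) for s in sites]
    return [left + right for (left, _), (_, right) in product(ends, ends)]


proposed = ligation_sites(["^GATC", "G^ANTC"])
# -> ['GATCGATC', 'GATCANTC', 'GANTGATC', 'GANTANTC']
```

Under these assumptions, the sites ^GATC,G^ANTC give the same four junctions as the corrected example in point 3 (in a different order), while the sites currently documented, ^GATC,^GANT, give the four junctions of the existing example.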

4) I get a strange error while running a small dataset of ~2M reads:

Jun-10 12:28:32.563 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'combine_mapped_files (MB_HiC_liver_1_11_S2_all_R2_001 = MB_HiC_liver_1_11_S2_all_R2_001 + null)'
Caused by:
  Process `combine_mapped_files (MB_HiC_liver_1_11_S2_all_R2_001 = MB_HiC_liver_1_11_S2_all_R2_001 + null)` terminated with an error exit status (1)
Command executed:
  mergeSAM.py -f MB_HiC_liver_1_11_S2_all_R2_001_bwt2merged.bam -r null -o MB_HiC_liver_1_11_S2_all_R2_001_bwt2pairs.bam -t -q 10
Command exit status:
  1
Command output:
  (empty)
Command error:
  [E::hts_open_format] Failed to open file null
  Traceback (most recent call last):
    File "/home/sfoissac/.nextflow/assets/nf-core/hic/bin/mergeSAM.py", line 222, in <module>
      with  pysam.Samfile(R1file, "rb") as hr1,  pysam.Samfile(R2file, "rb") as hr2:
    File "pysam/libcalignmentfile.pyx", line 736, in pysam.libcalignmentfile.AlignmentFile.__cinit__
    File "pysam/libcalignmentfile.pyx", line 935, in pysam.libcalignmentfile.AlignmentFile._open
  IOError: [Errno 2] could not open alignment file `null`: No such file or directory

I am guessing that something went wrong during one of the previous mapping steps?
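The traceback shows that mergeSAM.py was called with `-r null`, so the path of the second mate file was never filled in and pysam failed when trying to open a file literally named `null`. A minimal sketch of a pre-flight check that would surface this earlier is given below; `check_mate_files` is a hypothetical helper, not a function of mergeSAM.py, and it uses only the standard library.

```python
import os


def check_mate_files(forward, reverse):
    """Fail with a descriptive message before any BAM file is opened.

    Hypothetical helper: an unset value interpolated into a command line
    often appears as the literal string 'null', so that case is reported
    separately from a path that is set but does not exist.
    """
    mates = (("forward", forward), ("reverse", reverse))
    # First pass: detect placeholders that indicate a missing input.
    for label, path in mates:
        if path is None or str(path) in ("", "null"):
            raise ValueError(f"{label} mate file is unset (got {path!r})")
    # Second pass: the paths look real, so check that they exist.
    for label, path in mates:
        if not os.path.isfile(path):
            raise FileNotFoundError(f"{label} mate file not found: {path}")
```

Such a check does not explain why the second mate is missing; it only turns the pysam IOError into a message that points at the unset input.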

The only results files I have are the mmapstat files:

cat hic_results/stats/MB_HiC_liver_1_11_S2_all_R1_001/mstats/MB_HiC_liver_1_11_S2_all_R1_001/MB_HiC_liver_1_11_S2_all_R1_001.mmapstat
total_R2 2294915
mapped_R2 2081255
global_R2 1967565
local_R2 113690
cat hic_results/stats/MB_HiC_liver_1_11_S2_all_R2_001/mstats/MB_HiC_liver_1_11_S2_all_R2_001/MB_HiC_liver_1_11_S2_all_R2_001.mmapstat
total_R2 2294915
mapped_R2 2050220
global_R2 1938636
local_R2 111584
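The two mmapstat files above can be sanity-checked with a short sketch. It assumes each line is a `key value` pair and that the global and local counts are expected to add up to the mapped count, which holds for the figures shown. Both files use `_R2` keys, including the one under the R1 directory, so the sketch strips the suffix. The function names are hypothetical.

```python
def parse_mmapstat(text):
    """Parse 'key value' lines into a dict, dropping the mate suffix."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        stats[key.rsplit("_", 1)[0]] = int(value)  # 'total_R2' -> 'total'
    return stats


def check_mmapstat(stats):
    """Return (counts are consistent, fraction of reads mapped)."""
    consistent = stats["global"] + stats["local"] == stats["mapped"]
    return consistent, stats["mapped"] / stats["total"]


# Figures copied from the two files listed above.
first = parse_mmapstat("total_R2 2294915\nmapped_R2 2081255\nglobal_R2 1967565\nlocal_R2 113690")
second = parse_mmapstat("total_R2 2294915\nmapped_R2 2050220\nglobal_R2 1938636\nlocal_R2 111584")
```

Calling `check_mmapstat` on each dict shows whether the counts are internally consistent and what fraction of reads mapped, which helps to tell a mapping failure apart from a problem in pairing the two mates.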

Any idea about what I missed? I am attaching the log file.

nservant commented 4 years ago

Point 2 is fixed in PR#11

nservant commented 4 years ago

nf-core 1.2.0 has just been released, which fixes points 2 and 3.