I am running the nf-core/hic v1.1.0 pipeline with Nextflow v20.01.0 and I have a few comments and questions:
1) If a single number is provided as bin_size (e.g. --bin_size '1000000'), I get this error:
Unknown method tokenize on Integer type
-- Check script '/home/sfoissac/.nextflow/assets/nf-core/hic/main.nf' at line: 226 or see '.nextflow.log' file for more details
This does not happen when two numbers are provided (e.g. --bin_size '1000000,500000').
2) There is a typo in main.nf: the parameter is spelled "max_restriction_framgnet_size" (and "fragmants" appears in its description). I get a warning:
WARN nextflow.script.ScriptBinding - Access to undefined parameter max_restriction_framgnet_size -- Initialise it to a default value eg. params.max_restriction_framgnet_size = some_value
3) I think the information about the Arima Hi-C kit on the documentation page is wrong:
https://github.com/nf-core/hic/blob/master/docs/usage.md
Instead of "ARIMA kit: ^GATC,^GANT" I believe it should be "ARIMA kit: ^GATC,G^ANTC"
and
"Exemple of the ARIMA kit: GATCGATC,GATCGANT,GANTGATC,GANTGANT" might be
"Exemple of the ARIMA kit: GATCGATC,GANTGATC,GANTANTC,GATCANTC".
4) I get a strange error when running a small dataset of ~2M reads:
Jun-10 12:28:32.563 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'combine_mapped_files (MB_HiC_liver_1_11_S2_all_R2_001 = MB_HiC_liver_1_11_S2_all_R2_001 + null)'
Caused by:
Process `combine_mapped_files (MB_HiC_liver_1_11_S2_all_R2_001 = MB_HiC_liver_1_11_S2_all_R2_001 + null)` terminated with an error exit status (1)
Command executed:
mergeSAM.py -f MB_HiC_liver_1_11_S2_all_R2_001_bwt2merged.bam -r null -o MB_HiC_liver_1_11_S2_all_R2_001_bwt2pairs.bam -t -q 10
Command exit status:
1
Command output:
(empty)
Command error:
[E::hts_open_format] Failed to open file null
Traceback (most recent call last):
  File "/home/sfoissac/.nextflow/assets/nf-core/hic/bin/mergeSAM.py", line 222, in <module>
    with pysam.Samfile(R1file, "rb") as hr1, pysam.Samfile(R2file, "rb") as hr2:
  File "pysam/libcalignmentfile.pyx", line 736, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 935, in pysam.libcalignmentfile.AlignmentFile._open
IOError: [Errno 2] could not open alignment file `null`: No such file or directory
I am guessing that something went wrong during one of the previous mapping steps?
The only results files I have are the mmapstat files:
For point 3, this is what I understood from the Arima datasheet (https://arimagenomics.com/public/pdf/ArimaGenomics_Genome-Assembly_Datasheet_01-2019.pdf) and from the HiCUP documentation (https://www.bioinformatics.babraham.ac.uk/projects/hicup/read_the_docs/html/index.html), which describes its --arima option as: "Set the --re1 option to that used by the Arima protocol: ^GATC,DpnII:G^ANTC,Arima".
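To show how I arrived at those four ligation sequences, here is a minimal Python sketch. It is not HiC-Pro or pipeline code, and `filled_ends` and `ligation_sites` are names I made up for illustration. It assumes each enzyme cuts its recognition site symmetrically on both strands, leaving a 5' overhang that is filled in before blunt-end ligation, so every junction is the filled-in left end of one site followed by the filled-in right end of another.

```python
from itertools import product


def filled_ends(site):
    """Return (left_end, right_end) of a cut site after 5' overhang fill-in.

    `site` is a recognition sequence with '^' marking the top-strand cut,
    e.g. '^GATC' or 'G^ANTC'. Assumption: the bottom strand is cut at the
    mirrored position, which holds for symmetric sites like these two.
    """
    cut = site.index("^")
    seq = site.replace("^", "")
    bottom_cut = len(seq) - cut
    if cut > bottom_cut:
        raise ValueError("3' overhangs are not handled by this sketch")
    # After fill-in the left fragment keeps everything up to the bottom-strand
    # cut, and the right fragment keeps everything from the top-strand cut.
    return seq[:bottom_cut], seq[cut:]


def ligation_sites(sites):
    """All junctions formed by ligating any left end to any right end."""
    ends = [filled_ends(s) for s in sites]
    return [left + right for (left, _), (_, right) in product(ends, repeat=2)]


print(ligation_sites(["^GATC", "G^ANTC"]))
# ['GATCGATC', 'GATCANTC', 'GANTGATC', 'GANTANTC']
```

With "^GATC,^GANT" as currently documented, the same reasoning would give the four sequences listed on the usage page, which is why I think the restriction site itself is the part that needs correcting.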
Any idea about what I missed? I am attaching the log file.
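On point 1, my guess (I have not checked main.nf beyond the line number in the message) is that a lone number is parsed as an Integer instead of a String, so the comma-splitting step has nothing to call. A Python analogue of the kind of coercion that would avoid it is below; `parse_bin_sizes` is a hypothetical helper, not something from the pipeline.

```python
def parse_bin_sizes(value):
    """Accept an int or a comma-separated string and return a list of ints."""
    # Coerce to str first so a bare number behaves like a one-item list.
    tokens = str(value).split(",")
    return [int(tok) for tok in tokens if tok.strip()]


print(parse_bin_sizes(1000000))           # [1000000]
print(parse_bin_sizes("1000000,500000"))  # [1000000, 500000]
```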