HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data
BSD 3-Clause "New" or "Revised" License
Error about "haphic pipeline p_utg HiC.filtered.bam nchrs --gfa p_utg.gfa". #59
Closed
yangyyhh closed 2 months ago
Dear Zeng,
I need to assemble a homozygous tetraploid and perform genotyping. I plan to use the "work with hifiasm" approach:
haphic pipeline p_utg HiC.filtered.bam nchrs --gfa p_utg.gfa
My commands:
cat YZ4.hic.hap1.p_ctg.gfa YZ4.hic.hap2.p_ctg.gfa YZ4.hic.hap3.p_ctg.gfa YZ4.hic.hap4.p_ctg.gfa > allhaps.gfa
python mydata/16_YZ/01_rawdata/fasta.py  # Custom script to convert GFA files to FA files.
bwa index allhaps.fa
bwa mem -5SP -t 28 allhaps.fa ../YZ4_HiC-clean_1.fq.gz ../YZ4_HiC-clean_2.fq.gz | samblaster | samtools view - -@ 14 -S -h -b -F 3340 -o allhaps_HiC.bam
/01_software/HapHiC-main/utils/filter_bam allhaps_HiC.bam 1 --nm 3 --threads 30 | samtools view - -b -@ 30 -o allhaps_HiC.filtered.bam
/01_software/HapHiC-main/haphic pipeline YZ4.hic.p_utg.fa allhaps_HiC.filtered.bam 44 --gfa YZ4.hic.p_utg.gfa --RE "GATC" --remove_allelic_links 4 --threads 30 --processes 5
The following error occurred:
/01_software/HapHiC-main/haphic pipeline YZ4.hic.p_utg.fa allhaps_HiC.filtered.bam 44 --gfa YZ4.hic.p_utg.gfa --RE "GATC" --remove_allelic_links 4 --threads 30 --processes 5
2024-08-31 13:48:49 [main] Pipeline started, HapHiC version: 1.0.5 (update: 2024.08.22)
2024-08-31 13:48:49 [main] Python version: 3.9.7 (default, Mar 8 2023, 17:00:06) [GCC 7.5.0]
2024-08-31 13:48:49 [main] Command: /01_software/HapHiC-main/scripts/HapHiC_pipeline.py YZ4.hic.p_utg.fa allhaps_HiC.filtered.bam 44 --gfa YZ4.hic.p_utg.gfa --RE GATC --remove_allelic_links 4 --threads 30 --processes 5
2024-08-31 13:48:49 [haphic_cluster] Step1: Execute preprocessing and Markov clustering for contigs...
2024-08-31 13:48:49 [run] Program started, HapHiC version: 1.0.5 (update: 2024.08.22)
2024-08-31 13:48:49 [run] Python version: 3.9.7 (default, Mar 8 2023, 17:00:06) [GCC 7.5.0]
2024-08-31 13:48:49 [run] Command: /01_software/HapHiC-main/scripts/HapHiC_pipeline.py YZ4.hic.p_utg.fa allhaps_HiC.filtered.bam 44 --gfa YZ4.hic.p_utg.gfa --RE GATC --remove_allelic_links 4 --threads 30 --processes 5
2024-08-31 13:48:49 [run] Module sparse_dot_mkl or Intel MKL is not correctly installed, HapHiC will be executed in dense matrix mode
2024-08-31 13:48:49 [detect_format] The file for Hi-C read alignments is detected as being in BAM format
2024-08-31 13:48:49 [parse_fasta] Parsing input FASTA file...
2024-08-31 13:49:08 [parse_gfa] Parsing input gfa file(s)...
2024-08-31 13:49:11 [stat_fragments] Making some statistics of fragments (contigs / bins)
2024-08-31 13:49:11 [stat_fragments] bin_size is calculated to be 1188831 bp
2024-08-31 13:49:16 [parse_alignments] Parsing input alignments...
2024-08-31 13:52:42 [output_pickle] Writing HT_link_dict to HT_links.pkl...
2024-08-31 13:52:42 [output_clm] Writing clm_dict to paired_links.clm...
2024-08-31 13:52:42 [filter_fragments] Filtering fragments...
2024-08-31 13:52:42 [filter_fragments] [Nx filtering] 1144 fragments kept
2024-08-31 13:52:42 [filter_fragments] [RE sites filtering] 0 fragments removed, 1144 fragments kept
2024-08-31 13:52:42 [filter_fragments] [link density filtering] Parameter --density_lower 0.2X is set to "multiple" mode and equivalent to 0.0 in "fraction" mode
2024-08-31 13:52:42 [filter_fragments] [link density filtering] Parameter --density_upper 1.9X is set to "multiple" mode and equivalent to 1.0 in "fraction" mode
2024-08-31 13:52:42 [filter_fragments] [link density filtering] 0 fragments removed, 1144 fragments kept
2024-08-31 13:52:42 [filter_fragments] [read depth filtering] Q1=17.0, median=17.0, Q3=18.0, IQR=Q3-Q1=1.0
2024-08-31 13:52:42 [filter_fragments] [read depth filtering] Parameter --read_depth_upper 1.5X is set to "multiple" mode and equivalent to 0.9458041958041958 in "fraction" mode
2024-08-31 13:52:42 [filter_fragments] [read depth filtering] 57 fragments removed, 1082 fragments kept
2024-08-31 13:52:43 [filter_fragments] [rank sum filtering] Q1=120.0, median=120.0, Q3=120.0, IQR=Q3-Q1=0.0
2024-08-31 13:52:43 [filter_fragments] [rank sum filtering] Parameter --rank_sum_upper 1.5X is set to "multiple" mode and equivalent to 1.0 in "fraction" mode
2024-08-31 13:52:43 [filter_fragments] [rank sum filtering] 0 fragments removed, 1082 fragments kept
2024-08-31 13:52:43 [remove_allelic_HiC_links] Removing Hi-C links between alleic contig pairs...
2024-08-31 13:52:45 [output_pickle] Writing full_link_dict to full_links.pkl...
Traceback (most recent call last):
File "/01_software/HapHiC-main/scripts/HapHiC_pipeline.py", line 532, in <module>
main()
File "/01_software/HapHiC-main/scripts/HapHiC_pipeline.py", line 513, in main
haphic_cluster(args)
File "/01_software/HapHiC-main/scripts/HapHiC_pipeline.py", line 355, in haphic_cluster
HapHiC_cluster.run(args, log_file=LOG_FILE)
File "/01_software/HapHiC-main/scripts/HapHiC_cluster.py", line 2887, in run
flank_link_matrix, frag_index_dict = dict_to_matrix(
File "/01_software/HapHiC-main/scripts/HapHiC_cluster.py", line 289, in dict_to_matrix
shape = len(frag_set)
TypeError: object of type 'NoneType' has no len()
Traceback (most recent call last):
File "/01_software/HapHiC-main/haphic", line 117, in <module>
subprocess.run(commands, check=True)
File "/01_software/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/01_software/HapHiC-main/scripts/HapHiC_pipeline.py', 'YZ4.hic.p_utg.fa', 'allhaps_HiC.filtered.bam', '44', '--gfa', 'YZ4.hic.p_utg.gfa', '--RE', 'GATC', '--remove_allelic_links', '4', '--threads', '30', '--processes', '5']' returned non-zero exit status 1.
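One detail in the commands above may be worth ruling out: the Hi-C reads were aligned to allhaps.fa (built from the four p_ctg GFA files), while the pipeline was given YZ4.hic.p_utg.fa and YZ4.hic.p_utg.gfa. Whether this is related to the TypeError is not established here. The sketch below only compares sequence names between a FASTA file and a BAM header, so that a mismatch can be confirmed or excluded. The function names and the toy sequence names (utgA, utgB, ctgA) are hypothetical and are not part of HapHiC.

```python
def fasta_names(fasta_text):
    """Return the set of sequence names (first word after '>') in FASTA text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def sam_header_names(header_text):
    """Return the set of SN: values found on @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def compare_names(fasta_text, header_text):
    """Report which names are shared and which appear in only one input."""
    fa = fasta_names(fasta_text)
    bam = sam_header_names(header_text)
    return {
        "only_in_fasta": sorted(fa - bam),
        "only_in_bam": sorted(bam - fa),
        "shared": sorted(fa & bam),
    }


# Toy inputs with hypothetical names, standing in for the real files.
toy_fasta = ">utgA some description\nACGT\n>utgB\nGGCC\n"
toy_header = "@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:ctgA\tLN:4\n@SQ\tSN:utgB\tLN:4\n"
report = compare_names(toy_fasta, toy_header)
print(report)
```

For the real data, the header text can be obtained with `samtools view -H allhaps_HiC.filtered.bam` and the FASTA text read from YZ4.hic.p_utg.fa. For a large assembly it would be more practical to stream the FASTA line by line than to load it as a single string.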
How should this error be resolved? I look forward to your reply.
Best wishes.
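For reference, the custom fasta.py script used in the commands above is not shown. The sketch below is an assumption about what such a GFA-to-FASTA conversion typically does, not the script itself: it takes each segment (S) line of a GFA file, uses the second column as the sequence name and the third as the sequence, and skips segments whose sequence is given as `*`. The function name and the line width are hypothetical.

```python
def gfa_to_fasta(gfa_text, width=80):
    """Convert the S (segment) lines of GFA text into FASTA text.

    Segments without a stored sequence ('*') are skipped.
    """
    out = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # Only segment lines carry sequences: S <name> <sequence> [tags...]
        if len(fields) < 3 or fields[0] != "S":
            continue
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue
        out.append(">" + name)
        # Wrap the sequence at a fixed width for readability.
        for start in range(0, len(seq), width):
            out.append(seq[start:start + width])
    return "\n".join(out) + ("\n" if out else "")


# Toy GFA with hypothetical segment names.
toy_gfa = (
    "H\tVN:Z:1.0\n"
    "S\tctg1\tACGTACGT\tLN:i:8\n"
    "L\tctg1\t+\tctg2\t+\t0M\n"
    "S\tctg2\t*\tLN:i:5\n"
)
print(gfa_to_fasta(toy_gfa, width=4), end="")
```

With a conversion of this kind, the FASTA record names are exactly the GFA segment names, so a FASTA file derived from the p_ctg GFA files will carry p_ctg names rather than p_utg names.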