Open xiekunwhy opened 1 year ago
Wangsen, Could you answer this question?
From: xiekunwhy
Date: 2023-11-05 23:33
To: fanagislab/EndHiC
Subject: [fanagislab/EndHiC] how to use asm_error_check.pl results to continue the pipeline? (Issue #9)

Hi,
How to use asm_error_check.pl to continue the pipeline? We need to break fasta sequences and re-start from HiC-Pro read mapping?
Best, Kun
Hi Kun,
It should be noted that contig breaking based on Hi-C data alone is not very reliable, so we suggest determining the exact positions of contig assembly errors from multiple lines of evidence, such as the Hifiasm graph structure and the Hi-C heatmap within the contig.
If you already know the positions at which to break the contigs, there is no need to rerun the whole HiC-Pro mapping pipeline. For example, if you think the predictions of asm_error_check.pl are correct, extract the contigs and their break positions, then use the scripts split_len.pl, split_bed.pl, and split_fasta.pl to break the corresponding contigs.len, hic.bed, and contigs.fa files. The newly generated files (contigs.splited.len and hic.splited.bed), together with the original hic.matrix, can then be used as input to endhic.pl:
perl split_len.pl ctg_split.pos contigs_all.len > contigs_all.ctg_splited.len
perl split_bed.pl ctg_split.pos hic_10000_abs.bed > hic_10000_abs_splited.bed
perl split_fasta.pl ctg_split.pos contigs.fasta > contigs_splited.fasta
The three scripts are attached below as a ZIP file.
splt_ctg.zip
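As a quick sanity check after running the three commands above, one could compare the total sequence length before and after splitting. This is only a sketch under assumptions, not part of EndHiC: it assumes the .len files are whitespace-separated with the contig length in the second column, and that the split scripts cut contigs without discarding any bases. The function names sum_len, sum_fasta, and check_split are hypothetical helpers introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking split results; not part of EndHiC.

# Sum column 2 of a whitespace-separated .len file
# (ASSUMPTION: column 2 holds the contig length).
sum_len() {
    awk '{ total += $2 } END { printf "%.0f\n", total }' "$1"
}

# Count sequence characters in a FASTA file, skipping header lines.
sum_fasta() {
    awk '!/^>/ { total += length($0) } END { printf "%.0f\n", total }' "$1"
}

# Compare the total length of two .len files.
# ASSUMPTION: breaking contigs should leave the total unchanged.
check_split() {
    local before after
    before=$(sum_len "$1")
    after=$(sum_len "$2")
    if [ "$before" = "$after" ]; then
        echo "OK: total length unchanged ($before bp)"
    else
        echo "MISMATCH: $before bp before, $after bp after" >&2
        return 1
    fi
}
```

With the file names used above, this might be called as `check_split contigs_all.len contigs_all.ctg_splited.len`, and `sum_fasta contigs.fasta` could be compared by eye with `sum_fasta contigs_splited.fasta`. A mismatch would suggest that the break positions or the file formats differ from what is assumed here.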
Thanks for your reply and suggestions. I will try it.
Best, Kun
Hi,
How do I use the asm_error_check.pl results to continue the pipeline? Do we need to break the FASTA sequences and restart from HiC-Pro read mapping?
Best, Kun