Closed: LQHHHHH closed this 4 years ago
What is the coverage of your Hi-C reads? I suspect the problem is due to low Hi-C sequencing coverage or to mapping issues. Please also check your mapping BAM: does it look normal or disrupted?
Hi, @tangerzhang
The coverage of my Hi-C data is ~200x, so I checked the sample.clean.bam file generated by the "Map Hi-C reads to draft assembly" step. After filtering, 99% of my Hi-C reads mapped to the 17 longest contigs and almost no reads mapped to the other, shorter contigs; my chromosome number is 17. I then checked the raw aligned BAM generated by bwa aln && bwa sampe and found that, there, reads do map to the other short contigs.
ALLHiC produced only 6 groups, containing 12 contigs in total; however, these contigs are short and have very few Hi-C reads mapped to them. It's very strange.
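For anyone wanting to repeat the per-contig check described above, here is a minimal sketch. It assumes the BAM has been summarized with `samtools idxstats` (tab-separated columns: reference name, length, mapped reads, unmapped reads) and that the output is passed in as text. The function name `read_share_of_longest` is hypothetical and is not part of ALLHiC.

```python
def read_share_of_longest(idxstats_text, n_longest):
    """Fraction of mapped reads that fall on the n longest references.

    idxstats_text: raw text output of `samtools idxstats some.bam`
    n_longest:     how many of the longest contigs to consider (e.g. 17)
    """
    rows = []
    for line in idxstats_text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # The "*" row holds reads with no reference; skip it.
            continue
        rows.append((name, int(length), int(mapped)))

    total_mapped = sum(mapped for _, _, mapped in rows)
    if total_mapped == 0:
        return 0.0

    # Sort contigs by length, longest first, and sum their mapped reads.
    rows.sort(key=lambda row: row[1], reverse=True)
    top_mapped = sum(mapped for _, _, mapped in rows[:n_longest])
    return top_mapped / total_mapped
```

Running this on both the raw BAM and the filtered BAM, with `n_longest=17`, would show how much the filtering step shifts reads away from the short contigs.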
OK, I guess that the 17 longest contigs are actually chromosome-level assemblies. Perhaps you do not need Hi-C reads for scaffolding.
Dr. Zhang, Thank you for your reply.
But why can't ALLHiC work in this case? I ran SALSA2; it corrected some misjoins in my contigs and gave me FINAL scaffolds, but it produced more than 19 scaffolds.
You can also use ALLHiC_corrector to correct the chimeric contigs and then use ALLHiC to build the chromosomal level assembly. What I meant before is that the 17 longest contigs likely represent 17 pseudo-chromosomes. If they account for a large proportion of genome sequences (e.g. >90%), you will not need to perform scaffolding.
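The ">90%" check mentioned above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and it assumes contig lengths are read from a FASTA index produced by `samtools faidx` (the first two columns of a `.fai` file are sequence name and length).

```python
def lengths_from_fai(fai_path):
    """Read contig lengths from a samtools faidx index (.fai) file."""
    lengths = []
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            lengths.append(int(fields[1]))  # column 2 is the sequence length
    return lengths


def top_n_length_fraction(contig_lengths, n):
    """Fraction of the total assembly length held by the n longest contigs."""
    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("assembly has zero total length")
    return sum(ordered[:n]) / total
```

For example, `top_n_length_fraction(lengths_from_fai("draft.asm.fasta.fai"), 17)` (the file name is a placeholder) returning a value above 0.9 would suggest that the 17 longest contigs already cover most of the genome.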
Thank you! One last question: should I remove these short contigs (length < 500 kb) before performing the mapping step?
There is no need to remove the short contigs as these contigs have very limited restriction sites (cutoff: 25) and thus will not be included in the Hi-C scaffolding.
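To get a rough idea of which contigs would fall under the restriction-site cutoff mentioned above, one could count motif occurrences per contig. The sketch below is not ALLHiC's own code: the function names are hypothetical, the default motif `GATC` (MboI/DpnII) is an assumption that should be replaced by the enzyme actually used, and whether ALLHiC treats the cutoff of 25 as strict or inclusive is also assumed here.

```python
def count_sites(sequence, motif):
    """Count occurrences of a restriction motif in a sequence (case-insensitive)."""
    sequence = sequence.upper()
    motif = motif.upper()
    count = 0
    start = 0
    while True:
        index = sequence.find(motif, start)
        if index == -1:
            break
        count += 1
        start = index + 1  # move one base on so every occurrence is seen
    return count


def contigs_below_cutoff(contigs, motif="GATC", cutoff=25):
    """Names of contigs with fewer than `cutoff` restriction sites.

    contigs: dict mapping contig name -> sequence string
    Assumption: a contig is excluded when its site count is strictly below the cutoff.
    """
    return [name for name, seq in contigs.items()
            if count_sites(seq, motif) < cutoff]
```

Contigs returned by `contigs_below_cutoff` are the ones that would likely be left out of scaffolding under these assumptions, so removing them by hand beforehand should not be necessary.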
Thank you!
Hi, @tangerzhang
I have finished my contig-level assembly using hifiasm with HiFi reads. Because Purge-dups is built into hifiasm, I used the purged contigs directly. Following your tutorial (https://github.com/tangerzhang/ALLHiC/wiki), I skipped the Prune step, and after the Partition step I found that ALLHiC gave only 6 groups, even though -k was set to 17 or higher. Do my contigs contain too many misjoins, or did I do one of the steps wrong?