Open MboiTui opened 5 months ago
Hello @MboiTui,
Yes, it is clear from the HiC map they should not be joined.
Have you zoomed into that region to check the local HiC contacts? Probably at the scale of a few hundred Kb. My guess is that there is a small piece of redundant sequence. They could be haplotype duplications which are too divergent to be identified as homologous sequences by purge_dups but similar enough to be captured by HiC signal. If that is the case, you can remove that piece of sequence (and should probably put it into the other haplotype assembly if it is missed there).
I can comment further if you show me the HiC maps around that point.
Best, Chenxi
Hello Chenxi,
Thank you for your reply. Here's a link to a higher definition HiC map for linkage group 1. It's 123 Mb so had to provided as a link. Let me know if another option is preferred.
Here are two screenshot of a close up of the area of interest.
Hi @MboiTui,
Thanks very much for sharing the image. This is very informative.
This is not the case I mentioned above about the haplotype duplications. Could you help me check one more thing? You should be able to find the position of this misjoining point on this scaffold from the HiC plot. With this position, you can refer to the AGP file generated by YaHS to check if this misjoining point is within a GAP or a contig. If it is within a gap, there is something wrong with the contig joining process; otherwise, this is a misassembly introduced by the assembler and YaHS failed to error-correct it. In either case, I would like to fix it.
I do not think you want to wait for the fix from my end. For your project, you should probably simply break it at this misjoining position. It would be easy if it is within a GAP; otherwise, you may need to use the synteny information with the other haplotype to find the exact breakpoint.
Best, Chenxi
Hello @c-zhou,
Thank you for your prompt response. I am finishing up a different project at the moment with a strict timeline. I will get back to you with the info you requested in a couple of weeks.
Thanks for a great software and for the good support.
Best, LVB
Thanks @MboiTui! Chenxi
Hello @c-zhou,
Thank you for your insight.
I have checked the HiC contact map with PretextView, and the gap appears to be at 201,525,000 bp. Looking at the AGP file from YAHS, this is within a contig and not a gap (if i interpreted the AGP file correctly), so it should be an issue within the assembler?
scaffold_1 201157251 201157450 548 N 200 scaffold yes proximity_ligation
scaffold_1 201157451 201543633 549 W h2tg001723l_path_1 1 386183 -
scaffold_1 201543634 201543833 550 N 200 scaffold yes proximity_ligation
scaffold_1 201543834 203019410 551 W h2tg001684l_path_1 47001 1522577 +
scaffold_1 203019411 203019610 552 N 200 scaffold yes proximity_ligation
scaffold_1 203019611 203062164 553 W h2tg006944l_path_1 16001 58554 +
I will use synteny information with the other haplotype to identify the breakpoint
Thank you for your help, I appreciate it
Hello,
I am writing here as I could not find a google group for yahs. This is not a software issue, more of an unexpected behaviour in my data I could use help interpreting.
I am working with Pacbio HiFi and Arima HiC data, and I followed the VGP pipeline to produce the assembly (briefly: merqury > hifiasm > purge duplicates > yahs).
The result is a very contiguous and good assembly overall (1n = 13; L90 = 12, N50 = 224 Mb. Expected genome size is 2.7 Gb)
Yet, for the alternate haplotype, I get two chromosome merged into one. I identified this based on comparisons to linkage maps for the same species, and comparison between the two haplotypes using chromeister. The misjoin is also very obvious in the pretext map for the joined chromosomes.
Linkage map versus haplotype 1
Linkage map versus haplotype 2
Haplotype 1 versus Haplotype 2
Pretext map for misjoined scaffold in haplotype 2
I was wondering what could be causing this, as the HiC data does not seem to point to many contacts between these two scaffolds? And if possible, how could I address this to avoid the misjoin?
Cheers, LVB