glogsdon1 / sunk-based_assembly

T2T chr8 Bionano assembly evaluation #6

Closed: zhoudreames closed this issue 3 years ago

zhoudreames commented 3 years ago

Hi Glennis Logsdon, in your article you found the misassembly by using optical map data to call SVs, but in your second version you used haplotype-aware parameters to de novo assemble the map. This map was also aligned to GRCh38 and the final CHM13 assembly to verify heterozygous locations. As I understand it, the CHM13 individual is a single haplotype, so why use haplotype-aware parameters to de novo assemble the CHM13 optical map? And what are the heterozygous locations? May I also ask whether you could share the code with me? I don't understand the whole method, so I'm afraid I can't do the correction using your method.

glogsdon1 commented 3 years ago

This is a really good question and one that I also asked our collaborators who did the analysis at the time. From my emails with them, they ran both the non-haplotype-aware and haplotype-aware versions of the software (Solve) as part of their pipeline and initially analyzed only the non-haplotype-aware map for the same reason you mentioned, which is that CHM13 is only a single haplotype. The non-haplotype-aware map picked up the heterozygous sites (hets) at chr8:80,044,843 and chr8:121,388,618, but it called the one at chr8:21,025,201 a 9 kbp indel (and not a het). When I checked these against HiFi and ONT alignments, I found that both HiFi and ONT reads indicated that the one at chr8:21,025,201 was a het, with about 50% read support for each of the two structures in both data types. I asked our collaborator about this, and that's when they looked at the haplotype-aware version of the map, which had accurately called that one a het with two structures in the map.

So, basically, it seems like the non-haplotype-aware map did not call one of the locations a het, whereas the haplotype-aware map did, and the long-read alignments supported the "het" designation.

We listed the hets' locations and read support in Supplementary Table 7. All of the hets are <10 kbp. If you align the HiFi and ONT reads to our chr8 assembly (and the rest of the CHM13 genome), you should be able to see the SVs in the alignments, and because they are that small, a single HiFi or ONT read should be able to span and resolve each one.
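To make the "about 50% read support" check concrete, here is a minimal Python sketch of how one might tally spanning reads at a locus by whether their alignment carries a large insertion or deletion there. This is not the code used for the paper. The function names, the 1 kbp size cutoff, the 5 kbp window, and the toy alignments are all assumptions for illustration, and it only looks at CIGAR strings of primary alignments, so it ignores SVs that show up as split or clipped alignments.

```python
import re

# Matches one CIGAR operation, e.g. "25000M" or "9000D".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# CIGAR operations that consume reference bases.
REF_CONSUMING = "MDN=X"


def spans_locus(read_start, cigar, locus, window):
    """True if the alignment covers the whole interval locus +/- window."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return read_start <= locus - window and read_start + ref_len >= locus + window


def has_indel_near(read_start, cigar, locus, min_size=1000, window=5000):
    """True if the alignment has an I or D of at least min_size bp that
    overlaps the interval locus +/- window (cutoffs are assumed values)."""
    ref_pos = read_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "ID" and length >= min_size:
            # An insertion sits at a single reference position;
            # a deletion covers [ref_pos, ref_pos + length).
            end = ref_pos + (length if op == "D" else 0)
            if ref_pos <= locus + window and end >= locus - window:
                return True
        if op in REF_CONSUMING:
            ref_pos += length
    return False


def read_support(alignments, locus, min_size=1000, window=5000):
    """Count spanning reads with and without a large indel at the locus.

    alignments: iterable of (reference_start, cigar_string) pairs.
    Returns (n_with_indel, n_without_indel, fraction_with_indel or None).
    """
    with_indel = without_indel = 0
    for start, cigar in alignments:
        if not spans_locus(start, cigar, locus, window):
            continue  # reads that do not cross the locus are uninformative
        if has_indel_near(start, cigar, locus, min_size, window):
            with_indel += 1
        else:
            without_indel += 1
    total = with_indel + without_indel
    fraction = with_indel / total if total else None
    return with_indel, without_indel, fraction


# Made-up alignments around chr8:21,025,201 (not real data).
toy_alignments = [
    (21_000_000, "50000M"),              # spans, no indel
    (21_000_000, "25000M9000D25000M"),   # spans, 9 kbp deletion
    (21_010_000, "15000M9000I30000M"),   # spans, 9 kbp insertion
    (21_005_000, "40000M"),              # spans, no indel
    (21_024_000, "10000M"),              # starts inside the window, skipped
]

print(read_support(toy_alignments, locus=21_025_201))
```

With real data, the `(reference_start, cigar_string)` pairs would come from your HiFi or ONT alignments to the chr8 assembly. A fraction near 0.5 in both data types would be consistent with a het, whereas a fraction near 0 or 1 would point to a single structure.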

zhoudreames commented 3 years ago

thanks