baoxingsong / AnchorWave

Sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism and whole-genome duplication variation
MIT License

The problem of 'anchorwave proali' #61

Open xuxingyubio opened 11 months ago

xuxingyubio commented 11 months ago

Thank you for developing AnchorWave! I have a question.

I want to use AnchorWave to align my assembled contigs to the reference genome, and I would like to know how it behaves when the assembly has problems, for example when the same reference position is covered by two contigs. When I run anchorwave proali -R 1 -Q 1, how are the anchors at such a position reported?

baoxingsong commented 11 months ago

Sorry, but I could not understand your question.

xuxingyubio commented 11 months ago

Let me try to describe it more clearly. I have assembled chr1 of a sample, and the assembly consists of multiple contigs. I want to use AnchorWave to align these contigs to chr1 of the reference genome. When the assembly contains errors, such as an overlap between contigA and contigB, to which contig will the anchors in the overlapping region be assigned when I run anchorwave proali -R 1 -Q 1?

baoxingsong commented 11 months ago

The one with higher continuity would be preferred.

xuxingyubio commented 11 months ago

Thank you for your reply. I recently ran a simulation using exon246 to exon256 as anchors. When running anchorwave proali, I found a jump from exon248 to exon255. In addition, the score of exon256 is very low, and I cannot find a parameter that controls the score. How can I avoid this apparently incorrect deletion? It is supported by only one aligned anchor (exon255), and exon256 has a very low score.

block end

block begin

5 3701036 3704809 184 41257 45028 - exon246 9 1
5 3704810 3712371 184 33695 41256 - interanchor 9 NA
5 3712372 3714572 184 31494 33694 - exon247 9 0.997706
5 3714573 3724994 184 21071 31493 - interanchor 9 NA
5 3724995 3725179 184 20886 21070 - exon248 9 1
5 3725180 3838681 184 10959 20885 - interanchor 9 NA
5 3838682 3839877 184 10065 10958 - exon255 9 0.976999
5 3839878 3846559 184 3018 10064 - interanchor 9 NA
5 3846560 3847522 184 1739 3017 - exon256 9 0.661561
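To make the jump easier to see, here is a minimal Python sketch that parses the anchor rows above and flags suspicious ones. It assumes the ten columns are reference sequence, reference start, reference end, query sequence, query start, query end, strand, anchor name, block index and score. The function names and both thresholds (min_score, max_length_ratio) are hypothetical choices for illustration and are not AnchorWave parameters.

```python
from typing import List, NamedTuple, Optional, Tuple

# Anchor rows copied from the output above.
SAMPLE = """\
5 3701036 3704809 184 41257 45028 - exon246 9 1
5 3704810 3712371 184 33695 41256 - interanchor 9 NA
5 3712372 3714572 184 31494 33694 - exon247 9 0.997706
5 3714573 3724994 184 21071 31493 - interanchor 9 NA
5 3724995 3725179 184 20886 21070 - exon248 9 1
5 3725180 3838681 184 10959 20885 - interanchor 9 NA
5 3838682 3839877 184 10065 10958 - exon255 9 0.976999
5 3839878 3846559 184 3018 10064 - interanchor 9 NA
5 3846560 3847522 184 1739 3017 - exon256 9 0.661561
"""


class AnchorRow(NamedTuple):
    # Column meanings are an assumption about the anchors-file layout.
    ref_chr: str
    ref_start: int
    ref_end: int
    query_chr: str
    query_start: int
    query_end: int
    strand: str
    name: str
    block: int
    score: Optional[float]  # None for interanchor rows ("NA")


def parse_rows(text: str) -> List[AnchorRow]:
    """Parse whitespace-separated rows, skipping anything that is not a data row."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        if len(f) != 10 or not f[1].isdigit():
            continue  # header, block marker, or blank line
        score = None if f[9] == "NA" else float(f[9])
        rows.append(AnchorRow(f[0], int(f[1]), int(f[2]), f[3], int(f[4]),
                              int(f[5]), f[6], f[7], int(f[8]), score))
    return rows


def flag_rows(rows: List[AnchorRow],
              min_score: float = 0.9,
              max_length_ratio: float = 5.0) -> List[Tuple[str, str, int, int]]:
    """Return (name, reason, ref_len, query_len) for rows that look suspicious.

    Both thresholds are hypothetical defaults chosen for this example.
    """
    flagged = []
    for r in rows:
        ref_len = r.ref_end - r.ref_start + 1
        query_len = r.query_end - r.query_start + 1
        if r.score is None:
            # Interanchor interval: compare reference and query lengths.
            ratio = max(ref_len, query_len) / max(1, min(ref_len, query_len))
            if ratio > max_length_ratio:
                flagged.append((r.name, "length mismatch", ref_len, query_len))
        elif r.score < min_score:
            flagged.append((r.name, "low score", ref_len, query_len))
    return flagged


ROWS = parse_rows(SAMPLE)
FLAGGED = flag_rows(ROWS)
for name, reason, ref_len, query_len in FLAGGED:
    print(f"{name}: {reason} (ref {ref_len} bp, query {query_len} bp)")
```

With these thresholds, the sketch reports the interanchor interval between exon248 and exon255, which spans 113502 bp on the reference but only 9927 bp on the query, and the low-scoring exon256.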

xuxingyubio commented 11 months ago

It is possible that some sequence near the end of query sequence 184 (the sequence between exon248 and exon249) is similar to an anchor sequence far from this position.