Closed: zmx21 closed this issue 3 years ago
Currently, ShoRAH runs only in the coordinate space of the reference genome and doesn't support insertions (yet). Sadly, it's not a feature we plan to address in the immediate future.
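To illustrate what "reference coordinate space" means here, below is a toy sketch. It is not ShoRAH's actual code, and the function name `project_to_reference` as well as the CIGAR string `2M2I3M2D2M` are assumptions made up for this example. It projects a read onto reference columns: inserted bases have no reference column and are dropped, while deleted reference bases are shown as "-".

```python
import re

def project_to_reference(read_seq, cigar):
    """Toy projection of a read onto reference coordinates (hypothetical helper).

    Bases aligned to the reference are kept, deletions become '-',
    and insertions are skipped because they have no reference column.
    """
    out = []
    pos = 0  # current position in the read
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(length)
        if op in "M=X":      # consumes read and reference: keep the bases
            out.append(read_seq[pos:pos + n])
            pos += n
        elif op in "IS":     # consumes read only: no reference column, dropped
            pos += n
        elif op in "DN":     # consumes reference only: shown as gaps
            out.append("-" * n)
        # H and P consume neither read nor reference
    return "".join(out)

# Assumed layout: 'gg', inserted 'ga', 'cat', 2-base deletion, 'ca'
read = "gggacatca"
cigar = "2M2I3M2D2M"
print(project_to_reference(read, cigar))  # ggcat--ca
```

Under these assumptions, the read itself is 'gggacatca', but its projection onto the reference keeps the deletion and loses the inserted bases.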
As a workaround, I would suggest remapping the alignment to a different reference that includes the insert, so that these haplotypes will show as 'gggacatca' and the remaining ones as 'gg--catca'.
The package smallgenomeutilities, also developed by colleagues here at the CBG-ETHZ, has a tool named convert_reference that can help you remap alignments to different references.
Tell me if this helps.
I see, thanks for the prompt reply and helpful suggestion! I'll try the workaround you mentioned.
Hello,
I've noticed that in cases where a deletion follows an insertion, the deletion is incorporated (by returning "-") but the insertion sequence is not appended.
For example, in the following IGV screenshot, there's an insertion (GA) followed by a deletion (GG) on some reads:![INDEL_Example](https://user-images.githubusercontent.com/13065309/97864821-8187bd80-1d09-11eb-9607-5afa226d7cc0.png)
For these haplotypes, ShoRAH returned the following sequence: ggcat--ca
However, if I'm not mistaken, the actual sequence should be: gggacatca
Any advice would be greatly appreciated. Maybe this should have been taken care of in pre-processing? Thanks!