biocommons / uta

Universal Transcript Archive: comprehensive genome-transcript alignments; multiple transcript sources, versions, and alignment methods; available as a docker image
Apache License 2.0

RFC: Should we migrate to fully-justified alignments? #222

Open reece opened 5 years ago

reece commented 5 years ago

When aligning sequences with indels in repeat regions, the alignment is ambiguous. Most aligners implicitly shuffle gaps to the right or to the left, but this choice is arbitrary: the real region of ambiguity spans from the leftmost possible placement to the rightmost. As a result, it is hard to know whether a particular position falls within a region of ambiguity.

Proposal: Based on work from NCBI, GA4GH adopted fully-justified normalization for the GA4GH VR specification.

Migrating UTA alignments to fully-justified would allow relatively simple tests for whether a variant overlaps a region of ambiguity.
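
To illustrate the kind of test this would enable, here is a minimal Python sketch. It is not UTA code and not the GA4GH VR normalization algorithm itself: the function names `ambiguity_region` and `overlaps` are hypothetical, coordinates are assumed to be 0-based and half-open, and only a deletion from a single reference string is handled. It rolls the deleted span as far left and as far right as it can go without changing the resulting sequence, then reports the full span between those extremes.

```python
from typing import Tuple


def ambiguity_region(seq: str, start: int, end: int) -> Tuple[int, int]:
    """Return the span over which the deletion seq[start:end] can be shifted.

    Hypothetical helper: coordinates are 0-based, half-open. The returned
    span runs from the leftmost to the rightmost equivalent placement.
    """
    if not 0 <= start < end <= len(seq):
        raise ValueError("deletion must be a non-empty span within seq")

    # Roll left: moving the window one base left is equivalent when the base
    # entering on the left equals the base leaving on the right.
    left, right = start, end
    while left > 0 and seq[left - 1] == seq[right - 1]:
        left -= 1
        right -= 1
    leftmost = left

    # Roll right: moving the window one base right is equivalent when the
    # base leaving on the left equals the base entering on the right.
    left, right = start, end
    while right < len(seq) and seq[left] == seq[right]:
        left += 1
        right += 1
    rightmost = right

    return leftmost, rightmost


def overlaps(region: Tuple[int, int], start: int, end: int) -> bool:
    """True if the non-empty half-open span [start, end) intersects region."""
    return region[0] < end and start < region[1]


if __name__ == "__main__":
    ref = "GCACACAT"
    # Deleting one "CA" unit anywhere in the CACACA repeat yields the same
    # sequence, so every placement reports the same region.
    print(ambiguity_region(ref, 1, 3))
    print(ambiguity_region(ref, 3, 5))
    print(overlaps(ambiguity_region(ref, 3, 5), 6, 7))
```

With alignments stored in this fully-justified form, checking whether a variant touches an ambiguous region reduces to a plain interval-overlap test, regardless of which way the original aligner shuffled the gap.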

github-actions[bot] commented 11 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 11 months ago

This issue was closed because it has been stalled for 7 days with no activity.

reece commented 8 months ago

This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.

github-actions[bot] commented 5 months ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.