biocommons / hgvs

Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
https://hgvs.readthedocs.io/
Apache License 2.0

Context view improvements #741

Open andreasprlic opened 3 weeks ago

andreasprlic commented 3 weeks ago

The hgvs library has a built-in, text-based visualization that shows a variant in its context, together with the alignment between the transcript and the reference genome. It can produce representations like this:

                                              v                               NC_000010.10:g.64572045dupT
NC_000010.10 g 64572025 > ACTCAGGGAGTGATTTTTTTTCTCCATAATAAGGCAACCCA          > 64572065 NC_000010.10:g.64572045dupT
NC_000010.10 g 64572025 < TGAGTCCCTCACTAAAAAAAAGAGGTATTATTCCGTTGGGT          < 64572065 NC_000010.10:g.64572045dupT
                          |||||||||||||-|||||||||||||||||||||||||||          13=1D27=
NM_000399.3  n     2670 < TGAGTCCCTCACT-AAAAAAAGAGGTATTATTCCGTTGGGT          <     2709 NM_000399.3:n.2696dupA
NM_000399.3  c      902 <                                                    <      941 NM_000399.3:c.*928dupA

At the moment this visualization is flagged as "experimental". It also requires the uta_align package for re-aligning the sequences.
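
For readers unfamiliar with the alignment row in the example above, here is a minimal sketch of how a CIGAR string such as `13=1D27=` expands into the line of `|` and `-` characters. This is not the hgvs or uta_align implementation; the function name `cigar_to_match_line` and the choice of `.` for mismatches are assumptions made for illustration only.

```python
import re

# Assumed rendering for this sketch: '|' for a match, '-' for an insertion
# or deletion, '.' for a mismatch (the '.' does not appear in the example).
SYMBOLS = {"=": "|", "X": ".", "I": "-", "D": "-"}


def cigar_to_match_line(cigar: str) -> str:
    """Expand an extended CIGAR string such as '13=1D27=' into a match line."""
    # Reject anything other than a run of <count><op> pairs we know how to draw.
    if not re.fullmatch(r"(\d+[=XID])+", cigar):
        raise ValueError(f"unsupported CIGAR string: {cigar!r}")
    return "".join(
        SYMBOLS[op] * int(count)
        for count, op in re.findall(r"(\d+)([=XID])", cigar)
    )


line = cigar_to_match_line("13=1D27=")
print(line)
print(len(line))  # 13 + 1 + 27 = 41 columns, matching the 41 genomic bases shown
```

The expanded line has one column per alignment position, which is why it lines up under the 41 genomic bases between 64572025 and 64572065.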

Describe the solution you'd like

It would be nice to expand on this and add a few more features:

Describe alternatives you've considered

The main question is whether better visualization tooling belongs in the main hgvs module or in a separate tool. Since we already have context.py, it may fit into the main library.

jsstevenson commented 3 weeks ago

Maybe this is a bit too out there, but for those publishing work in Jupyter notebooks (probably a lot of us), you can also supply special repr methods that incorporate HTML/CSS (pandas DataFrames are probably the most popular example of this). That could go a long way towards improving readability in that context.
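
A rough sketch of what that could look like, assuming a hypothetical `ContextView` class that simply wraps the existing text output (no such class exists in hgvs today). The only real API relied on here is the `_repr_html_` hook that Jupyter looks for when displaying an object.

```python
import html


class ContextView:
    """Hypothetical wrapper around the text rendering of a variant context."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __repr__(self) -> str:
        # Plain consoles keep the existing text output.
        return self.text

    def _repr_html_(self) -> str:
        # Jupyter calls this hook, when present, to get a rich HTML rendering.
        # Escaping matters because the strand markers use '<' and '>'.
        escaped = html.escape(self.text)
        return f'<pre style="font-family: monospace; line-height: 1.2">{escaped}</pre>'


# Sample row taken from the example in this issue, with padding collapsed.
view = ContextView(
    "NM_000399.3  n     2670 < TGAGTCCCTCACT-AAAAAAAGAGGTATTATTCCGTTGGGT <     2709"
)
print(view._repr_html_())
```

In a notebook, leaving `view` as the last expression of a cell would render the HTML version, while `print(view)` or a terminal session would still show the plain text. Coloring the variant position or mismatches would be a matter of wrapping those characters in styled `<span>` elements.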

andreasprlic commented 3 weeks ago

Actually that would be nice. Being able to show the context in a notebook would be a good feature to have.