bioforensics / MicroHapulator

Tools for empirical microhaplotype calling, forensic interpretation, and simulation.
https://microhapulator.readthedocs.io/

Implement `mhpl8r marker` view #65

Closed (standage closed 5 years ago)

standage commented 5 years ago

There is currently no convenient way to view a summary of a microhaplotype marker. Although this could be a feature of MicroHapDB, I think it makes more sense as part of MicroHapulator, since the reference genome is bundled with the software and the `LocusContext` class will lend consistency across the different views. I suggest implementing this as the `mhpl8r marker` subcommand, whose output could look something like this.

-----------------------------------------------------------[ MicroHapulator ]---
MHDBM000038    a.k.a. mh11KK-191, SI664600W

Marker Definition (GRCh38)
    Marker extent
        - chr11:100009431-100009619
    Target amplicon
        - chr11:100009351-100009700
    Constituent variants
        - rs12421109 (pos=100009431, offset=80)
        - rs12289401 (pos=100009492, offset=141)
        - rs12420819 (pos=100009550, offset=199)
        - rs770566   (pos=100009620, offset=269)
    Observed alleles
        - C,A,A,T
        - C,A,G,T
        - C,G,A,T
        - C,G,G,T
        - T,A,A,C
        - T,A,A,T
        - T,G,A,T
--------------------------------------------------------------------------------
                                                                                *                                                            *                                                         *                                                                     *
TACTGTCTGTAAAGGTATTTCCCCAGAAAAATGGGAAGTGTTTCAAGAGAACCCATAGGGAAACAAAGGTATGTAAAGGCTTGGGTAACACAGCAAAGTGTAAAAAAAAAAATGGAGGGGGATTAATTAGTTGGAAAGAAAAGACTGGTTTAGACATATGGAAGGTTATTATCAAGAGATTGCTCACAAGCTGTTCTCCAGTTTCAACAAGGGAGGGAGTGAAATTCCAACAGTATGGGATTAAACTAGAAGAATTAGATTGGGTGTAATGTAGAAGTTCATGAAAAAATGAGTTGTTACAGGGAAACTGGGTTATCAGGAGAGGCAGTGGAAGCTCTTTGCCTGAACTG
................................................................................C............................................................A.........................................................A.....................................................................T................................................................................
................................................................................C............................................................A.........................................................G.....................................................................T................................................................................
................................................................................C............................................................G.........................................................A.....................................................................T................................................................................
................................................................................C............................................................G.........................................................G.....................................................................T................................................................................
................................................................................T............................................................A.........................................................A.....................................................................C................................................................................
................................................................................T............................................................A.........................................................A.....................................................................T................................................................................
................................................................................T............................................................G.........................................................A.....................................................................T................................................................................
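
The alignment-style portion of the mock-up above can be sketched in a few lines of Python. This is only an illustration of the layout, not the implementation in MicroHapulator or MicroHapDB: the function names `variant_offsets` and `render_marker_view` are hypothetical, and the sketch assumes that each offset is simply the variant position minus the amplicon start, which is what the numbers in the mock-up imply (for example, 100009431 - 100009351 = 80).

```python
def variant_offsets(amplicon_start, positions):
    """Convert variant positions to 0-based offsets into the amplicon.

    Assumption: offset = position - amplicon start, as implied by the mock-up.
    """
    return [pos - amplicon_start for pos in positions]


def render_marker_view(amplicon_seq, offsets, alleles):
    """Build the text rows of the view (hypothetical helper).

    Returns a list of strings: one row with '*' above each variant, the
    amplicon sequence itself, then one dotted row per observed allele with
    the allele's bases placed at the variant offsets.
    """
    width = len(amplicon_seq)
    if any(off < 0 or off >= width for off in offsets):
        raise ValueError("variant offset falls outside the amplicon")

    # Row of asterisks marking the variant columns.
    marker_row = [" "] * width
    for off in offsets:
        marker_row[off] = "*"
    rows = ["".join(marker_row).rstrip(), amplicon_seq]

    # One row per allele; alleles are comma-separated bases such as "C,A,A,T".
    for allele in alleles:
        bases = allele.split(",")
        if len(bases) != len(offsets):
            raise ValueError(f"allele {allele!r} does not match variant count")
        row = ["."] * width
        for off, base in zip(offsets, bases):
            row[off] = base
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    # Coordinates taken from the mock-up above.
    offsets = variant_offsets(
        100009351, [100009431, 100009492, 100009550, 100009620]
    )
    print(offsets)  # [80, 141, 199, 269]

    # Toy sequence for illustration only (not the real amplicon).
    for line in render_marker_view("ACGTACGTAC", [2, 7], ["C,A", "T,G"]):
        print(line)
```

Keeping the offset arithmetic separate from the rendering means the same offsets could feed both the "Constituent variants" list and the dotted allele rows, so the two parts of the view cannot drift apart.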

standage commented 5 years ago

The sequences spanning and flanking each marker are now distributed with MicroHapDB, making it possible to implement this feature in MicroHapDB, where it arguably belongs. Completed in https://github.com/bioforensics/MicroHapDB/pull/27.