cmdcolin / mafviewer

A JBrowse plugin to view multiple alignment format (MAF) files

Does not work well if there are overlapping alignment fragments #7

Open cmdcolin opened 7 years ago

cmdcolin commented 7 years ago

The plugin currently assumes only one best alignment per genome position. If there are multiple, they are plotted on top of each other. MAF files from the UCSC pipeline work well, but others, including Ensembl, can output multiple best alignments per genomic position.
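A minimal sketch of how one might check whether a file triggers this problem, assuming the blocks have already been parsed into plain objects with half-open `start`/`end` coordinates on a single reference sequence. The function name `findOverlappingBlocks` and the feature shape are hypothetical and are not part of the plugin.

```javascript
// Hypothetical helper, not part of mafviewer.
// Input: array of { start, end } features on ONE reference sequence,
// using half-open [start, end) coordinates as in BED.
// Output: array of [earlier, later] pairs that overlap on the reference.
function findOverlappingBlocks(features) {
  // Sort a copy by start so the caller's array is left untouched.
  const sorted = [...features].sort((a, b) => a.start - b.start);
  const overlaps = [];
  let widest = null; // feature with the largest end seen so far
  for (const f of sorted) {
    // A feature overlaps if it starts before the furthest end seen so far.
    if (widest && f.start < widest.end) {
      overlaps.push([widest, f]);
    }
    if (!widest || f.end > widest.end) {
      widest = f;
    }
  }
  return overlaps;
}
```

An empty result suggests the one-alignment-per-position assumption holds for that sequence. Any returned pair marks a region where blocks would be drawn over each other.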

The best way to inspect this is to load the maf.bed.gz file as a CanvasFeatures track with a BEDTabix store, which shows the raw features.
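Such a track entry might look roughly like the following, written here as a JavaScript object that would be serialized into the `tracks` array of a JBrowse 1 trackList.json. The `label`, `key`, and `urlTemplate` values are placeholders, and the exact option set may differ between JBrowse versions, so treat this as a sketch and not a tested configuration.

```javascript
// Sketch of a JBrowse 1 track entry for viewing the raw BED features.
// label, key, and urlTemplate are placeholder values (assumptions).
const rawMafTrack = {
  label: 'maf_raw_features',
  key: 'MAF blocks (raw BED features)',
  type: 'CanvasFeatures',
  storeClass: 'JBrowse/Store/SeqFeature/BEDTabix',
  // Path to the bgzipped BED file; its .tbi index is expected alongside it.
  urlTemplate: 'maf.bed.gz',
};

// JSON text for pasting into the "tracks" array of trackList.json.
const rawMafTrackJson = JSON.stringify(rawMafTrack, null, 2);
```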

rdhayes commented 7 years ago

Hi Colin,

We're exploring this track and data type for Phytozome. Do you have ideas on how to proceed here? For example, we're expecting whole genome duplications in some plants to produce overlapping alignments from two different G. max chromosomes aligned to a single P. vulgaris reference/target.

Things to consider:

1) The alignments data structure in mafviewer/js/Store/SeqFeature/MAF.js is keyed on species name prefix, but this could be modified to a JS hash of arrays or similar.

2) I'm thinking of adding a custom popup to prettify the display of the alignments table. This also removes the dependency on the regular JBrowse codebase for rendering that data, which is in turn dependent on the simple data structure in use now. Not sure if a fmtDataField set of params would help here.

3) It's not immediately clear to me how to handle variable vertical space requirements for rendering more than one query fragment for a given block.
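Points 1 and 3 could be sketched roughly as follows. This is not the code in MAF.js: the row shape (`src`, `start`, `seq`), the `species.chromosome` naming of `src`, and both function names are assumptions made for illustration.

```javascript
// Hypothetical sketch, not the plugin's actual data structure.
// A "row" is one aligned sequence line of a MAF block:
//   { src: 'Gmax.Chr05', start: 500, seq: 'ACGA' }
// where src is assumed to follow the "species.chromosome" convention.

// Point 1: key on species prefix, but keep an ARRAY of rows per species
// so a second fragment from the same species does not overwrite the first.
function groupBySpecies(rows) {
  const alignments = Object.create(null); // no inherited keys
  for (const row of rows) {
    const species = row.src.split('.')[0];
    if (!alignments[species]) {
      alignments[species] = [];
    }
    alignments[species].push(row);
  }
  return alignments;
}

// Point 3: one simple way to size the vertical space is to give each
// species as many sub-rows as its deepest block needs.
// Input: array of blocks, each block being an array of rows.
function rowsPerSpecies(blocks) {
  const depth = Object.create(null);
  for (const block of blocks) {
    const grouped = groupBySpecies(block);
    for (const species of Object.keys(grouped)) {
      depth[species] = Math.max(depth[species] || 0, grouped[species].length);
    }
  }
  return depth;
}
```

With a fixed number of sub-rows per species, the track height stays constant while scrolling, at the cost of empty space in regions where only one fragment aligns. Computing the depth per visible region would be tighter but would make rows shift as the view moves.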

rdhayes commented 7 years ago

I am setting up a jbrowse-dev environment for you to view what we've got so far.

cmdcolin commented 7 years ago

This sounds like a great project! I can definitely try to assist. This particular issue has been somewhat tricky to solve "the right way". I think the UCSC pipeline tries to produce one-to-one alignments, so the plugin works pretty well on that data, but perhaps we can find a way to display multiple alignments to a single region.

cmdcolin commented 7 years ago

This is a nice guide I found for understanding UCSC concepts: http://cs273a.stanford.edu/presentations/lecture15.pptx

Slide 16, "raw blastz alignments", is a nice reference for how multiple overlapping blocks might look.

cmdcolin commented 5 years ago

Note that I think the raw blastz alignments are a different "chain view": maybe a separate type of thing, but related.