graph-genome / graph_summarization

Browser for Graph Genomes built with VG based on Graph Summarization to provide semantic zoom. As a user zooms in on a graph genome, the topology becomes more complex. Provides visualization for variation within a species of plant or animal. Designed to scale up to thousands of specimens and provide useful visualizations.

I19 haploblocker #22

Closed josiahseaman closed 5 years ago

josiahseaman commented 5 years ago

This pull request rearranges the file structure to bring everything together under a unified Django app structure. Torsten and Josiah's work on HaploBlocker lives in the HaploBlocker app, while Toshiyuki and Josiah's work on graph structures lives in the Graph app. HaploBlocker now successfully uses all three of the basic graph simplifications.

Currently, HaploBlocker and Graph are not using the same data objects. The next step will be a HaploBlocker refactor to use database objects defined in Graph.

Travis is currently not passing because the Django path changes were never resolved. However, unit tests run from PyCharm all pass (or are skipped).

ekg commented 5 years ago

In vg graphs we are building a positional index of the paths in the graph. The node IDs aren't stable or something that we use for positional organization. What matters are the actual sequences the graph was made from. One way to provide this for SNP data is to make each of the alternate alleles in the graph a path.
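A rough sketch of the idea, in plain Python rather than the vg API: the toy graph, the path names `ref` and `alt`, and both helper functions are made up for illustration. Each SNP allele gets its own path, and the positional index stores the start offset of every step along each path, so positions are resolved through paths instead of node IDs.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical toy graph: node ID -> sequence. Nodes 2 and 3 are the two alleles of one SNP.
node_seq = {1: "ACGT", 2: "A", 3: "G", 4: "TTCA"}

# Each allele is embedded as a named path (a walk over node IDs).
paths = {
    "ref": [1, 2, 4],
    "alt": [1, 3, 4],
}

def build_positional_index(paths, node_seq):
    """Map each path name to the 0-based start offset of every step on that path."""
    index = {}
    for name, steps in paths.items():
        lengths = [len(node_seq[n]) for n in steps]
        # accumulate(..., initial=0) yields [0, l0, l0+l1, ...]; drop the final total.
        index[name] = list(accumulate(lengths, initial=0))[:-1]
    return index

def node_at(paths, index, path_name, pos):
    """Return the node ID covering 0-based position `pos` on the given path."""
    step = bisect_right(index[path_name], pos) - 1
    return paths[path_name][step]

index = build_positional_index(paths, node_seq)
```

With this layout, position 4 resolves to node 2 on `ref` and to node 3 on `alt`, even though the node IDs themselves carry no positional meaning. The lookup does no bounds checking, so a position past the end of a path simply maps to its last step.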

On Wed, Aug 14, 2019, 17:12 Torsten Pook notifications@github.com wrote:

@tpook92 commented on this pull request.

In HaploBlocker/haplonetwork.py https://github.com/graph-genome/vgbrowser/pull/22#discussion_r313930187:

    +    return next(iter(iterable))
    +
    +class Point:
    +    def __init__(self, snp, bp=0):
    +        self.snp, self.bp = snp, bp
    +
    +    @property
    +    def window(self):
    +        return self.snp // BLOCK_SIZE
    +
    +class Node:
    +    def __init__(self, ident, start, end, specimens=None, upstream=None, downstream=None):
    +        self.ident = ident
    +        self.start = start  # Point()

Yes, these are SNP numbers. They originate from the SNP-chip data set we used to build the original graph. The only places they are actually used are in generating node IDs and in making it easier for us to check the results.
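As a small illustration of how the SNP number is used in the quoted `Point` class, here is a standalone copy of it. The value `BLOCK_SIZE = 20` is an assumption picked for the example, not the constant actually used in `haplonetwork.py`.

```python
BLOCK_SIZE = 20  # assumed value for illustration only

class Point:
    def __init__(self, snp, bp=0):
        self.snp, self.bp = snp, bp

    @property
    def window(self):
        # Integer division groups consecutive SNP numbers into fixed-size windows.
        return self.snp // BLOCK_SIZE

# SNP numbers 0 and 19 share a window; 20 starts the next one.
windows = [Point(snp).window for snp in (0, 19, 20, 45)]
```

Under that assumed block size, `windows` comes out as `[0, 0, 1, 2]`. The `bp` field plays no part in the window calculation.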

Do we need some kind of position for the visualization? The BP position might differ between specimens, and a single specimen might even have multiple values in the case of CNV.
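To make the concern concrete, here is a minimal sketch with invented node names, lengths, and specimen walks (none of this comes from the HaploBlocker data). It records, for each node, the base-pair offsets at which that node starts in every specimen that traverses it.

```python
from collections import defaultdict

# Hypothetical node lengths in base pairs and per-specimen walks through the graph.
node_len = {"A": 100, "B": 50, "C": 80}
specimen_paths = {
    "specimen_1": ["A", "B", "C"],       # B appears once
    "specimen_2": ["A", "B", "B", "C"],  # B duplicated (CNV)
    "specimen_3": ["A", "C"],            # B absent
}

def node_positions(specimen_paths, node_len):
    """Return {node: {specimen: [bp offsets at which the node starts]}}."""
    positions = defaultdict(dict)
    for specimen, walk in specimen_paths.items():
        offset = 0
        for node in walk:
            positions[node].setdefault(specimen, []).append(offset)
            offset += node_len[node]
    return positions

positions = node_positions(specimen_paths, node_len)
```

Node `C` starts at 150, 200, and 100 in the three specimens, and node `B` has two offsets in `specimen_2`, so a single BP coordinate per node would not be enough for this toy case.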
