C-CoMP-STC / GEM-mit1002

Creative Commons Attribution 4.0 International

Make an Escher map #17

Open hgscott opened 1 year ago

hgscott commented 1 year ago

We want an Escher map (or multiple) to help with curation of the model.

hgscott commented 1 year ago

To start with I made an ALT subset of the E. coli Central Metabolism map.

This is the Central Metabolism map of E. coli, which uses BiGG IDs: saved_map (6)

So I converted the map into ModelSEED IDs using the code in convert_ecoli_to_modelseed.py and got this: saved_map (1)

hgscott commented 1 year ago

That didn't handle the periplasm compartment at all, so I allowed metabolites to use their normal ID + _p: saved_map (2)
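A minimal sketch of that kind of ID conversion, not the actual contents of convert_ecoli_to_modelseed.py. It assumes the Escher map JSON is a two-element list (header, body) whose nodes and reaction metabolites carry a `bigg_id` field, and it reads "normal ID + _p" as the ModelSEED compound ID with a `_p` suffix. The one-entry mapping table and both function names are hypothetical.

```python
# Illustrative one-entry table; a real run would load the full BiGG -> ModelSEED mapping.
BIGG_TO_SEED = {"fmn": "cpd00050"}


def convert_metabolite_id(bigg_id, mapping=BIGG_TO_SEED):
    """Convert e.g. 'fmn_c' to 'cpd00050_c0'; periplasm IDs get a plain '_p' suffix."""
    base, _, compartment = bigg_id.rpartition("_")
    seed_id = mapping.get(base)
    if seed_id is None:
        return bigg_id  # leave unmapped IDs untouched so they are easy to spot
    if compartment == "p":
        return f"{seed_id}_p"  # assumed periplasm convention
    return f"{seed_id}_{compartment}0"


def convert_map(escher_map, mapping=BIGG_TO_SEED):
    """Rewrite metabolite IDs in place on nodes and on reaction metabolite lists."""
    _header, body = escher_map
    for node in body["nodes"].values():
        if node.get("node_type") == "metabolite":
            node["bigg_id"] = convert_metabolite_id(node["bigg_id"], mapping)
    for reaction in body["reactions"].values():
        for met in reaction["metabolites"]:
            met["bigg_id"] = convert_metabolite_id(met["bigg_id"], mapping)
    return escher_map
```

Unmapped IDs are returned unchanged rather than dropped, so they stay visible on the map for manual curation.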

hgscott commented 1 year ago

Then I used the "Update names and gene reaction rules using model" option to show which reactions are actually in the ALT model: saved_map (3)

hgscott commented 1 year ago

So I deleted everything in red and started adding reactions that are in the model but not on the map, such as exchange reactions: saved_map (7)
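The same cleanup could be scripted instead of done by hand. This is a sketch under assumptions: it treats the map as a (header, body) pair, assumes each reaction has a `bigg_id` and a `segments` dict with `from_node_id` and `to_node_id`, and takes the model's reaction IDs as a plain set. The function name `prune_map` is hypothetical.

```python
def prune_map(escher_map, model_reaction_ids):
    """Drop reactions absent from the model, then drop nodes no segment still uses."""
    _header, body = escher_map
    body["reactions"] = {
        rid: rxn
        for rid, rxn in body["reactions"].items()
        if rxn["bigg_id"] in model_reaction_ids
    }
    # Collect every node that is still an endpoint of a remaining segment.
    used = set()
    for rxn in body["reactions"].values():
        for seg in rxn["segments"].values():
            used.add(str(seg["from_node_id"]))
            used.add(str(seg["to_node_id"]))
    body["nodes"] = {nid: node for nid, node in body["nodes"].items() if nid in used}
    return escher_map
```

Removing the orphaned nodes in the same pass keeps metabolites from lingering on the map after their reactions are gone.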

hgscott commented 1 year ago

Having the map use a periplasm compartment is messing up a lot of things: metabolites are shown that aren't actually there.

hgscott commented 1 year ago

Daniel Sher says he needs to see human-readable names on the map itself, not only in the hover box. He also wants more labels for pathways, e.g. the TCA cycle.

hgscott commented 1 year ago

Daniel Segrè says to start by putting all reactions on a map in arbitrary or algorithm-decided locations. It may be ugly, but it will give us something to start with and collaborate on.

hgscott commented 1 year ago

One thing that makes randomly placing reactions harder than I thought it would be is the additional types of nodes: midmarkers and multimarkers. The multimarker nodes represent a common internal node for multiple reactants or products, and the midmarker nodes are the midpoint of the reaction and serve as an anchor for the reaction label.

Reactions are represented explicitly, and each reaction can contain several segments, so there are 6 valid segment classes:

  1. metabolite --> multimarker (a segment for each of multiple reactants of a reaction)
  2. metabolite --> midmarker (a segment for a single reactant of a reaction)
  3. multimarker --> midmarker (internal segment)
  4. midmarker --> multimarker (internal segment)
  5. multimarker --> metabolite (a segment for each of multiple products of a reaction)
  6. midmarker --> metabolite (a segment for a single product of a reaction)

(Got this information from: https://gist.github.com/djinnome/aead21ef178fc0fd691d5fa71ad5ddb1#file-1-converting-cytoscape-json-to-escher-json-ipynb)
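The six classes above can be written down as a lookup table and used to check a map. This is a sketch, assuming nodes carry a `node_type` of `metabolite`, `multimarker`, or `midmarker` and that segments reference nodes through `from_node_id` and `to_node_id`. The helper name `invalid_segments` is hypothetical.

```python
# The six (from, to) node-type pairs listed above.
VALID_SEGMENT_CLASSES = {
    ("metabolite", "multimarker"),
    ("metabolite", "midmarker"),
    ("multimarker", "midmarker"),
    ("midmarker", "multimarker"),
    ("multimarker", "metabolite"),
    ("midmarker", "metabolite"),
}


def invalid_segments(body):
    """Return (reaction_id, segment_id) pairs whose endpoints are not a valid class."""
    nodes = body["nodes"]
    bad = []
    for rid, rxn in body["reactions"].items():
        for sid, seg in rxn["segments"].items():
            pair = (
                nodes[str(seg["from_node_id"])]["node_type"],
                nodes[str(seg["to_node_id"])]["node_type"],
            )
            if pair not in VALID_SEGMENT_CLASSES:
                bad.append((rid, sid))
    return bad
```

Running a check like this on a generated map is a cheap way to catch, for example, a metabolite wired directly to another metabolite.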

hgscott commented 1 year ago

So I came up with this pseudocode:

# For a reaction
# Are there multiple reactants
    # If there is only one reactant, can go straight from metabolite to midmarker
    # If there are multiple reactants, need to go from metabolite to multimarker to midmarker
# Are there multiple products
    # If there is only one product, can go straight from midmarker to metabolite
    # If there are multiple products, need to go from midmarker to multimarker to metabolite
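
The pseudocode above could be turned into something like the following sketch. It only works out which nodes to connect, not where to place them. The marker names (`midmarker`, `multimarker_in`, `multimarker_out`) are placeholders, and skipping an empty side (as in an exchange reaction) is an assumption that goes beyond the pseudocode.

```python
def reaction_segments(reactants, products):
    """Return (from, to) pairs for one reaction, following the pseudocode above."""
    segments = []
    if len(reactants) == 1:
        # Single reactant: straight from metabolite to midmarker.
        segments.append((reactants[0], "midmarker"))
    elif len(reactants) > 1:
        # Multiple reactants: metabolite -> multimarker -> midmarker.
        segments += [(met, "multimarker_in") for met in reactants]
        segments.append(("multimarker_in", "midmarker"))
    if len(products) == 1:
        # Single product: straight from midmarker to metabolite.
        segments.append(("midmarker", products[0]))
    elif len(products) > 1:
        # Multiple products: midmarker -> multimarker -> metabolite.
        segments.append(("midmarker", "multimarker_out"))
        segments += [("multimarker_out", met) for met in products]
    return segments
```

For example, `reaction_segments(["a"], ["b", "c"])` yields one reactant segment, one internal segment, and two product segments.
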

hgscott commented 1 year ago

So I'm thinking I'll just put each reaction and all of its metabolites on the map but not connect any of the reactions (i.e. each reaction is its own minigraph).

But will that really speed anything up? If nothing is connected, what's the point of having anything to start with?
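For reference, placing unconnected minigraphs needs very little logic. This is a sketch that assigns each reaction its own grid cell; the cell sizes and column count are arbitrary placeholder values, and `grid_positions` is a hypothetical helper.

```python
def grid_positions(reaction_ids, columns=5, cell_width=600, cell_height=400):
    """Map each reaction ID to the (x, y) centre of its own grid cell."""
    positions = {}
    for index, rid in enumerate(reaction_ids):
        row, col = divmod(index, columns)
        positions[rid] = (
            col * cell_width + cell_width / 2,
            row * cell_height + cell_height / 2,
        )
    return positions
```

Each centre could hold the reaction's midmarker, with reactants drawn to one side and products to the other.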

hgscott commented 1 year ago

Instead of using fully arbitrary locations and plotting each reaction individually, use Daniel's code, which places nodes algorithmically and duplicates some highly connected metabolites: https://colab.research.google.com/drive/1lM2UE2UzR5NmmvzO5K0TUGgXKog8aWlC?usp=sharing#scrollTo=HgQB_1TKXiIf

hgscott commented 1 year ago

I gave up on trying to figure out the logic of where to put the nodes. Instead, I built a map by pointing and clicking, adding the reactions one at a time in the order they pop up, and got this: saved_map (8)

hgscott commented 1 year ago

I will pause on the map layout here and work on automatically adding human-readable labels for the reactions/metabolites.

hgscott commented 1 year ago

I was able to add the labels easily enough, but I'm unsure how much they help, given their size and the redundancy with the hover function: saved_map (8)

hgscott commented 1 year ago

I did notice that the code doesn't seem to handle the labels for metabolites on the left well. I think the xy coordinates were for the node, not the label. For example, look how far the cpd00050_c0 label is from the FMN text label. Screenshot 2023-07-26 at 12 47 20 PM

hgscott commented 1 year ago

Changing to label_x and label_y definitely helped, but now the offsets are off: Screenshot 2023-07-26 at 12 50 16 PM

hgscott commented 1 year ago

I removed the x offset and increased the y offset slightly, and everything looks good on both the right and the left: Screenshot 2023-07-26 at 12 53 09 PM
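A sketch of the label placement described in the last few comments, not the actual script. It assumes metabolite nodes carry `label_x` and `label_y`, that the map body holds a `text_labels` dict of `{"x", "y", "text"}` entries, and that `names` maps metabolite IDs to human-readable names. The default y offset of 30 is a placeholder, not the value used for the screenshots.

```python
def add_metabolite_labels(escher_map, names, x_offset=0, y_offset=30):
    """Add a text label with the readable name just below each metabolite's ID label."""
    _header, body = escher_map
    labels = body.setdefault("text_labels", {})
    next_id = 0
    for node in body["nodes"].values():
        if node.get("node_type") != "metabolite":
            continue
        name = names.get(node["bigg_id"])
        if name is None:
            continue
        while str(next_id) in labels:  # skip IDs already used on the map
            next_id += 1
        labels[str(next_id)] = {
            # Anchor on the label position, not the node position.
            "x": node["label_x"] + x_offset,
            "y": node["label_y"] + y_offset,
            "text": name,
        }
    return escher_map
```

Anchoring on `label_x` and `label_y` is what keeps the readable name next to the ID label whether the label sits to the left or the right of its node.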

hgscott commented 1 year ago

Daniel said to add the human-readable labels to the ALT subset of the E. coli model, and asked if I can run Escher-FBA on that model/map.

hgscott commented 1 year ago

I added the labels (the map still has reactions not present in the model) and it's a total mess.

Screenshot 2023-07-26 at 2 48 45 PM

hgscott commented 1 year ago

I was able to run Escher-FBA through the web interface and see the fluxes, though setting limits on the reactions is hard. Screenshot 2023-07-26 at 2 52 16 PM

hgscott commented 1 year ago

Daniel says to make it an option to add only the metabolite names, without the reaction names.
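One way to expose that option is a flag that defaults to metabolites only. This is a sketch, assuming both metabolite nodes and reactions carry `name`, `label_x`, and `label_y` fields in the map body. The function name `label_specs` and the `include_reactions` flag are hypothetical.

```python
def label_specs(body, include_reactions=False):
    """Collect (x, y, text) tuples for metabolites, and optionally for reactions."""
    specs = [
        (node["label_x"], node["label_y"], node["name"])
        for node in body["nodes"].values()
        if node.get("node_type") == "metabolite" and node.get("name")
    ]
    if include_reactions:
        specs += [
            (rxn["label_x"], rxn["label_y"], rxn["name"])
            for rxn in body["reactions"].values()
            if rxn.get("name")
        ]
    return specs
```

Calling `label_specs(body)` gives the metabolite-only set, and `label_specs(body, include_reactions=True)` gives the full set with reaction names included.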

hgscott commented 1 year ago

That did make it a lot less messy: Screenshot 2023-07-26 at 3 07 31 PM

hgscott commented 1 year ago

Daniel's other ideas were:

hgscott commented 1 year ago

DSher wants amino acids on any map I make.

Osnat has a map connecting the TCA cycle to different amino acid degradation pathways, made using BioCyc pathway collages, to show how AAs feed into the TCA cycle. Emma made a very pretty version for a manuscript. It shows that different AAs enter different parts of the cycle. BioCyc predicts that almost all Alteromonas species can use almost all AAs, but not all of the genes are actually present. There is experimental data showing growth on mixtures of AAs, but not on individual AAs.