Open hgscott opened 1 year ago
To start with I made an ALT subset of the E. coli Central Metabolism map.
This is the Central Metabolism Map of E. coli, which uses BiGG IDs:
So I converted the map into ModelSEED IDs using the code in convert_ecoli_to_modelseed.py and got this:
That didn't handle the periplasm compartment at all, so I allowed metabolites to use their normal ID + _p:
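A minimal sketch of the kind of ID conversion described here, not the actual contents of convert_ecoli_to_modelseed.py. The lookup table, the compartment suffixes, and the function name are all assumptions for illustration; in particular, the `_p` suffix is just one reading of "normal ID + _p".

```python
# Hypothetical lookup table: BiGG base ID -> ModelSEED compound ID.
# A real table would come from the ModelSEED/BiGG alias files.
BIGG_TO_SEED = {"fmn": "cpd00050", "glc__D": "cpd00027"}

# Assumed compartment suffixes. "_p" for the periplasm mirrors the workaround
# described above and is not an official ModelSEED convention.
COMPARTMENT_SUFFIX = {"c": "_c0", "e": "_e0", "p": "_p"}


def bigg_to_modelseed(bigg_id):
    """Convert e.g. 'fmn_c' to 'cpd00050_c0'; return the ID unchanged if unmapped."""
    # BiGG IDs end in '_<compartment>', so split on the last underscore only.
    base, sep, compartment = bigg_id.rpartition("_")
    if not sep or base not in BIGG_TO_SEED or compartment not in COMPARTMENT_SUFFIX:
        return bigg_id
    return BIGG_TO_SEED[base] + COMPARTMENT_SUFFIX[compartment]
```

Returning unmapped IDs unchanged makes the leftovers easy to spot on the converted map.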
Then I used the "Update names and gene reaction rules using model" option to show which reactions are actually in the ALT model:
So I deleted everything in red and started adding reactions that are in the model but not on the map, like exchange reactions:
Having the map use a periplasm compartment is messing up a lot of things: metabolites are shown that aren't actually there.
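The same comparison could be done outside the GUI. This is only a sketch on plain collections of reaction IDs; the function name and the example IDs are made up, and getting the IDs out of the map and the model is left out.

```python
def compare_map_to_model(map_reaction_ids, model_reaction_ids):
    """Return (on the map but not in the model, in the model but not on the map)."""
    map_ids = set(map_reaction_ids)
    model_ids = set(model_reaction_ids)
    to_delete = sorted(map_ids - model_ids)  # would show up in red on the map
    to_add = sorted(model_ids - map_ids)     # e.g. exchange reactions
    return to_delete, to_add


# Made-up IDs, for illustration only.
to_delete, to_add = compare_map_to_model(
    ["rxn_a", "rxn_b", "rxn_c"],
    ["rxn_b", "rxn_c", "EX_example_e0"],
)
```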
Daniel Sher says he needs to see human-readable names on the map itself, in addition to in the hover box. He also wants more labels for pathways, e.g. the TCA cycle.
Daniel Segrè says to start by putting all reactions on a map in arbitrary or algorithm-decided locations. It may be ugly, but it will give us something to start with and collaborate on.
One thing that makes randomly placing reactions harder than I thought it would be is the additional types of nodes: midmarkers and multimarkers. The multimarker nodes represent a common internal node for multiple reactants or products, and the midmarker nodes are the midpoint of the reaction and serve as an anchor for the reaction label.
Reactions are represented explicitly, and each reaction can contain several segments, so there are 6 valid segment classes:
(Got this information from: https://gist.github.com/djinnome/aead21ef178fc0fd691d5fa71ad5ddb1#file-1-converting-cytoscape-json-to-escher-json-ipynb)
So I came up with this pseudocode:
# For a reaction:
#   Are there multiple reactants?
#     If there is only one reactant, can go straight from metabolite to midmarker
#     If there are multiple reactants, need to go from metabolite to multimarker to midmarker
#   Are there multiple products?
#     If there is only one product, can go straight from midmarker to metabolite
#     If there are multiple products, need to go from midmarker to multimarker to metabolite
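The pseudocode above can be sketched in Python roughly as follows. The function name and the marker naming scheme (`_mid`, `_multi_in`, `_multi_out`) are placeholders I made up; it only produces (from, to) pairs and does not assign coordinates or write Escher JSON.

```python
def reaction_segments(rxn_id, reactants, products):
    """Return the (from_node, to_node) pairs needed to draw one reaction."""
    midmarker = f"{rxn_id}_mid"
    segments = []

    if len(reactants) == 1:
        # Single reactant: metabolite -> midmarker
        segments.append((reactants[0], midmarker))
    elif len(reactants) > 1:
        # Several reactants: metabolite -> multimarker -> midmarker
        multimarker = f"{rxn_id}_multi_in"
        segments.extend((met, multimarker) for met in reactants)
        segments.append((multimarker, midmarker))

    if len(products) == 1:
        # Single product: midmarker -> metabolite
        segments.append((midmarker, products[0]))
    elif len(products) > 1:
        # Several products: midmarker -> multimarker -> metabolite
        multimarker = f"{rxn_id}_multi_out"
        segments.append((midmarker, multimarker))
        segments.extend((multimarker, met) for met in products)

    # An empty side (as in exchange reactions) simply adds no segments.
    return segments
```

For example, `reaction_segments("R1", ["a", "b"], ["c"])` routes both reactants through one multimarker and connects the single product directly to the midmarker.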
So I'm thinking I'll just put each reaction and all of its metabolites on the map but not connect any of the reactions (i.e. each reaction is its own minigraph).
But will that really speed anything up? If nothing is connected, what's the point of having anything to start with?
Instead of using fully arbitrary locations and plotting each reaction individually, I could use Daniel's code, which does algorithmic placement with some duplication of highly connected metabolites: https://colab.research.google.com/drive/1lM2UE2UzR5NmmvzO5K0TUGgXKog8aWlC?usp=sharing#scrollTo=HgQB_1TKXiIf
I gave up on trying to figure out the logic of where to put the nodes. Instead, I built a map by point-and-click, adding the reactions individually in the order they pop up, and got this:
I will pause on the map layout here and work on automatically adding human-readable labels for the reactions/metabolites.
I was able to add the labels easily enough, but I'm unsure how much of a help they are given their size and the redundancy with the hover function:
I did notice that the code doesn't seem to handle the labels for metabolites on the left well. I think the xy coordinates were for the node, not the label. For example, look how far the cpd00050_c0 node is from the FMN text label.
Changing to label_x and label_y definitely helped, but now the offsets are off:
I removed the x offset and increased the y offset slightly, and everything looks good, on both the right and the left.
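Roughly, the labeling step looks like the sketch below. It assumes the usual Escher map JSON layout (a two-element list whose second element holds `nodes` and `text_labels`, with `label_x`/`label_y` on metabolite nodes). The function name, the `names` lookup, and the default offset value are placeholders, not the values I actually used.

```python
def add_metabolite_name_labels(escher_map, names, x_offset=0.0, y_offset=20.0):
    """Add a text label with the human-readable name next to each metabolite ID label."""
    _header, body = escher_map  # assumed layout: [header, body]
    labels = body.setdefault("text_labels", {})
    # Continue numbering after the largest existing numeric label key.
    next_id = 1 + max((int(k) for k in labels if str(k).isdigit()), default=-1)

    for node in body["nodes"].values():
        if node.get("node_type") != "metabolite":
            continue  # skip midmarkers and multimarkers
        name = names.get(node["bigg_id"])
        if name is None:
            continue
        labels[str(next_id)] = {
            # Anchor on the ID label position, not on the node position.
            "x": node["label_x"] + x_offset,
            "y": node["label_y"] + y_offset,
            "text": name,
        }
        next_id += 1
    return escher_map
```

Anchoring on `label_x`/`label_y` means the same offset works whether the ID label sits to the left or to the right of the node.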
Daniel said to add the human-readable labels to the ALT subset of the E. coli model, and asked if I can run Escher-FBA on that model/map.
I added the labels (the map still has reactions not present in the model) and it's a total mess.
I was able to run Escher-FBA through the web interface and see the fluxes, though setting limits on the reactions is hard.
Daniel says to make it an option to add only the metabolite names, without the reaction names.
That did make it a lot less messy:
Daniel's other ideas were:
DSher wants amino acids on any map I make.
Osnat has a map of TCA to different amino acid degradation pathways to show how AAs feed into the TCA, made using BioCyc pathway collages. Emma made a very pretty version for a manuscript. It shows that different AAs go into different parts of the cycle. BioCyc predicts that almost all Alteromonas species can use almost all AAs, but not all genes are actually present. We have experimental data that shows growth on mixtures of AAs, but not on individual AAs.
We want an Escher map (or multiple) to help with curation of the model.