Status: closed by niraito 2 years ago
Hello @niraito,
To include dead-end metabolites, simply omit the --remove_dead_end option from the command-line call. For the example you provided, that would be:
python -m rptools.rpextractsink iJO1366.xml iJO1366_rmdeadends_false.csv
Regarding the second part of your question, the sink extraction (as well as the dead-end detection) is independent of MetaNetX. Basically, the sink extraction relies only on the SBML model: it lists the metabolites belonging to a given compartment (by default the 'c' compartment, which corresponds to the cytosol in E. coli). For the dead-end detection, Flux Balance Analysis is used.
The MetaNetX deprecated relationships do not affect the pathway predictions, in the sense that, for a given set of reaction rules and a given sink, the predicted reactions and pathways remain the same as 3 years ago (this is because we work with chemical structures, not IDs).
However, it's true that a lot of deprecation warnings now pop up when browsing the MNX IDs, which makes it more difficult to explore the results... (Releasing a new dataset is planned for the future.)
Regards, Thomas
Hello again, dear @tduigou,
Thank you for helping me with the --remove_dead_end argument :)
However, I still have another question:
python -m rptools.rpextractsink iJO1366.xml iJO1366_sink_20221018.csv --compartment_id c
# number of cytosolic compounds: 808 (without header)
wc -l iJO1366_sink_20221018.csv
809 iJO1366_sink_20221018.csv
python -m rptools.rpextractsink iJO1366.xml iJO1366_sink_20221018_rmde.csv --compartment_id c --remove_dead_end
# number of cytosolic compounds, dead ends removed: 705 (without header)
wc -l iJO1366_sink_20221018_rmde.csv
706 iJO1366_sink_20221018_rmde.csv
From the SBML file, I can manually extract the cytosolic compounds (by MNXM IDs, not by their chemical structures):
# to extract list of species section:
cat iJO1366.xml | sed '/listOfSpecies/,$!d' > iJO1366_listOfSpecies_sed_intro.tmp
tac iJO1366_listOfSpecies_sed_intro.tmp | sed '/\/listOfSpecies/,$!d' | tac > iJO1366_listOfSpecies_sed_outro.tmp
cat iJO1366_listOfSpecies_sed_outro.tmp | grep 'MNXM' | sort -u | wc -l
1136
There are 1136 unique compounds or MNXM ids.
cat iJO1366_listOfSpecies_sed_outro.tmp | grep 'compartment="c"' | wc -l
1039
1039 compounds (or MNXM ids) out of 1136 are cytosolic.
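For anyone who wants to repeat this count without sed and grep, here is a minimal Python sketch of the same idea. It only assumes that the species are stored as species elements carrying id and compartment attributes; the function name and the tiny inline model (with made-up ids) are illustrative and are not part of rptools.

```python
import xml.etree.ElementTree as ET


def species_in_compartment(sbml_text, compartment="c"):
    """Return the ids of <species> elements assigned to `compartment`."""
    root = ET.fromstring(sbml_text)
    ids = []
    for elem in root.iter():
        # Drop the XML namespace prefix so the check works across SBML versions.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "species" and elem.get("compartment") == compartment:
            ids.append(elem.get("id"))
    return ids


# Tiny inline model with hypothetical ids, standing in for iJO1366.xml.
SAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core">
  <model>
    <listOfSpecies>
      <species id="M_atp_c" compartment="c"/>
      <species id="M_glc_e" compartment="e"/>
      <species id="M_pyr_c" compartment="c"/>
    </listOfSpecies>
  </model>
</sbml>"""

print(species_in_compartment(SAMPLE))  # ['M_atp_c', 'M_pyr_c']
```

For a real model, the text could be read from the file first (for example with open("iJO1366.xml").read()) and the length of the returned list compared with the grep count.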
But the sink contains only 808 cytosolic compounds (or MNXM IDs), or 705 when dead ends are removed. Could you help me understand why the numbers differ?
Thank you for your time and patience!
Best regards, Nilay
Hi @niraito,
The reason why there are fewer compounds is that only compounds having an InChI structure are written to the sink file.
The InChI is assigned by looking into the MNX database, using the MNX ID as the query. (Here, I realized I was mistaken when I said that the sink extraction was independent of the MNX database.) The MNX database is actually used to look up this MNX ID to InChI relationship.
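To make the filtering step concrete, here is a rough Python sketch of the idea, not the actual rptools code. It assumes a tab-separated table mapping compound IDs to InChI strings; the column positions, the function names, and the CPD_* IDs are all hypothetical and would have to be adapted to the real MNX file.

```python
def load_inchi_map(tsv_lines, id_col=0, inchi_col=1):
    """Build {compound ID: InChI} from tab-separated lines, skipping '#' comments."""
    mapping = {}
    for line in tsv_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(id_col, inchi_col):
            continue  # malformed or truncated row
        inchi = fields[inchi_col]
        if inchi.startswith("InChI="):
            mapping[fields[id_col]] = inchi
    return mapping


def split_by_structure(compound_ids, inchi_map):
    """Separate IDs that have an InChI (kept in the sink) from those that do not."""
    kept = [cid for cid in compound_ids if cid in inchi_map]
    dropped = [cid for cid in compound_ids if cid not in inchi_map]
    return kept, dropped


# Hypothetical table: CPD_2 has no structure, CPD_4 is absent altogether.
TABLE = [
    "#ID\tInChI\n",
    "CPD_1\tInChI=1S/H2O/h1H2\n",
    "CPD_2\t\n",
    "CPD_3\tInChI=1S/CH4/h1H4\n",
]

inchi_map = load_inchi_map(TABLE)
kept, dropped = split_by_structure(["CPD_1", "CPD_2", "CPD_3", "CPD_4"], inchi_map)
print(kept)     # ['CPD_1', 'CPD_3']
print(dropped)  # ['CPD_2', 'CPD_4']
```

Running something like this on the IDs taken from the SBML file should show which compounds are missing from the sink because they have no structure.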
Best wishes, Thomas
Hello again, dear @tduigou,
I checked the InChI columns of both sinks: the one extracted by rptools and the one that I retrieved from the SBML file (all MNXM IDs). I saw that some of the MNXM compounds that I retrieved from the SBML file do not have InChI structures, while all compounds in the sink extracted by rptools have InChI structures. Thank you for the explanation.
Also, I would like to use the output scopes of RetroPath2.0 in further steps of my research. What would you suggest doing about the deprecated compounds for now?
Thank you for your time and patience!
Best regards, Nilay
Hi @niraito,
I would say it all depends on what will be done after RP2. Of course, if only the chemical structure provided by the InChI matters, not the MNX IDs, then the deprecated status is not important. If the MNX IDs are needed to establish cross-links with other databases, then working with the chem_xref files provided by MNX might be useful to find the links between deprecated and current IDs.
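As a starting point, following deprecation links could look like the sketch below. It assumes you have already built a dictionary from each deprecated ID to its replacement ID(s) out of the MNX files; how that dictionary is built depends on the file layout of your release and is not shown. The function name and the OLD_/MID_/NEW_ IDs are purely illustrative.

```python
def resolve_current_ids(compound_id, deprecations):
    """Follow deprecated -> replacement links and return the IDs with no further replacement.

    `deprecations` maps an ID to a list of replacement IDs, so merges
    (many-to-one) and splits (one-to-many) are both handled.
    """
    current, seen, stack = set(), set(), [compound_id]
    while stack:
        node = stack.pop()
        if node in seen:  # already visited; also guards against cycles
            continue
        seen.add(node)
        successors = deprecations.get(node, [])
        if successors:
            stack.extend(successors)
        else:
            current.add(node)
    return current


# Hypothetical mapping covering a chain, a merge, and a split.
DEPRECATIONS = {
    "OLD_1": ["MID_1"],           # chain: OLD_1 -> MID_1 -> NEW_1
    "MID_1": ["NEW_1"],
    "OLD_2": ["NEW_1"],           # merge: OLD_1 and OLD_2 both end at NEW_1
    "OLD_3": ["NEW_2", "NEW_3"],  # split: OLD_3 becomes two current IDs
}

print(resolve_current_ids("OLD_1", DEPRECATIONS))          # {'NEW_1'}
print(sorted(resolve_current_ids("OLD_3", DEPRECATIONS)))  # ['NEW_2', 'NEW_3']
```

Returning a set rather than a single ID keeps splits visible, so you can decide case by case which current ID to use for the cross-links.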
Best wishes, Thomas
Dear @tduigou,
Thank you for your time, patience, and guidance! I will need the crosslink information. Therefore, I will work on the chem_xref and other files from MNX.
Best regards, Nilay
Hello,
I was trying to extract the sink from different SBML models. When I compared the sink extracted by rptools with the compounds with MNXM IDs that I extracted from the model manually, I noticed that some cytosolic compounds are not included in the sink by rptools. This might be because there is no flux for these compounds, and the --remove_dead_end option is True by default. However, I could not make it work by assigning it any form of false. The response was always this, depending on the "false" argument:
How can I try it without removing dead ends?
I have another related question: I thought the compounds that are not included in the sink might be deprecated. RetroRules uses MNXref v3.0. The current release of MetaNetX is MNXref 4.4. Does rptools also use MNXref v3.0? There are many deprecations with each new release. When I go to the MetaNetX page of a compound in a pathway resulting from RetroPath2.0, it is highly probable that the compound is deprecated. I thought that I could update the compounds to their newest MNXM IDs, but the deprecations in each new release create chains of IDs for the same compound. The mappings are "one-to-one, many-to-one (merge) and one-to-many (split)" as described in the Search/Download MNXref namespace page, and also can be seen below.
Sometimes I observe that a compound is deprecated into something very different (in my opinion). I emailed MetaNetX about this issue. But what do you think about the deprecations, and how do they affect the pathway predictions? Are they related to the reason that some compounds are not included in the sink produced by rptools.extractsink?