Open edkerk opened 6 years ago
Another consideration: if you look at the example from the yeast consensus network, metabolite names are appended with [compartment] in the cobrapy version. This should not be done in this function, as it should represent how the model is presented in MATLAB.
@edkerk sounds good! I will write the writeYaml function. I assume we should continue the work on branch fix/export_functions?
Yups, I'll push devel to it, to ensure latest changes. (done: #80)
A writeYaml function has been added in branch fix/export_functions: 23bb1f737a6f1cf33981d035ae5f33a62e842fdb
@BenjaSanchez a few points:
[x] Note where publications are stored: PubMed IDs can be stored in rxnMiriams (as pmid), while non-PubMed IDs can be stored in rxnReferences. Neither should be stored in rxnNotes; this field is for generic notes.
[x] Note that not all fields are compulsory: if a model has the rxnMiriams field, the function requires that it also has ec-codes, rxnKEGG and rxnNotes. Line 110, if ~isempty(model.eccodes{pos}) || ~isempty(model.rxnKEGG{pos}) || ~isempty(model.rxnNotes{pos}), fails if one of them doesn't exist.
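The second point can be illustrated with a small Python sketch (not the RAVEN code itself, which is MATLAB, where isfield would play the same role). The dictionary-based model and the helper name has_annotation are assumptions made for illustration only. The check first confirms that an optional field exists before looking at its entry, so a model without eccodes or rxnKEGG does not raise an error.

```python
def has_annotation(model, pos, fields=("eccodes", "rxnKEGG", "rxnNotes")):
    """Return True if any optional field exists and is non-empty at index pos.

    `model` is a plain dict standing in for the model structure (assumption).
    """
    for name in fields:
        values = model.get(name)  # None when the field is absent
        if values is not None and pos < len(values) and values[pos]:
            return True
    return False


# A model that has rxnMiriams and rxnNotes, but neither eccodes nor rxnKEGG.
model = {
    "rxns": ["r1", "r2"],
    "rxnMiriams": [None, None],
    "rxnNotes": ["", "checked"],
}
print(has_annotation(model, 0))  # False
print(has_annotation(model, 1))  # True
```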
@edkerk RE the first point, this is because I was using a COBRA structure, which does not support rxnMiriams; it can only store pmids in either rxnReferences or rxnNotes (see here). At the moment they are stored in rxnNotes, which gets transferred to the same field by ravenCobraWrapper. Should I then move those fields in the yeast model to rxnReferences? In the case of a generic RAVEN model, there will not be any issue once we implement the generic extractMiriam we discussed in Gitter.
RE the second point, that I can also address once we have the improved extractMiriam.
Looking at the specification of the Cobra model that you linked, they should indeed also be in rxnReferences, for both Cobra and Raven. Also, it specifies there that rxnReferences is a "Column Cell Array of Strings" "of references for each corresponding reaction.", so not necessarily PubMed IDs.
You're right, an improved extractMiriam will probably change the code a bit anyway.
Irrespective, PubMed IDs don't need to be prefixed with pmid:. This is done for ChEBI because the prefix really is part of the identifier (here, CHEBI:17234), not because it would otherwise be number-only. Also compare, for instance, ChEBI and KEGG compound on identifiers.org, and pay attention to the identifier pattern.
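To make the identifier-pattern point concrete, here is a rough Python sketch. The regular expressions are written from memory of the identifiers.org entries and should be treated as assumptions to verify against the registry; the helper name is_valid is hypothetical.

```python
import re

# Patterns as assumed from identifiers.org; verify against the registry.
PATTERNS = {
    "pubmed": re.compile(r"\d+"),          # number only, no prefix
    "chebi": re.compile(r"CHEBI:\d+"),     # prefix is part of the identifier
    "kegg.compound": re.compile(r"C\d+"),
}


def is_valid(namespace, identifier):
    """Return True if the whole identifier matches the namespace pattern."""
    return PATTERNS[namespace].fullmatch(identifier) is not None


print(is_valid("pubmed", "1234"))        # True
print(is_valid("pubmed", "pmid:1234"))   # False
print(is_valid("chebi", "CHEBI:17234"))  # True
print(is_valid("chebi", "17234"))        # False
```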
Actually, now testing with COBRA, not even rxnReferences can be used to store things, as after an I/O cycle everything there gets transferred to rxnNotes. So pmids will continue to be stored in the yeast model in rxnNotes with the format pmid:XXX; pmid:YYY, and I will include in ravenCobraWrapper a section for detecting these cases and sending them to rxnMiriams.
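A minimal Python sketch of the detection step described above, assuming notes are semicolon-separated and PubMed entries look like pmid:XXX. The function name split_notes is hypothetical and this is not the actual ravenCobraWrapper code; it only shows how such tokens could be separated from the remaining free-text note.

```python
import re

# Matches a single "pmid:1234" token (case-insensitive); assumed format.
PMID_TOKEN = re.compile(r"pmid:(\d+)", re.IGNORECASE)


def split_notes(note):
    """Split a 'pmid:XXX; pmid:YYY' style note into (pmids, remaining_note)."""
    pmids, rest = [], []
    for token in note.split(";"):
        token = token.strip()
        if not token:
            continue
        match = PMID_TOKEN.fullmatch(token)
        if match:
            pmids.append(match.group(1))  # keep the bare number only
        else:
            rest.append(token)            # anything else stays a note
    return pmids, "; ".join(rest)


print(split_notes("pmid:123; pmid:456"))
# (['123', '456'], '')
print(split_notes("pmid:123; manually curated"))
# (['123'], 'manually curated')
```

The extracted numbers could then go to rxnMiriams, while the second element stays in rxnNotes.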
Perhaps then also start an issue at Cobra, because they do state that rxnReferences and rxnNotes are separate fields?
@BenjaSanchez
> actually now testing with COBRA not even rxnReferences can be used to store things, as after a I/O cycle everything there gets transfered to rxnNotes. So pmid's will continue to be stored in the yeast model in rxnNotes with the format pmid:XXX; pmid:YYY, and I will include in ravenCobraWrapper a section for detecting these cases and sending them to rxnMiriams
Could you give me an example for this in the COBRA toolbox?
In general: PMIDs should (imo) be added via MIRIAM annotations. PubMed is listed on registry.org and is parsed by the COBRA toolbox SBML parser into the rxnReferences field if it is correctly annotated (i.e. using the isDescribedBy bqbiol qualifier).
We try to put things that are invalid IDs into notes. As mentioned above, PMID1234 is not a valid PubMed ID, while 1234 is, so if an invalid PMID is provided, it is likely to go into the notes field, but not into rxnReferences, during I/O cycles.
As far as I know, the YAML formats of RAVEN and cobrapy are not identical.
RAVEN should support writing a cobrapy-compatible yaml format, as this format is very concise.
writeYaml function that writes a text file, and parses through the model structure.
exportForGit to support new writeYaml function.
A few considerations:
For each metabolite, include:
mets
metNames
metComps
inchis
metFormulas
metMiriams (any)
metCharges
unconstrained
rxnFrom (?)
For each reaction, include:
rxns
rxnNames
rxnComps
grRules
subSystems
eccodes
rxnMiriams (any)
rxnNotes
rxnConfidenceScores
For each compartment, include:
comps
compNames
compOutside
compMiriams
For each gene, include:
genes
geneComps
geneMiriams
geneShortNames
But of course only write those fields if they are present in the model.
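The "only write fields that are present" rule could look roughly like the Python sketch below. It is an illustration of the logic only: the dict-based model, the function name reaction_entries, and the output layout are assumptions, and the layout is not claimed to match the cobrapy YAML schema. Fields missing from the model are skipped entirely, and empty entries are skipped per reaction. Values are quoted with json.dumps, since JSON strings are valid YAML double-quoted scalars.

```python
import json

# Subset of the reaction fields listed above, in output order.
REACTION_FIELDS = ["rxns", "rxnNames", "grRules", "eccodes", "rxnNotes"]


def reaction_entries(model):
    """Build YAML-like text for reactions, writing only present fields.

    Assumes `model` is a dict and that "rxns" exists and has no empty entries.
    """
    present = [f for f in REACTION_FIELDS if f in model]
    lines = ["reactions:"]
    for i in range(len(model["rxns"])):
        first = True
        for field in present:
            value = model[field][i]
            if value in ("", None):
                continue  # skip empty entries for this reaction
            prefix = "  - " if first else "    "
            lines.append(f"{prefix}{field}: {json.dumps(value)}")
            first = False
    return "\n".join(lines)


model = {
    "rxns": ["r_0001", "r_0002"],
    "rxnNames": ["hexokinase", ""],
    "eccodes": ["2.7.1.1", ""],
}
print(reaction_entries(model))
# reactions:
#   - rxns: "r_0001"
#     rxnNames: "hexokinase"
#     eccodes: "2.7.1.1"
#   - rxns: "r_0002"
```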