Closed ChristianLieven closed 7 years ago
That is a good point and one that pops up every once in a while for discussion. There is some ongoing discussion about the meaning of the SBML spec regarding the notes field. SBML only says:
It is intended to serve as a place for storing optional information intended to be seen by humans.
and, comparing it to annotation:
Whereas Notes is a container for content to be shown directly to humans, Annotation is a container for optional software-generated content not meant to be shown to humans.
The interpretation of the cobrapy maintainers in the past was that, since notes should not be "consumed by a machine", they would not be written or read by cobrapy except for supporting the SBML 2 cobra annotations. The argument was that all annotation should go into the annotation tag as described in the spec. For the particular use case of DOIs, annotation is the recommended solution. There is a MIRIAM tag for DOIs, so you can just use that. For instance, the following is valid SBML and would be read into model.metabolites.h_c.annotation in cobrapy:
<annotation>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:dcterms="http://purl.org/dc/terms/"
           xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#"
           xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
           xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
    <rdf:Description rdf:about="#M_h_c">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00080"/>
          <rdf:li rdf:resource="http://identifiers.org/doi/10.1038/nbt1156"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
  </rdf:RDF>
</annotation>
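To make the mapping from this RDF block to an annotation dictionary concrete, here is a minimal sketch using only the Python standard library. It is not cobrapy's actual parser: the function name `parse_miriam` and the choice to return a dictionary of lists keyed by the identifiers.org collection name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
IDENTIFIERS_PREFIX = "http://identifiers.org/"

# The annotation from the comment above, trimmed to the namespaces it uses.
ANNOTATION = """
<annotation>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
    <rdf:Description rdf:about="#M_h_c">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00080"/>
          <rdf:li rdf:resource="http://identifiers.org/doi/10.1038/nbt1156"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
  </rdf:RDF>
</annotation>
"""


def parse_miriam(annotation_xml):
    """Hypothetical helper: collect identifiers.org URIs by collection name."""
    root = ET.fromstring(annotation_xml.strip())
    result = {}
    for li in root.iter("{%s}li" % RDF_NS):
        uri = li.get("{%s}resource" % RDF_NS)
        if uri is None or not uri.startswith(IDENTIFIERS_PREFIX):
            continue
        # Split only on the first slash, because DOIs contain slashes themselves.
        collection, _, identifier = uri[len(IDENTIFIERS_PREFIX):].partition("/")
        result.setdefault(collection, []).append(identifier)
    return result


annotation = parse_miriam(ANNOTATION)
print(annotation)
```

The DOI ends up under the `doi` key next to the KEGG identifier, which is the kind of structure one would expect to find in the metabolite's annotation attribute.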
However, that only works for direct annotations and not for adding data. For instance, if I want to add some other quantity to a species or reaction (confidence scores, charge in various conditions, etc.), there is no way to do that with annotations. This is a shortcoming of SBML, IMHO. So I would be in favour of reading and writing the notes field. It could be just raw text, or it could be a dictionary that is read from and written to <ul> tags as you specified, and written into a <p> tag if it's just a string. But that would depend on how others interpret the SBML spec here.
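As a rough illustration of that idea, the sketch below writes a plain string into a <p> tag and a flat dictionary into a <ul> list inside an SBML notes element. Nothing here is existing cobrapy behaviour: the function name `notes_to_xml` and the "key: value" layout of each list item are assumptions, and nested values are simply converted with `str`.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def notes_to_xml(notes):
    """Hypothetical serializer: str -> <p>, dict -> <ul> with one <li> per key."""
    notes_el = ET.Element("notes")
    # SBML expects the notes content to be XHTML, hence the namespaced body.
    body = ET.SubElement(notes_el, "body", {"xmlns": XHTML_NS})
    if isinstance(notes, str):
        ET.SubElement(body, "p").text = notes
    else:
        ul = ET.SubElement(body, "ul")
        for key, value in notes.items():
            # Assumed convention: "key: value" as the text of each list item.
            ET.SubElement(ul, "li").text = "%s: %s" % (key, value)
    return ET.tostring(notes_el, encoding="unicode")


print(notes_to_xml("This is a TEST"))
print(notes_to_xml({"confidence_score": 4}))
```

Whether a reader should try to parse such list items back into a dictionary is exactly the interpretation question raised above, so this only covers the writing side.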
We are not aware of any existing schema or documentation of the annotation tags used in cobra. Our suggestion is to create a new repository under the opencobra organization. That way, any member of the opencobra community (most importantly of the Matlab COBRA Toolbox) can feel free to contribute to the schema, there can be versioned releases of the schema, and for the time being it can be hosted on https://opencobra.github.io/annotations/schema or whatever is decided for the name and URL.
We would then implement in cobrapy whatever is dictated by the schema and there's a chance for other tools in the opencobra community to do the same.
Well, there is, of course, another way of storing confidence scores for reactions in a standard-compliant form. You could use Parameter objects for this. These are objects in the listOfParameters directly within the model and have an id, an optional name, and a value. In their id you could include, as a prefix, the id of the reaction that the confidence score refers to. However, this would again not be the best solution for storing that sort of information, because it is not obvious what these parameters are.
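For illustration, the Parameter approach could look roughly like the sketch below, which builds a listOfParameters element with the standard library and reads the scores back. The id convention (`confidence_` followed by the reaction id) and both function names are assumptions; the example also shows the drawback mentioned above, since only the naming convention tells a reader what the parameters mean.

```python
import xml.etree.ElementTree as ET

PREFIX = "confidence_"  # assumed naming convention, not part of SBML


def confidence_parameters(scores, prefix=PREFIX):
    """Hypothetical writer: one <parameter> per reaction confidence score."""
    list_el = ET.Element("listOfParameters")
    for reaction_id, score in sorted(scores.items()):
        ET.SubElement(list_el, "parameter", {
            "id": prefix + reaction_id,
            "name": "confidence score for " + reaction_id,
            "value": str(score),
            "constant": "true",
        })
    return list_el


def read_confidence(list_el, prefix=PREFIX):
    """Hypothetical reader: recover {reaction_id: score} from the prefixed ids."""
    scores = {}
    for parameter in list_el.iter("parameter"):
        parameter_id = parameter.get("id", "")
        if parameter_id.startswith(prefix):
            scores[parameter_id[len(prefix):]] = float(parameter.get("value"))
    return scores


parameters = confidence_parameters({"R_EX_dopa_e": 4})
print(ET.tostring(parameters, encoding="unicode"))
print(read_confidence(parameters))
```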
This issue was moved to opencobra/schema#4
Problem description
I am currently reconstructing a metabolic model, for which I am adding confidence scores, comments, and literature references in the notes attribute of reactions, metabolites and genes. The importance of confidence scores and related qualitative annotation parameters is discussed in the publications linked above.
I tried importing simple notes by adding the following notes field to the RECON1 model from BiGG:
This is a TEST
I am wondering if COBRApy is able to import this.
I was quite surprised that the RECON1 model did not contain the confidence scores on which some of the results of this research are based.
I was not able to find the keywords 'confidence', 'score' or 'confidence_score' in either cobra.io.sbml or cobra.io.sbml3. If I read it correctly, the legacy import looks specifically for charge, GPR, and subsystem in the notes field but doesn't account for the confidence score.
Code Sample
You can find my modified example SBML3+FBC RECON1 file here. The modification is at R_EX_dopa_e.
Discussion
It seems like the community hasn't decided yet what exactly the notes field should contain and how it should be formatted. Personally, I'd find it most useful if there were a clever way of allowing both short, human-readable comment entries and optional, specifically related, machine-readable DOI-styled literature references. In the model object, I suppose this could be a nested dictionary looking something like this:
some_model.reactions.SOME_RXN.notes = {"confidence_score": {"value": 4, "reference": "some_doi"}}
Based on the referenced publications above, another useful key of the notes-field/attribute would be a simple 'comment' option, which would be limited in length (50 chars? 70 chars? 80 chars?).
some_model.metabolites.some_metabolite.notes = {"comment": {"value": "Short string outlining a hypothesis or specific decision for this metabolite", "optional_reference": "some_doi"}}
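A small sketch of how such nested notes could be checked before they are written out is given below. It is only an illustration of the proposal: the function name `validate_notes` is made up, and the 80-character limit is just one of the candidate lengths floated above, not an agreed value.

```python
MAX_COMMENT_LENGTH = 80  # assumption: one of the lengths suggested above


def validate_notes(notes, max_comment_length=MAX_COMMENT_LENGTH):
    """Hypothetical check: return a list of problems found in a notes dict."""
    problems = []
    for key, entry in notes.items():
        # Every entry is expected to be a dict carrying at least a "value".
        if not isinstance(entry, dict) or "value" not in entry:
            problems.append("%s: expected a dict with a 'value' key" % key)
            continue
        if key == "comment" and len(str(entry["value"])) > max_comment_length:
            problems.append(
                "comment: longer than %d characters" % max_comment_length
            )
    return problems


notes = {
    "confidence_score": {"value": 4, "reference": "some_doi"},
    "comment": {"value": "Gap-filled to allow growth", "optional_reference": "some_doi"},
}
print(validate_notes(notes))
```

An empty list means the dictionary follows the assumed layout; anything else names the offending key.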
I don't doubt that there could be a feasible, simple implementation on the Python side of things; however, I am unfamiliar with the options on the XML side, specifically SBML. A notes field according to the SBML specifications is allowed to contain...
...which seem pretty straight-forward, namely the notes field ...
Hence, I think a solution here could be to use <ul> from HTML? What do you think?