sbmlteam / sbml-specifications

The specification documents for SBML.

linking rdf:Bags #130

Closed sbmlsecretary closed 2 years ago

sbmlsecretary commented 16 years ago

I would like to slightly modify the annotation framework described in the spec, to allow rdf:ID and rdf:about attributes on the rdf:Bag element. This is proper RDF, and it will provide a mechanism to refine one annotation with another. Examples of things much asked for by the community are:

* describing protein modifications on species
* attaching evidence codes

The following says that the species is a phosphorylated form of the protein:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:isVersionOf>
      <rdf:Bag rdf:ID="toto">
        <rdf:li rdf:resource="urn:miriam:uniprot:P04551"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
    <bqbiol:isVersionOf>
      <rdf:Bag rdf:about="#toto">
        <rdf:li rdf:resource="urn:miriam:obo.mod:MOD%3A00047"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
</rdf:RDF>

The following says that the species has been located in the lysosome, by a cell fractionation assay described in a publication.

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag rdf:ID="toto">
        <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A0005764"/>
      </rdf:Bag>
    </bqbiol:occursIn>
    <bqbiol:isDescribedBy>
      <rdf:Bag rdf:about="#toto">
        <rdf:li rdf:resource="urn:miriam:pubmed:1111111"/>
      </rdf:Bag>
    </bqbiol:isDescribedBy>
    <bqbiol:isDescribedBy>
      <rdf:Bag rdf:about="#toto">
        <rdf:li rdf:resource="urn:miriam:obo.eco:ECO%3A0000004"/>
      </rdf:Bag>
    </bqbiol:isDescribedBy>
  </rdf:Description>
</rdf:RDF>

Reported by: lenov

Original Ticket: "sbml/sbml-specifications//128":https://sourceforge.net/p/sbml/sbml-specifications//128

sbmlsecretary commented 16 years ago

Overall I think this is the right approach.

However, we decided against correcting any design mistakes made in the RDF specs. We decided against those changes to avoid additional complications in the implementation. If we allow these additional attributes and do not ask for support we effectively create an individual exchange language.

Please note that if we allow the about to refer to other ids, we can annotate any element within an SBML document, not just the parent object of the annotation element. This is far too complicated for L2V4, and we should keep this in mind for the proposed L3 MIRIAM-compliant annotation extension.

On the other hand if we open up the RDF design we should remove the RDF/XML limitations simultaneously.

Original comment by: shoops

sbmlsecretary commented 16 years ago

I am accepting this issue as valid.

Original comment by: shoops

sbmlsecretary commented 16 years ago

> However, we decided against correcting any design mistakes made in the RDF specs.

Yes and no. We decided not to change the global approach to RDF, but we are still fixing bugs, cf issue 1874879. Fixing this issue results in fiddling with attributes already

> If we allow these additional attributes and do not ask for support we effectively create an individual exchange language.

Sorry, I do not understand.

> Please note that if we allow the about to refer to other ids we can annotate any element within an SBML document not just the parent object of the annotation element.

Is it true? I am not sure, because the SBML is not an RDF document. We will test it. We would have to modify the URIs to use absolute ones. But anyway, we can still restrict the scope of the about to the current RDF block. After all, what we do at the moment is way more strict.

Original comment by: lenov

sbmlsecretary commented 16 years ago

> Please note that if we allow the about to refer to other ids we can annotate any element within an SBML document not just the parent object of the annotation element.

Yes, by using Nicolas's way, I guess it is possible to refer to elements outside the RDF block in which they are declared.

But this is highly dependent on how a tool parses an SBML document. I believe that, so far, most of the tools handling annotations only consider them independently (only linked to the component they are a child of) and not as a whole set. In this case, the above modification of the RDF should not influence already existing tools too much: they will definitely not mix annotations from different blocks. But I agree, modifications will be needed to support this.

Moreover, for safety, we can always add a restriction in the specifications.
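For illustration only, the kind of restriction discussed above could be checked along these lines. This is a minimal sketch, not a proposed implementation: the `species` wrapper, the third Bag pointing at `#_000099`, and the function name `out_of_scope_abouts` are all hypothetical, and the rule it encodes (an rdf:about may only point at the parent's metaid or at an rdf:ID declared in the same block) is just one possible reading of "restrict the scope of the about to the current RDF block".

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ID = f"{{{RDF}}}ID"
ABOUT = f"{{{RDF}}}about"

# Hypothetical fragment: the first two Bags follow the proposal in this issue,
# the third one (about="#_000099") is invented to show an out-of-scope target.
SBML_FRAGMENT = """<species metaid="_000004"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
  <annotation>
    <rdf:RDF>
      <rdf:Description rdf:about="#_000004">
        <bqbiol:isVersionOf>
          <rdf:Bag rdf:ID="toto">
            <rdf:li rdf:resource="urn:miriam:uniprot:P04551"/>
          </rdf:Bag>
        </bqbiol:isVersionOf>
        <bqbiol:isVersionOf>
          <rdf:Bag rdf:about="#toto">
            <rdf:li rdf:resource="urn:miriam:obo.mod:MOD%3A00047"/>
          </rdf:Bag>
        </bqbiol:isVersionOf>
        <bqbiol:isVersionOf>
          <rdf:Bag rdf:about="#_000099">
            <rdf:li rdf:resource="urn:miriam:obo.mod:MOD%3A00047"/>
          </rdf:Bag>
        </bqbiol:isVersionOf>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>"""


def out_of_scope_abouts(element):
    """Return rdf:about values that point outside the element's own RDF block."""
    # Allowed targets: the annotated element itself, plus any rdf:ID declared
    # somewhere inside its annotation.
    allowed = {"#" + element.get("metaid", "")}
    allowed.update("#" + e.get(ID) for e in element.iter() if e.get(ID) is not None)
    return [
        e.get(ABOUT)
        for e in element.iter()
        if e.get(ABOUT) is not None and e.get(ABOUT) not in allowed
    ]


species = ET.fromstring(SBML_FRAGMENT)
print(out_of_scope_abouts(species))
```

A tool that only reads annotations per component could run such a check on each annotated element separately, without ever looking at the rest of the document.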

Original comment by: perkeo

sbmlsecretary commented 16 years ago

Original comment by: mhucka

sbmlsecretary commented 16 years ago

I am accepting this issue as valid.

Original comment by: sarahkeating

sbmlsecretary commented 16 years ago

I'm not sure we should make this change.

I can see why it would be useful

BUT ...

About 50% of my libsbml-related queries are about dealing with annotations and how to pull information out. I think allowing each Bag element to potentially be "about" a different object will cause major complications, and since we envisage revisiting RDF for L3, I would vote for not doing this now.

Original comment by: sarahkeating

sbmlsecretary commented 16 years ago

I am accepting this issue as valid.

Original comment by: mhucka

sbmlsecretary commented 16 years ago

Is this crucial for supporting evidence codes? If so, then we must have it. There is already a big modeling effort underway that needs it.

Original comment by: mhucka

sbmlsecretary commented 16 years ago

Original comment by: lenov

sbmlsecretary commented 15 years ago

Graph generated from the first example.

Original comment by: shoops

sbmlsecretary commented 15 years ago

Graph generated from the second example.

Original comment by: shoops

sbmlsecretary commented 15 years ago

File Added: example2.png

Original comment by: shoops

sbmlsecretary commented 15 years ago

Concerning the first example given by Nicolas, what about using this syntax instead:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:isVersionOf>
      <rdf:Bag rdf:ID="toto">
        <rdf:li rdf:resource="urn:miriam:uniprot:P04551"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
  <rdf:Description rdf:about="urn:miriam:uniprot:P04551">
    <bqbiol:isVersionOf>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.mod:MOD%3A00047"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
</rdf:RDF>

The corresponding graph seems correct (according to my understanding of what Nicolas wants to express).

Original comment by: perkeo

sbmlsecretary commented 15 years ago

About Nicolas' second example, I would write it that way:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A0005764"/>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
  <rdf:Description rdf:about="urn:miriam:obo.go:GO%3A0005764">
    <bqbiol:isDescribedBy>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:pubmed:1111111"/>
        <rdf:li rdf:resource="urn:miriam:obo.eco:ECO%3A0000004"/>
      </rdf:Bag>
    </bqbiol:isDescribedBy>
  </rdf:Description>
</rdf:RDF>

What do you think about this syntax?
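To make the linking in this syntax concrete, here is a rough sketch of how a reader might follow it: each resource listed in one Description is looked up as the rdf:about of another. It is an illustration under assumptions, not how libSBML or any existing tool works; the helper names `statements` and `describe` are made up, and only the standard-library `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"

# The second example written with several Description elements.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A0005764"/>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
  <rdf:Description rdf:about="urn:miriam:obo.go:GO%3A0005764">
    <bqbiol:isDescribedBy>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:pubmed:1111111"/>
        <rdf:li rdf:resource="urn:miriam:obo.eco:ECO%3A0000004"/>
      </rdf:Bag>
    </bqbiol:isDescribedBy>
  </rdf:Description>
</rdf:RDF>"""


def statements(root):
    """Yield (subject, qualifier, resource) for every rdf:li of every Description."""
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(ABOUT)
        for qualifier in desc:
            name = qualifier.tag.split("}")[1]  # drop the namespace part
            for li in qualifier.iter(f"{{{RDF}}}li"):
                yield subject, name, li.get(RESOURCE)


# Index the statements by subject so that a resource can be looked up in turn.
by_subject = {}
for subj, qual, res in statements(ET.fromstring(RDF_XML)):
    by_subject.setdefault(subj, []).append((qual, res))


def describe(subject, depth=0, seen=()):
    """List the statements about a subject, then those about each of its resources."""
    if subject in seen:  # guard against circular references
        return []
    lines = []
    for qual, res in by_subject.get(subject, []):
        lines.append("  " * depth + f"{qual} {res}")
        lines.extend(describe(res, depth + 1, seen + (subject,)))
    return lines


print("\n".join(describe("#_000004")))
```

Note that the lookup is by resource URI, so in this reading the publication and the evidence code attach to the GO term itself, wherever it is used, rather than to one particular occursIn statement.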

Original comment by: perkeo

sbmlsecretary commented 15 years ago

The correct syntax for both examples is found below. Please note that this is just one possible notation.

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:isVersionOf>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:uniprot:P04551" rdf:ID="toto"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
  <rdf:Description rdf:about="#toto">
    <bqbiol:isVersionOf>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.mod:MOD%3A00047"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
</rdf:RDF>

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A0005764" rdf:ID="toto"/>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
  <rdf:Description rdf:about="#toto">
    <bqbiol:isDescribedBy>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:pubmed:1111111"/>
        <rdf:li rdf:resource="urn:miriam:obo.eco:ECO%3A0000004"/>
      </rdf:Bag>
    </bqbiol:isDescribedBy>
  </rdf:Description>
</rdf:RDF>

Original comment by: shoops

sbmlsecretary commented 15 years ago

I agree that Stefan gave another possible syntax. However, although it is a valid one, I do not fully understand why, when we already have an identifier for a resource (for example 'urn:miriam:uniprot:P04551' in the first example), it is useful to create a second one ('toto', still in the first example). Moreover, the addition of this identifier actually makes the graph[1] more complicated than the relationships really are! See the attached pictures: graph_example1_Stephan.png (Stefan's syntax) and graph_example1_Camille.png (the syntax I gave previously). Of course, the same applies to the second example. So, as I'm not an RDF expert, I would be happy to understand what the advantages of such a syntax are. Thank you.

[1] generated by the RDF W3C Online Validator: http://www.w3.org/RDF/Validator/

Original comment by: perkeo

sbmlsecretary commented 15 years ago

Ok, I wanted to attach the two graphs, but it seems I can't... Anyway, just copy/paste the piece of RDF to http://www.w3.org/RDF/Validator/, don't forget to select 'triples and graph' in the options and finally click 'Parse RDF'.

Original comment by: perkeo

sbmlsecretary commented 15 years ago

SBML Compliant encoding of example 2:

<rdf:RDF xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.go:GO%3A0005764"/>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
  <rdf:Statement>
    <bqbiol:isDescribedBy>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.eco:ECO%3A0000004"/>
        <rdf:li rdf:resource="urn:miriam:pubmed:1111111"/>
      </rdf:Bag>
    </bqbiol:isDescribedBy>
    <rdf:object rdf:resource="urn:miriam:obo.go:GO%3A0005764"/>
    <rdf:predicate rdf:resource="http://biomodels.net/biology-qualifiers/occursIn"/>
    <rdf:subject rdf:resource="#_000004"/>
  </rdf:Statement>
</rdf:RDF>

Original comment by: shoops

sbmlsecretary commented 15 years ago

SBML compliant encoding of example 1:

<rdf:RDF xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:isVersionOf>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:uniprot:P04551"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
  </rdf:Description>
  <rdf:Statement>
    <bqbiol:isVersionOf>
      <rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.mod:MOD%3A00047"/>
      </rdf:Bag>
    </bqbiol:isVersionOf>
    <rdf:object rdf:resource="urn:miriam:uniprot:P04551"/>
    <rdf:predicate rdf:resource="http://biomodels.net/biology-qualifiers/isVersionOf"/>
    <rdf:subject rdf:resource="#_000004"/>
  </rdf:Statement>
</rdf:RDF>

Original comment by: shoops

sbmlsecretary commented 15 years ago

I agree with the proposed change and that it should be done.

Original comment by: shoops

sbmlsecretary commented 14 years ago

This Tracker item was closed automatically by the system. It was previously set to a Pending status, and the original submitter did not respond within 730 days (the time period specified by the administrator of this Tracker).

Original comment by: sf-robot

sbmlsecretary commented 14 years ago

Original comment by: sf-robot

sbmlsecretary commented 14 years ago

Reopening the item. Screw the damn bot.

Original comment by: mhucka

sbmlsecretary commented 14 years ago

Original comment by: mhucka

sbmlsecretary commented 14 years ago

Original comment by: fbergmann

sbmlsecretary commented 14 years ago

Re-opened; Nicolas just made me aware of this today.

Original comment by: fbergmann

sbmlsecretary commented 13 years ago

Original comment by: mhucka

sbmlsecretary commented 12 years ago

Original comment by: luciansmith

sbmlsecretary commented 12 years ago

Original comment by: mhucka

sbmlsecretary commented 12 years ago

This fix needs to be made part of L3v2.

I've added it to the list of known errata, with an explanation.

Original comment by: mhucka

sbmlsecretary commented 11 years ago

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

Original comment by: mhucka

sbmlsecretary commented 10 years ago

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

Updating this to also include L2v4 so we remember (finally) to make the change for L2v5.

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

Hello,

So... things are not as simple as they seemed. I went through all the proposals (Camille, Stephan etc.) and generated the graphs. I also explored all the alternatives I could find, including reification of statements and the use of RDF attributes that are now deprecated (such as BagID).

If we want to link independent triples through URIs, Camille's solution is not only the most elegant, it also provides the right graph. All the other solutions provide misleading graphs, i.e. with wrong statements. However, Camille's solution (as well as some of the other proposals) relies on using several Description elements, and SBML Level 2 requires that the controlled annotation lie in the first Description element.

If we want to use only one Description element, the only way to get the proper graph is to nest the statements. By that I mean putting a BioModels qualifier and its content inside the Bag of another one. The advantage is that we do not need to introduce any other element or attribute. Also, all previous annotations remain valid.

In summary, we have 3 options:

OPTION 1: We allow nested statements. The result is valid RDF, with the proper graph. We do not need any additional element or attribute. We still use only the first Description element for controlled annotation.

OPTION 2: We allow the use of several Description elements, and use them as described by Camille. The result is valid RDF, with the proper graph. We do not need any additional element or attribute. We do not restrict the controlled annotation to the first Description element anymore. However, we could say that if there are controlled annotations, the first Description element must be part of it (and then RDF finds the others)

OPTION 3: We add the rdf:ID attribute on a Bag as planned initially. The result is valid RDF, but the RDF graph is rubbish. The meaning is described in the specification (as we do for the alternative Bags at the moment). We still use only the first Description element for controlled annotation.

I am very very partial to OPTION 1.

Original comment by: lenov

sbmlsecretary commented 10 years ago

OPTION 1:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/" xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="http://identifiers.org/go/GO:0005764"/>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/pubmed/1111111"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/eco/ECO:0000004"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
</rdf:RDF>

Triples of the Data Model (Subject, Predicate, Object):

http://www.w3.org/RDF/Validator/run/1399563933735#_000004
http://biomodels.net/biology-qualifiers/occursIn
genid:A920510

genid:A920510
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag

genid:A920510
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
http://identifiers.org/go/GO:0005764

genid:A920510
http://biomodels.net/biology-qualifiers/isDescribedBy
genid:A920511

genid:A920511
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag

genid:A920511
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
http://identifiers.org/pubmed/1111111

genid:A920510
http://biomodels.net/biology-qualifiers/isDescribedBy
genid:A920512

genid:A920512
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag

genid:A920512
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
http://identifiers.org/eco/ECO:0000004

Meaning:

SBML _000004 occursIn bag A920510 which contains go/GO:0005764

bag A920510 (proxy for the statement above) isDescribedBy bag A920511 which contains pubmed/1111111

bag A920510 (proxy for the statement above) isDescribedBy bag A920512 which contains eco/ECO:0000004

In SBML:

SBML _000004 occursIn compartment go/GO:0005764 as described by the paper pubmed/1111111 and evidence of type eco/ECO:0000004
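As a rough illustration of how the nested form could be read by a tool, the sketch below walks the Option 1 example recursively: the rdf:li children of a Bag are its resources, and any other child of the Bag is treated as a nested qualifier. The function name `read_qualifier` and the dictionary layout are assumptions made for this example only, using the standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BAG = f"{{{RDF}}}Bag"
LI = f"{{{RDF}}}li"
RESOURCE = f"{{{RDF}}}resource"

# The OPTION 1 example from this comment, without the XML declaration.
OPTION_1 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="http://identifiers.org/go/GO:0005764"/>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/pubmed/1111111"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/eco/ECO:0000004"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
</rdf:RDF>"""


def read_qualifier(qualifier):
    """Turn one qualifier element into a dict, recursing into nested qualifiers."""
    bag = qualifier.find(BAG)
    return {
        "qualifier": qualifier.tag.split("}")[1],  # drop the namespace part
        # findall() only matches direct children, so nested rdf:li are excluded.
        "resources": [li.get(RESOURCE) for li in bag.findall(LI)],
        # Anything in the Bag that is not an rdf:li is a nested qualifier.
        "nested": [read_qualifier(child) for child in bag if child.tag != LI],
    }


root = ET.fromstring(OPTION_1)
description = root.find(f"{{{RDF}}}Description")
annotations = [read_qualifier(q) for q in description]

for ann in annotations:
    print(ann["qualifier"], ann["resources"])
    for nested in ann["nested"]:
        print("  ", nested["qualifier"], nested["resources"])
```

The nested entries stay attached to the occursIn statement they refine, which matches the "proxy for the statement above" reading of the triples.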

Original comment by: lenov

sbmlsecretary commented 10 years ago

XML corresponding to the triples above (apparently XML declaration made the code disappear)

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"     xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="http://identifiers.org/go/GO:0005764"/>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/pubmed/1111111"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/eco/ECO:0000004"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
</rdf:RDF>

Original comment by: lenov

sbmlsecretary commented 10 years ago

Graph of OPTION 2: (I give up on the invisible XML)

http://www.w3.org/RDF/Validator/run/1399565682119#_000004
http://biomodels.net/biology-qualifiers/occursIn
genid:A920565

genid:A920565
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag

genid:A920565
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
http://identifiers.org/go/GO:0005764

http://identifiers.org/go/GO:0005764
http://biomodels.net/biology-qualifiers/isDescribedBy
genid:A920566

genid:A920566
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag

genid:A920566
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
http://identifiers.org/pubmed/1111111

http://identifiers.org/go/GO:0005764
http://biomodels.net/biology-qualifiers/isDescribedBy
genid:A920567

genid:A920567
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag

genid:A920567
http://www.w3.org/1999/02/22-rdf-syntax-ns#_2
http://identifiers.org/eco/ECO:0000004

Original comment by: lenov

sbmlsecretary commented 10 years ago

Nicolas, if you put spaces before the example XML, it will display correctly--I've edited your above post appropriately. (Also, it emails everyone OK.)

(Also, two spaces don't work--I had to go to four.)

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

I agree with Nicolas, and option 1.

Original comment by: mhucka

sbmlsecretary commented 10 years ago

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

Given Nicolas's investigation, and the fact that there are a few options available and not just one, I am re-setting this item to 'open' so it's in the right searches until we choose between them. Everyone has agreed that some change needs to be made, but there are a few different options.

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

I would prefer a proper solution for annotations (blaming myself here), like an updated concept for how we would like to handle the linking of SBML into the semantic world (and how to solve the questions related to the SBML history at the same time).

However, if something needs to be changed now, then I am also voting for option 1.

Original comment by: dagwa

sbmlsecretary commented 10 years ago

While I think Dagmar's idea should definitely be explored more as we move forward, with two and a half votes, I wrote up what I understand to be Option 1 in SVN. If someone could look at

https://sourceforge.net/p/sbml/code/HEAD/tree/trunk/specifications/sbml-level-3/version-2/core/spec/sbml-level-3-version-2-core.pdf

and see if I described this option correctly, that would be great: section 6.3 has been updated to include "[NESTED CONTENT]", and there's a new example at the end of section 6.7 (I took Nicolas's example verbatim).

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

(If we get one more editor vote and someone to sign off on the text, I'll add it to the L2v5 spec too, add it to the errata lists, and close this issue.)

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

I prefer Option 2 myself, but would be ok with Option 1.

Original comment by: fbergmann

sbmlsecretary commented 10 years ago

I agree with option 1.

Regarding the text, maybe I missed it, but is it necessary to say that it is legitimate, when reading the controlled vocabulary, to ignore the nested elements?

Original comment by: bgoli

sbmlsecretary commented 10 years ago

That's an interesting question--I assume that on some level, all annotations can be ignored, right? No annotation actually changes the meaning of the thing it annotates, merely clarifies, or adds more specific information. I would assume the same is true of this nested annotation.

So, you're suggesting an explicit mention of this? That seems reasonable.
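One practical wrinkle, sketched here under assumptions: a reader that wants to ignore the nested elements has to look only at the direct rdf:li children of each top-level Bag. A reader that searches all descendants would silently attribute the nested resources to the outer qualifier. The function names below are hypothetical and the code uses only the standard-library `xml.etree.ElementTree`; it is not taken from libSBML or any other tool.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BAG = f"{{{RDF}}}Bag"
LI = f"{{{RDF}}}li"
RESOURCE = f"{{{RDF}}}resource"

# Nested annotation in the Option 1 style.
NESTED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
  <rdf:Description rdf:about="#_000004">
    <bqbiol:occursIn>
      <rdf:Bag>
        <rdf:li rdf:resource="http://identifiers.org/go/GO:0005764"/>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/pubmed/1111111"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
        <bqbiol:isDescribedBy>
          <rdf:Bag>
            <rdf:li rdf:resource="http://identifiers.org/eco/ECO:0000004"/>
          </rdf:Bag>
        </bqbiol:isDescribedBy>
      </rdf:Bag>
    </bqbiol:occursIn>
  </rdf:Description>
</rdf:RDF>"""


def top_level_resources(description):
    """Read each qualifier's own resources and ignore anything nested in the Bag."""
    result = {}
    for qualifier in description:
        name = qualifier.tag.split("}")[1]
        bag = qualifier.find(BAG)
        # findall() matches direct children only.
        result.setdefault(name, []).extend(li.get(RESOURCE) for li in bag.findall(LI))
    return result


def naive_resources(description):
    """Descendant search: also picks up the rdf:li of nested qualifiers."""
    result = {}
    for qualifier in description:
        name = qualifier.tag.split("}")[1]
        result.setdefault(name, []).extend(li.get(RESOURCE) for li in qualifier.iter(LI))
    return result


description = ET.fromstring(NESTED).find(f"{{{RDF}}}Description")
print(top_level_resources(description))
print(naive_resources(description))
```

With the first function the species is reported as occurring in GO:0005764 only; with the second, the PubMed and ECO identifiers are wrongly listed as occursIn targets, which is the kind of misreading an explicit sentence in the specification could help tools avoid.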

Original comment by: luciansmith

sbmlsecretary commented 10 years ago

Original comment by: luciansmith