singa-bio / singa

SiNGA (Simulation of Natural Systems using Graph Automata) is an open-source library containing tools especially for structural bioinformatics and systems biology.
MIT License

Add possibility to import chemical entities referenced in SBML files #3

Closed cleberecht closed 8 years ago

cleberecht commented 8 years ago

I propose we have a look at libSBML to parse SBML files. Most chemical entities in SBML files are referenced using ChEBI identifiers and are therefore parsable using the ChEBIParserService.

This is also important for issue #2.

cleberecht commented 8 years ago

Implemented a stub of SBML parsing with libSBML.

Problem: some parts of the annotations are ambiguous, such as:

<bqbiol:hasPart>
   <rdf:Bag>
      <rdf:li rdf:resource="http://identifiers.org/uniprot/P08839"/>
      <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00074"/>
   </rdf:Bag>
</bqbiol:hasPart>
<bqbiol:hasPart>
   <rdf:Bag>
      <rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:18021"/>
      <rdf:li rdf:resource="http://identifiers.org/uniprot/P08839"/>
      <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00615"/>
   </rdf:Bag>
</bqbiol:hasPart>

I am currently thinking we should parse everything that can be parsed. If the same compound would be added twice (P08839 is in both bags), only one instance is actually used. It seems that if we parse either the ChEBI or the KEGG identifiers, everything should be okay (fingers crossed).

Maybe give the user the possibility to decide this in a guided GUI component.
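
To make the "parse everything, keep each entity once" idea concrete, here is a minimal sketch in Python using only the standard library. It is not the SiNGA or libSBML implementation. The `rdf:RDF`/`rdf:Description` wrapper, the `#metaid_0` value, and the function names `split_identifier` and `collect_parts` are assumptions made for illustration; only the two `hasPart` bags come from the annotation quoted above.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL_NS = "http://biomodels.net/biology-qualifiers/"
PREFIX = "http://identifiers.org/"

# The two hasPart bags from the comment above. The surrounding rdf:RDF and
# rdf:Description elements (and the metaid) are assumed so the fragment parses.
ANNOTATION = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
  <rdf:Description rdf:about="#metaid_0">
    <bqbiol:hasPart>
      <rdf:Bag>
        <rdf:li rdf:resource="http://identifiers.org/uniprot/P08839"/>
        <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00074"/>
      </rdf:Bag>
    </bqbiol:hasPart>
    <bqbiol:hasPart>
      <rdf:Bag>
        <rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:18021"/>
        <rdf:li rdf:resource="http://identifiers.org/uniprot/P08839"/>
        <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00615"/>
      </rdf:Bag>
    </bqbiol:hasPart>
  </rdf:Description>
</rdf:RDF>
"""


def split_identifier(uri):
    """Split 'http://identifiers.org/<database>/<id>' into (database, id)."""
    if not uri.startswith(PREFIX):
        raise ValueError("not an identifiers.org URI: " + uri)
    database, _, identifier = uri[len(PREFIX):].partition("/")
    return database, identifier


def collect_parts(annotation_xml):
    """Return every hasPart reference once, in document order."""
    root = ET.fromstring(annotation_xml)
    path = f".//{{{BQBIOL_NS}}}hasPart/{{{RDF_NS}}}Bag/{{{RDF_NS}}}li"
    seen = set()
    parts = []
    for li in root.iterfind(path):
        entry = split_identifier(li.get(f"{{{RDF_NS}}}resource"))
        if entry not in seen:  # P08839 appears in both bags, keep it once
            seen.add(entry)
            parts.append(entry)
    return parts


if __name__ == "__main__":
    for database, identifier in collect_parts(ANNOTATION):
        print(database, identifier)
```

Keeping the database name next to each identifier would let a later step decide which source to resolve first (for example ChEBI before KEGG), or hand the choice to the user.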

cleberecht commented 8 years ago

Problem:

<bqbiol:is>
   <rdf:Bag>
      <rdf:li rdf:resource="http://identifiers.org/uniprot/P69786"/>
      <rdf:li rdf:resource="http://identifiers.org/uniprot/P69783"/>
   </rdf:Bag>
</bqbiol:is>

"is" qualifiers should probably not be allowed to have two clearly unequal annotations. Does this mean this is a complex of both (it probably is in this case)? But in the same model (BIOMD0000000038) there is an annotation (for the species PEP) that signals alternatives from two different databases.
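
One possible heuristic for telling the two cases apart is to flag an "is" bag only when a single database contributes more than one distinct identifier. The following is a rough Python sketch of that idea, not how SiNGA handles it. The `wrap` helper, the function name `same_database_conflicts`, and the second fragment (`ALTERNATIVES_FRAGMENT`) are hypothetical; in particular, that fragment is an illustrative stand-in and is not copied from the PEP species in BIOMD0000000038.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL_NS = "http://biomodels.net/biology-qualifiers/"
PREFIX = "http://identifiers.org/"

# The "is" bag quoted in the comment above.
IS_FRAGMENT = """
<bqbiol:is>
  <rdf:Bag>
    <rdf:li rdf:resource="http://identifiers.org/uniprot/P69786"/>
    <rdf:li rdf:resource="http://identifiers.org/uniprot/P69783"/>
  </rdf:Bag>
</bqbiol:is>
"""

# Hypothetical bag with one identifier per database (illustration only).
ALTERNATIVES_FRAGMENT = """
<bqbiol:is>
  <rdf:Bag>
    <rdf:li rdf:resource="http://identifiers.org/chebi/CHEBI:18021"/>
    <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00074"/>
  </rdf:Bag>
</bqbiol:is>
"""


def wrap(fragment):
    """Add an assumed rdf:RDF root that declares the namespaces."""
    return (
        f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:bqbiol="{BQBIOL_NS}">'
        f"{fragment}</rdf:RDF>"
    )


def same_database_conflicts(annotation_xml):
    """For each bqbiol:is bag, map database -> ids when it has several ids."""
    root = ET.fromstring(annotation_xml)
    report = []
    for bag in root.iterfind(f".//{{{BQBIOL_NS}}}is/{{{RDF_NS}}}Bag"):
        by_database = defaultdict(set)
        for li in bag.iterfind(f"{{{RDF_NS}}}li"):
            uri = li.get(f"{{{RDF_NS}}}resource")
            database, _, identifier = uri[len(PREFIX):].partition("/")
            by_database[database].add(identifier)
        # An empty dict means the bag only holds cross-database alternatives.
        report.append(
            {db: sorted(ids) for db, ids in by_database.items() if len(ids) > 1}
        )
    return report


if __name__ == "__main__":
    print(same_database_conflicts(wrap(IS_FRAGMENT)))
    print(same_database_conflicts(wrap(ALTERNATIVES_FRAGMENT)))
```

A flagged bag would still need a decision (treat it as a complex, or ask the user), so this only narrows down which annotations need attention.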

cleberecht commented 8 years ago

Added a functional parser for species in SBML files using libSBML. It might need some improvement but should work regardless. Additionally, added a service that can pull models from the BioModels database.