SBRG / bigg_models

The BiGG Models website server
http://bigg.ucsd.edu

listOfGroups missing in all models #362

Open Ghabry opened 4 years ago

Ghabry commented 4 years ago

Hello,

One or two years ago I downloaded the model for iJN678 and it contained the pathways as a "listOfGroups" with type "partonomy". Today I downloaded the model again and was confused because all the group information is gone.

I used your API to download all other models and there are also no groups. What happened with the pathway annotations? Is there a reason why all the pathways are gone?

Btw, is there a history function to view old SBML files? I checked https://github.com/SBRG/bigg_models_data to get an older version but the XML from iJN678 available there is very old and still SBMLv2 :/.
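
For anyone who wants to run the same check on a downloaded file, here is a minimal sketch that lists the groups found in an SBML document. It assumes the file uses the SBML Level 3 "groups" package, version 1; the `list_groups` helper and the tiny `EXAMPLE` document are made up for illustration and are not taken from a real BiGG export.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the SBML Level 3 "groups" package, version 1.
GROUPS_NS = "http://www.sbml.org/sbml/level3/version1/groups/version1"


def list_groups(sbml_text):
    """Return (name, kind) for every groups:group element in an SBML string."""
    root = ET.fromstring(sbml_text)
    found = []
    for group in root.iter(f"{{{GROUPS_NS}}}group"):
        # Prefer the human-readable name; fall back to the id if it is missing.
        name = group.get(f"{{{GROUPS_NS}}}name") or group.get(f"{{{GROUPS_NS}}}id")
        kind = group.get(f"{{{GROUPS_NS}}}kind")
        found.append((name, kind))
    return found


# Hand-written document for illustration only (hypothetical content).
EXAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:groups="http://www.sbml.org/sbml/level3/version1/groups/version1"
      level="3" version="1" groups:required="false">
  <model id="example">
    <groups:listOfGroups>
      <groups:group groups:id="g1" groups:name="Glycolysis" groups:kind="partonomy"/>
    </groups:listOfGroups>
  </model>
</sbml>"""

EMPTY = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      level="3" version="1"><model id="no_groups"/></sbml>"""

print(list_groups(EXAMPLE))
print(list_groups(EMPTY))
```

An empty list for a real model file means the export carries no group elements at all, which is the situation described above.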

draeger commented 4 years ago

The information is not lost. BiGG Models database still contains it. Right now it is, however, not included in the SBML export. To solve the problem, we are working towards a new version of the model annotation tool ModelPolisher used by BiGG to generate SBML files.

lcottret commented 4 years ago

Hi,

If the pathway information is still in the database, is there a way to get it? I didn't find a way to get pathway information via the API...

draeger commented 4 years ago

It is possible to download a model from BiGG and to run ModelPolisher yourself. The application is shipped as a Docker file containing the public release of BiGG Models. Afterwards, the SBML file will contain that missing information.

We would be very happy to help the BiGG team integrate ModelPolisher into their curation pipeline again, so that the provided SBML files could directly contain this and further information.

lcottret commented 4 years ago

Thanks, I'm going to try ModelPolisher!

lcottret commented 4 years ago

I used ModelPolisher. It's just wonderful to get all the identifiers.org links! But unfortunately, the subsystem information is still missing (no listOfGroups in the SBML)...

mephenor commented 4 years ago

Which options did you use when running ModelPolisher? I get 52 groups in one ListOfGroups for this model, so that is strange.

lcottret commented 4 years ago

Hmm, it's another model that we built ourselves, but with BiGG namespaces. It does indeed work with models downloaded directly from BiGG. Does that mean the pathway information can be retrieved only for BiGG models?

mephenor commented 4 years ago

I had a look at the code and at BiGG. Subsystem information is currently retrieved based on both the model and the reaction BiGGId, as the information is stored in the model_reaction table. I could add some code to retrieve groups based on the reaction id alone; however, this would likely require manual validation afterwards. Even after correcting for discrepancies in case, about 7.5% of reaction ids still resolve to multiple subsystems, so subsystem information should only be added this way if it can be resolved uniquely.

For the remaining non unique subsystems we get things like [('Keratan sulfate biosynthesis',), ('KERATAN SULFATE SYNTHESIS',)] or [('Nucleotide interconversion',), ('NUCLEOTIDE INTERCONVERSION',), ('Nucleotides',)] or [('Lipopolysaccharide Biosynthesis Recycling',), ('Lipopolysaccharide Biosynthesis / Recycling',)], i.e. information that could be rectified in BiGG by using one naming variant. Even then, things like [('Amino Acid Metabolism',), ('S_Complex_Alcohol_Metabolism',)] remain, where the correct subsystem still depends on the actual model.
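
To make the "only add it if it resolves uniquely" rule concrete, here is a rough sketch and not ModelPolisher's actual code. It assumes the query over the model_reaction table yields (reaction id, subsystem) pairs across all models; the `unique_subsystems` function, that row shape, and the reaction ids `RXN_A` and `RXN_B` are hypothetical. Only the subsystem names are taken from the examples above.

```python
from collections import defaultdict


def unique_subsystems(rows):
    """Split reactions into uniquely resolved and ambiguous subsystems.

    rows: iterable of (reaction_id, subsystem) pairs (assumed shape).
    Returns (resolved, ambiguous), both keyed by reaction id.
    """
    variants = defaultdict(dict)
    for reaction_id, subsystem in rows:
        if not subsystem:
            continue  # skip reactions without a subsystem entry
        name = subsystem.strip()
        # Compare case-insensitively, but keep the first spelling seen.
        variants[reaction_id].setdefault(name.casefold(), name)

    resolved, ambiguous = {}, {}
    for reaction_id, names in variants.items():
        if len(names) == 1:
            resolved[reaction_id] = next(iter(names.values()))
        else:
            # More than one distinct subsystem: leave for manual validation.
            ambiguous[reaction_id] = sorted(names.values())
    return resolved, ambiguous


rows = [
    ("RXN_A", "Nucleotide interconversion"),
    ("RXN_A", "NUCLEOTIDE INTERCONVERSION"),
    ("RXN_B", "Amino Acid Metabolism"),
    ("RXN_B", "S_Complex_Alcohol_Metabolism"),
]
resolved, ambiguous = unique_subsystems(rows)
print(resolved)
print(ambiguous)
```

Case folding alone merges variants that differ only in capitalisation. Pairs such as 'Keratan sulfate biosynthesis' and 'KERATAN SULFATE SYNTHESIS' would still count as two subsystems and land in the ambiguous set.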

lcottret commented 4 years ago

Thanks, it would be great to have the option to get subsystems from the reaction ids, even if some manual curation is still needed afterwards to remove redundancies or pathways not related to the model.