draeger-lab / ModelPolisher

ModelPolisher accesses the BiGG Models knowledgebase to annotate SBML models.
MIT License

Allow for species/reactions/... to be processed directly #78

Open mephenor opened 4 years ago

mephenor commented 4 years ago

As @matthiaskoenig mentioned during GSoC, it would be nice to restructure ModelPolisher so that it can annotate model parts directly, without having to parse the complete model. This would also allow the annotation of these components to be tested properly and thoroughly. The first question is whether JSBML can parse only these parts of a model or whether we need to write custom code for this; I haven't looked at JSBML in this regard yet. We would also need to specify which input format is consumed, e.g. an SBML snippet containing only the respective component.

draeger commented 4 years ago

This could indeed be an interesting feature. However, the key aspect here is not the parsing of a file. It is rather to provide a part of a model to ModelPolisher that is then annotated. To this end, it is sufficient to pass some JSBML data structures to ModelPolisher. Those data structures can come from a larger model, parts of a file, or from other sources. All we need are API functions that can be called on individual model components.
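To make "API functions that can be called on individual model components" concrete, here is a minimal sketch of what such an entry point might look like. Everything in it is hypothetical: `ComponentAnnotator`, `SpeciesStub` and `SpeciesAnnotatorSketch` are illustrative names, not ModelPolisher or JSBML classes, and `SpeciesStub` merely stands in for a JSBML `Species` so that the example runs without dependencies. The id handling (dropping an `M_` prefix and a trailing compartment suffix) is placeholder logic, not the actual BiGG lookup.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point first, so the file can be launched directly with `java <file>.java`.
class ComponentAnnotationSketch {
    public static void main(String[] args) {
        SpeciesStub species = new SpeciesStub("M_glc__D_c");
        new SpeciesAnnotatorSketch().annotate(species);
        System.out.println(species.id + " -> " + species.resources);
    }
}

/** Hypothetical contract: annotate a single component in place, no enclosing model needed. */
interface ComponentAnnotator<T> {
    void annotate(T component);
}

/** Stand-in for a JSBML Species; holds only what this sketch needs. */
class SpeciesStub {
    final String id;
    final List<String> resources = new ArrayList<>();

    SpeciesStub(String id) {
        this.id = id;
    }
}

/** Placeholder annotator: derives a resource URI from the id alone (assumption, no database access). */
class SpeciesAnnotatorSketch implements ComponentAnnotator<SpeciesStub> {
    @Override
    public void annotate(SpeciesStub species) {
        // Drop the "M_" prefix if present.
        String biggId = species.id.startsWith("M_") ? species.id.substring(2) : species.id;
        // Drop the trailing "_<compartment>" suffix if present.
        int suffixStart = biggId.lastIndexOf('_');
        if (suffixStart > 0) {
            biggId = biggId.substring(0, suffixStart);
        }
        species.resources.add("https://identifiers.org/bigg.metabolite/" + biggId);
    }
}
```

With an interface of this shape, the caller decides where the component comes from (a full model, part of a file, or a test fixture), and each annotator can be unit-tested in isolation.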

mephenor commented 4 years ago

I moved some of the annotation logic from BiGGAnnotation into separate classes, so Species, Reaction, GeneProduct and Compartment each have their own Annotation class and could theoretically be annotated directly using those classes. There are, however, some places where annotation depends on the model id, so this code needs to be adapted accordingly. Additionally, I need to do something similar for the polishing.
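One possible way to deal with the model id dependency is to pass the id in explicitly as an optional argument, so that a component without an enclosing model can still be processed. The sketch below only illustrates that idea: `ReactionNameLookupSketch`, its method and the map contents are made-up placeholders standing in for whatever lookup the annotation classes actually perform.

```java
import java.util.Map;
import java.util.Optional;

class ModelContextSketch {
    public static void main(String[] args) {
        ReactionNameLookupSketch lookup = new ReactionNameLookupSketch();
        // With a model id, the model-specific entry is preferred.
        System.out.println(lookup.nameFor("R1", Optional.of("some_model")));
        // Without one, the lookup falls back to model-independent data.
        System.out.println(lookup.nameFor("R1", Optional.empty()));
    }
}

/** Hypothetical lookup; the maps stand in for database queries. */
class ReactionNameLookupSketch {
    private final Map<String, String> modelSpecific =
            Map.of("some_model/R1", "model-specific name for R1");
    private final Map<String, String> modelIndependent =
            Map.of("R1", "model-independent name for R1");

    Optional<String> nameFor(String reactionId, Optional<String> modelId) {
        // Try the model-specific entry only when a model id was supplied.
        Optional<String> specific = modelId.map(id -> modelSpecific.get(id + "/" + reactionId));
        return specific.isPresent()
                ? specific
                : Optional.ofNullable(modelIndependent.get(reactionId));
    }
}
```

Making the model id a parameter rather than something read from the parent model keeps the dependency visible in the method signature and lets tests cover both cases.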

The easiest way to implement this right now would be to use the corresponding JSON representation. As the classes for parsing JSON models are already present, parsing input should simply be a matter of telling Jackson which type to interpret the data as. Output could then be produced as JSON, which should be straightforward for the most part, except for information that is not directly present on the Species/Reaction/etc. itself and will thus need some further work in those places.