draeger-lab / ModelPolisher

ModelPolisher accesses the BiGG Models knowledgebase to annotate SBML models.
MIT License

GPR creation fails with ClassCastException #88

Closed: mephenor closed this issue 4 years ago

mephenor commented 4 years ago

For some longer gene reaction rules, parsing fails. I could not find a pattern yet, except that in most cases an attempt is made to interpret an fbc.AND as an fbc.GeneProductRef. This needs some additional investigation.

Edit: Actually, this might only affect rules fetched from BiGGDB and not those already present in the model; I might have a lead here.

Edit2: These seem to be reaction_rules where the top-level expression is enclosed in parentheses.
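For illustration, rules whose top-level expression is wrapped in parentheses could be normalized before they reach the parser. This is only a minimal sketch in plain Java: the class and method names (`GprNormalizer`, `isWrapped`, `stripOuterParentheses`) are hypothetical and are not part of ModelPolisher or JSBML, and it is not the fix that was actually applied.

```java
// Hypothetical helper, not part of ModelPolisher or JSBML.
final class GprNormalizer {
    private GprNormalizer() {}

    /** True if the '(' at index 0 is matched by the ')' at the last index. */
    static boolean isWrapped(String s) {
        if (s.length() < 2 || s.charAt(0) != '(' || s.charAt(s.length() - 1) != ')') {
            return false;
        }
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // unbalanced input, leave it alone
                }
                // The first '(' closed before the end, e.g. "(a) or (b)".
                if (depth == 0 && i < s.length() - 1) {
                    return false;
                }
            }
        }
        return depth == 0;
    }

    /** Removes every redundant pair of parentheses around the whole rule. */
    static String stripOuterParentheses(String rule) {
        String s = rule.trim();
        while (isWrapped(s)) {
            s = s.substring(1, s.length() - 1).trim();
        }
        return s;
    }
}
```

With this sketch, `((a and b) or c)` becomes `(a and b) or c`, while `(a) or (b)` is left unchanged because its first parenthesis closes before the end of the rule.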

mephenor commented 4 years ago

Except for some very large gene reaction rules, this should work now.

draeger commented 4 years ago

@mephenor So we'll keep this open until fully solved.

mephenor commented 4 years ago

For iMM1415 from bigg_models_data, one of the gene reaction rules is abnormally long (roughly a few dozen lines) and causes a stack overflow in the fbc GPRParser.

The other remaining issue apparently concerns malformed gene reaction rules. Three kinds show up: logical operations with an empty string inside the gene reaction rule,

Could not parse '(319848) or (105355) or or (20504)' because org.sbml.jsbml.text.parser.ParseException: Encountered " <BOOLEAN_LOGIC> "or "" at line 1, column 26

missing logical operators,

Could not parse '(ECDH10B_4548 or ECDH10B_0600 ECDH10B_0708)' because org.sbml.jsbml.text.parser.ParseException: Encountered " <STRING> "ECDH10B_0708 "" at line 1, column 31

and a ? in the rule:

Could not parse '2632?' because org.sbml.jsbml.text.parser.TokenMgrError: Lexical error at line 1, column 6. Encountered: <EOF> after : "".

For these issues I am currently checking whether the problem lies with the query or with the content in BiGG. The latter is far more likely, as this only seems to affect this specific model.
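The three kinds of malformed rule quoted in the parse errors can be detected with a simple pre-check before parsing. The following is a minimal sketch in plain Java, not what ModelPolisher or JSBML do: `GprPreCheck`, `tokenize`, and `findProblem` are hypothetical names, and the set of characters accepted in a gene identifier (letters, digits, `_`, `.`, `-`) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical pre-check, not part of ModelPolisher or JSBML.
final class GprPreCheck {
    private GprPreCheck() {}

    /** Splits a rule into "(", ")", and whitespace-separated words. */
    static List<String> tokenize(String rule) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : rule.toCharArray()) {
            if (c == '(' || c == ')' || Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                if (!Character.isWhitespace(c)) {
                    tokens.add(String.valueOf(c));
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Returns a description of the first problem found, or empty if none. */
    static Optional<String> findProblem(String rule) {
        boolean expectOperand = true; // true at the start and after and/or/(
        int depth = 0;                // current parenthesis nesting level
        for (String t : tokenize(rule)) {
            if (t.equals("(")) {
                if (!expectOperand) return Optional.of("missing logical operator before '('");
                depth++;
            } else if (t.equals(")")) {
                if (expectOperand) return Optional.of("missing operand before ')'");
                if (--depth < 0) return Optional.of("unbalanced ')'");
            } else if (t.equalsIgnoreCase("and") || t.equalsIgnoreCase("or")) {
                if (expectOperand) return Optional.of("operator '" + t + "' has no left operand");
                expectOperand = true;
            } else {
                // Assumed identifier alphabet; adjust to the real gene ID format.
                if (!t.matches("[A-Za-z0-9_.\\-]+")) {
                    return Optional.of("unexpected character in '" + t + "'");
                }
                if (!expectOperand) return Optional.of("missing logical operator before '" + t + "'");
                expectOperand = false;
            }
        }
        if (expectOperand) return Optional.of("rule is empty or ends with an operator");
        if (depth != 0) return Optional.of("unbalanced '('");
        return Optional.empty();
    }
}
```

The check is a single loop with a nesting counter rather than a recursive descent, so its stack use does not grow with the length of the rule. It only validates the rule; building the actual association tree would still be left to the parser.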

mephenor commented 4 years ago

I realized this happens during the polishing phase and not the annotation phase, i.e. when trying to parse GPRs from notes in the model. Thus the wrong information is already in the file, and this can be closed, as logging the warnings/errors is the correct behavior at this point.