Annotation count is reporting wrong numbers when mod.annotate()

Unaimend commented 2 months ago

Do you have a test case?

Porthmeus commented 2 months ago

Minimal example:

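import cobra as cb

from src.MeMoMetabolite import MeMoMetabolite
from src.MeMoModel import MeMoModel

#this_directory = Path(__file__).parent
#dat = this_directory.joinpath("../manually_merged_models")
# Load the cobrapy "textbook" model and wrap it in a MeMoModel
mod = cb.io.load_model("textbook")
mod = MeMoModel.fromModel(mod)
len(mod.metabolites)
# Number of metabolites without an InChI string before annotation
sum([x._inchi_string is None for x in mod.metabolites])
mod.annotate()
# Number of metabolites without an InChI string after annotation
sum([x._inchi_string is None for x in mod.metabolites])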

Unaimend commented 2 months ago

What is the expected result of this test case? There are zero unannotated metabolites after calling annotate, so I do not see the problem here. Please specify the expected result in the test case in such a way that it currently fails.

Unaimend commented 2 months ago

I guess it should be something like mod.annotate() == len(mod.metabolites)

Unaimend commented 2 months ago

Currently we annotate more InChIs than there are metabolites.
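A failing check along these lines could be sketched as follows. This is only an illustration: the helper name `inchi_count_is_consistent` is hypothetical and not part of MeMoMe, and the figure of 72 metabolites without an InChI before annotation is an assumption; only the reported total of 77 comes from the test output in this thread.

```python
def inchi_count_is_consistent(missing_before: int, missing_after: int, reported: int) -> bool:
    """Return True if the reported number of newly annotated InChIs
    equals the drop in metabolites that lack an InChI string."""
    newly_filled = missing_before - missing_after
    return reported == newly_filled


# Assumed: 72 metabolites without an InChI before, 0 after; 77 is the reported total.
over_reported = inchi_count_is_consistent(72, 0, 77)
# What a consistent report would look like under the same assumption.
consistent = inchi_count_is_consistent(72, 0, 72)
print(over_reported, consistent)
```

Comparing the reported count against the before/after difference, rather than against zero remaining unannotated metabolites, is what would make the test fail in the current state.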

Unaimend commented 2 months ago

ORIG DB BiGG
BiGG: Annotated inchis 0, annotated dbs 11, annotated names 11
There was an error during in RDkit (1)
There was an error during in RDkit (1)
ChEBI: Annotated inchis 72, annotated dbs 0, annotated names 0
ModelSEED: Annotated inchis 4, annotated dbs 72, annotated names 72
VMH: Annotated inchis 0, annotated dbs 0, annotated names 0
Total: Annotated inchis 76, annotated dbs 83, annotated names 83 

BiGG: Annotated inchis 0, annotated dbs 5, annotated names 5
ChEBI: Annotated inchis 0, annotated dbs 0, annotated names 0
ModelSEED: Annotated inchis 1, annotated dbs 2, annotated names 2
VMH: Annotated inchis 0, annotated dbs 0, annotated names 0
Total: Annotated inchis 77, annotated dbs 90, annotated names 90 

BiGG: Annotated inchis 0, annotated dbs 0, annotated names 0
ChEBI: Annotated inchis 0, annotated dbs 0, annotated names 0
ModelSEED: Annotated inchis 0, annotated dbs 0, annotated names 0
VMH: Annotated inchis 0, annotated dbs 0, annotated names 0
Total: Annotated inchis 77, annotated dbs 90, annotated names 90 

Annoatted Annotated inchis 77, annotated dbs 90, annotated names 90
Amount of unannotated inchis after Annotation 0
.
----------------------------------------------------------------------
Ran 1 test in 12.846s

So we do annotate 72 in the first run with ChEBI, but then we annotate some more: 4 via ModelSEED in the same pass and 1 more in the second pass, for a reported total of 77.

Porthmeus commented 2 months ago

But in general this looks really good, as it does exactly what we would expect, namely annotating the cross-references sequentially. The extra InChI annotations might result from the fact that ModelSEED has slightly more information in its InChIs, and thus a more complex InChI string, which would result in a re-annotation of the InChI and an additional count in that field.
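To illustrate the explanation above, here is a minimal sketch of how counting every write (including overwrites) can exceed the number of metabolites. Everything in it is hypothetical: the `Met` class is a stand-in rather than MeMoMetabolite, the "longer string wins" rule is an assumed simplification of the re-annotation behaviour, and the InChI values are placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Met:
    """Hypothetical stand-in for a metabolite; not the MeMoMetabolite class."""
    id: str
    inchi: Optional[str] = None


def annotate_from(mets: List[Met], db: Dict[str, str]) -> Tuple[int, int]:
    """Apply InChIs from one database and return (events, newly_annotated).

    events counts every write, including overwrites of an existing string;
    newly_annotated counts only metabolites that had no InChI before.
    """
    events = 0
    newly_annotated = 0
    for m in mets:
        candidate = db.get(m.id)
        if candidate is None:
            continue
        if m.inchi is None:
            m.inchi = candidate
            events += 1
            newly_annotated += 1
        elif len(candidate) > len(m.inchi):
            # Assumed rule: a more detailed string replaces the existing one.
            m.inchi = candidate
            events += 1
    return events, newly_annotated


mets = [Met("a"), Met("b")]
chebi = {"a": "InChI=placeholder-a", "b": "InChI=placeholder-b"}
modelseed = {"a": "InChI=placeholder-a/extra-layer"}

first = annotate_from(mets, chebi)       # both metabolites get their first InChI
second = annotate_from(mets, modelseed)  # "a" is overwritten with the longer string
total_events = first[0] + second[0]
print(first, second, total_events, len(mets))
```

Under these assumptions the event total is 3 for 2 metabolites, which mirrors the pattern in the output above. Reporting the two numbers separately would keep the sequential re-annotation while making the count comparable to len(mod.metabolites).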

Porthmeus / MeMoMe

Annotation count is reporting wrong numbers when mod.annotate() #132