monarch-initiative / ontogpt

LLM-based ontological extraction tools, including SPIRES
https://monarch-initiative.github.io/ontogpt/
BSD 3-Clause "New" or "Revised" License

Error using the BioLink model #22

Open vemonet opened 1 year ago

vemonet commented 1 year ago

Hi, I tried to use the BioLink model as the base for running the SPIRES engine (as recommended in the README!), but ran into an issue.

Generating the .py file from the biolink_model.yaml placed in src/ontogpt/templates/ worked well with make.

But we encounter an error when trying to extract a class:

poetry run ontogpt extract -t biolink_model.ChemicalToDiseaseOrPhenotypicFeatureAssociation treatment.txt

We get the error ValueError: Template biolink_model.ChemicalToDiseaseOrPhenotypicFeatureAssociation not found, because the class names defined in the BioLink model are written in the format chemical to disease or phenotypic feature association.

If we try to provide the class name with spaces:

poetry run ontogpt -v extract -t "biolink_model.chemical to disease or phenotypic feature association" treatment.txt

We then get an error because the name does not match the generated Python class name:

  File "/home/vemonet/develop/translator/ontogpt/src/ontogpt/engines/knowledge_engine.py", line 230, in _get_template_class
    self.template_pyclass = mod.__dict__[class_name]
KeyError: 'chemical to disease or phenotypic feature association'

I think I could easily implement a fix, but I wonder if anyone here who knows LinkML better than I do has a quick fix for this problem? @cmungall
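A minimal sketch of one possible fix for the lookup in the traceback above, assuming it only needs to tolerate differences in spacing and capitalization. `normalize` and `find_class` are hypothetical helpers, not part of ontogpt, and the `SimpleNamespace` is a stand-in for the generated biolink_model.py module:

```python
from types import SimpleNamespace


def normalize(name: str) -> str:
    """Lowercase a name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def find_class(namespace: dict, class_name: str):
    """Look up class_name in namespace, tolerating spaces and case differences."""
    if class_name in namespace:  # exact match, as in the original lookup
        return namespace[class_name]
    wanted = normalize(class_name)
    for key, value in namespace.items():
        # Only consider classes, and compare names in normalized form.
        if isinstance(value, type) and normalize(key) == wanted:
            return value
    raise KeyError(class_name)


# Hypothetical stand-in for a class in the generated Python module.
class ChemicalToDiseaseOrPhenotypicFeatureAssociation:
    pass


mod = SimpleNamespace(
    ChemicalToDiseaseOrPhenotypicFeatureAssociation=ChemicalToDiseaseOrPhenotypicFeatureAssociation
)

found = find_class(vars(mod), "chemical to disease or phenotypic feature association")
print(found.__name__)
```

Matching on a normalized form avoids guessing the generator's exact capitalization rules, at the cost of treating names that differ only in case or spacing as the same class.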

cmungall commented 1 year ago

I think it should be a quick fix to allow the non-conventional names used in Biolink (all other LinkML schemas have switched to CamelCase for classes and snake_case for slots).

However, there will likely be some degree of customization required, such as prompt hints, describing preferred annotators, etc. Ideally these would live in a separate configuration file for separation of concerns, but for now it's necessary to do a bit of copying and pasting to get the desired behavior. This should be something we can fix with the linkml-transformer framework!

nlharris commented 9 months ago

Just wondering if this has been fixed yet.

caufieldjh commented 9 months ago

No, not yet - the workaround is to replace the offending class name with something equivalent to the Python class, like chemical to disease or phenotypic feature association -> ChemicalToDiseaseOrPhenotypicFeatureAssociation. But ideally that shouldn't be necessary.
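A rough sketch of how the renames for this workaround could be computed, assuming the Python class name is just the schema name with each word capitalized and the spaces removed. `to_python_class_name` is a hypothetical helper, and the generator's own naming rule may differ in edge cases:

```python
def to_python_class_name(schema_name: str) -> str:
    """Capitalize the first letter of each word and join without spaces."""
    return "".join(word[:1].upper() + word[1:] for word in schema_name.split())


# Example class names in the space-separated Biolink style.
schema_names = [
    "chemical to disease or phenotypic feature association",
    "association",
]

# Map each name that contains a space to its proposed replacement.
rename_map = {
    name: to_python_class_name(name) for name in schema_names if " " in name
}

for old, new in rename_map.items():
    print(f"{old} -> {new}")
```

Any other place in the schema that refers to a renamed class by its old name would presumably need the same replacement.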