SimoneAstarita opened 1 week ago
Hi @SimoneAstarita,
Support for accessing HuggingFace models through their Hub API is currently deprecated - that's mostly because I found it to be very slow. That being said, I'm rewriting OntoGPT's backend to use the `litellm` package, and that should permit using HuggingFace Hub easily.
In the meantime, you may try something like this: the `llm` package is a dependency of `ontogpt`, but the `extra-openai-models.yaml` file will need to be in a system directory (on my Linux system it's `~/.config/io.datasette.llm/extra-openai-models.yaml`).
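Creating that file ahead of time is straightforward; a minimal sketch, assuming the default Linux config path mentioned above (the path differs on macOS and Windows):

```shell
# If the llm CLI is already installed, it can reveal its config directory:
#   llm logs path
# prints the path of a file inside that directory.

# Create the config directory and an empty extra-openai-models.yaml
# (assumes the Linux default location; adjust for your OS).
mkdir -p ~/.config/io.datasette.llm
touch ~/.config/io.datasette.llm/extra-openai-models.yaml
```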
Add to that file something like this, replacing the values with your chosen model and inference endpoint:

```yaml
- model_name: "huggingface/WizardLM/WizardCoder-Python-34B-V1.0"
  model_id: wizardlm
  api_base: "https://my-endpoint.huggingface.cloud"
```
Then edit `models.yaml` in your local ontogpt install (https://github.com/monarch-initiative/ontogpt/blob/main/src/ontogpt/models.yaml) so the model's `canonical-name` matches what `llm` has for its `model_id` - e.g., `wizardlm` in this example.

I think that should work with both their free and paid API, but OntoGPT is unlikely to work with all models. More details here: https://huggingface.co/docs/api-inference/en/index
I am presenting these two issues as one, but I don't know if they are related.
I am interested in using OntoGPT with a HuggingFace model remotely, i.e., through the API, without downloading the model locally, not even with gpt4all. This seems possible: I could not find an example of someone who has done it, but issues #145 and #146 seem to state it pretty clearly. I understand that the implementation is limited to only some models. I pip-installed `ontogpt` and `ontogpt[huggingface]` and set my API key with the command `runoak set-apikey -e hfhub-key`.
When I ran `ontogpt list-models`, all models listed showed either GPT4ALL or OPENAI as the provider, while I expected some to use HUGGINGFACEHUB. So, I assumed that perhaps all gpt4all models were also okay to use through the Hub, and I tried one of them by running `ontogpt complete exemple.txt -m MISTRAL_7B_OPENORCA`. However, I got the following error message:
```
UnboundLocalError: cannot access local variable 'results' where it is not associated with a value
```
I also tried to run it on Google Colab and got the following, similar error:
```
UnboundLocalError: local variable 'results' referenced before assignment
```
Finally, I tried to do the same but without specifying my HuggingFace key and still got that error, so that makes me think I am not actually accessing the Hub. However, I cannot really find information on how to use the HuggingFace API.
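For what it's worth, this class of `UnboundLocalError` typically means a variable was only assigned inside a branch that never ran - consistent with the model/provider never matching a handled case, so no API call was made at all. A minimal sketch of the pattern (the function and provider names here are hypothetical, not OntoGPT's actual code):

```python
# Minimal reproduction of the error pattern: `results` is only bound inside
# a conditional branch, so if no branch matches, returning it raises
# UnboundLocalError -- the same error reported above.
def run_model(provider: str) -> str:
    if provider == "openai":
        results = "completion from OpenAI"
    # No branch handles an unrecognized provider, so `results` stays unbound.
    return results

try:
    run_model("huggingfacehub")
except UnboundLocalError as exc:
    print(f"UnboundLocalError: {exc}")
```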