Closed behrica closed 11 months ago
This adds a simple http/get call. Once PR #15 is merged, and if this gets addressed: https://github.com/zmedelis/bosquet/issues/8#issuecomment-1605397954 (I will do another PR once PR #15 is in), I can do a PR adding Hugging Face model support.
Please do, having HF would be great!
HF model support should be done via Text Generation Inference (TGI)
+1 for Huggingface models support
I think Hugging Face supports 2 APIs now. The classical one is the "Hosted Inference API", documented here: https://huggingface.co/docs/api-inference/index#hosted-inference-api
There is also TGI, documented here: https://huggingface.co/docs/text-generation-inference/index
I could not see that TGI is hosted at Hugging Face; they document self-hosting, via Docker for example. The "Inference API", on the other hand, is hosted (rate limited in the free tier) at e.g.: https://api-inference.huggingface.co/models/bert-base-uncased
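For reference, a minimal sketch of what a call to that hosted endpoint looks like (the model and prompt are just examples; the eventual bosquet integration would of course be Clojure, this only illustrates the request shape):

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/bert-base-uncased"

def build_request(inputs, token=None):
    """Construct (but do not send) a request to the Hosted Inference API."""
    headers = {"Content-Type": "application/json"}
    if token:  # the free tier works without a token but is rate limited
        headers["Authorization"] = f"Bearer {token}"
    data = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(API_URL, data=data, headers=headers, method="POST")

req = build_request("The goal of life is [MASK].")
# urllib.request.urlopen(req) would send it; omitted here to stay offline
```

Sending `req` with a valid token returns JSON with the model's predictions; the same URL scheme (`/models/<model-id>`) works for any hosted model.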
> I think that huggingFace supports 2 APIs now
What about later? Maybe they'll have 5 APIs...?
> having HF would be great!
Hardcoding models and APIs is not great. I think the better approach is to not load API keys from files (I suggest deleting/moving basically one fifth of this library). See issue #38
https://github.com/zmedelis/bosquet#llmops-for-large-language-model-based-applications
Basically: remove Integrant and Aero from the main library scope, and think about how to support different models in different steps of the Role, Synapsis, and Review prompts. For instance, a different model could be used during the Review step.
I think this library has too many concerns: loading, generating, evaluating, generating again, maybe even retrying the generation. I think the loading step is not required.
With the latest commits, 'LM Studio'-hosted HF models are supported. See https://zmedelis.github.io/bosquet/notebook/using_llms/index.html. The 'Mixing models' section at the bottom also shows how different models can be used for different steps.
Docs are here:
https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task
This 'task' seems to match bosquet's concept of "completion" best. If I understand Hugging Face correctly, it would then allow using more than 14,000 models from bosquet: https://huggingface.co/models?pipeline_tag=text-generation
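As a rough sketch, a completion-style request body for that task would look like this (parameter names follow the detailed_parameters documentation linked above; the default values here are arbitrary illustrations, not bosquet's):

```python
import json

def text_generation_payload(prompt, max_new_tokens=100, temperature=0.7):
    """Build the JSON body for the HF Inference API 'text-generation' task.

    Any model tagged text-generation under /models/<model-id> accepts
    this shape; the parameter names come from the detailed_parameters docs.
    """
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "return_full_text": False,  # return only the generated continuation
        },
    }

body = json.dumps(text_generation_payload("Once upon a time"))
```

POSTing `body` to `https://api-inference.huggingface.co/models/<model-id>` returns a list of objects with a `generated_text` field, which maps cleanly onto a "completion".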