zmedelis / bosquet

Tooling to build LLM applications: prompt templating and composition, agents, LLM memory, and other instruments for builders of AI applications.
https://zmedelis.github.io/bosquet/
Eclipse Public License 1.0

support text-generation-task models from Huggingface #16

Closed behrica closed 11 months ago

behrica commented 1 year ago

Docu is here:

https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task

This 'task' seems to match bosquet's concept of "completion" best. If I understand Hugging Face correctly, this would then allow using more than 14,000 models from bosquet: https://huggingface.co/models?pipeline_tag=text-generation
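
As a rough sketch of what such a call could look like (assuming clj-http and cheshire, an `HF_API_TOKEN` environment variable, and `gpt2` merely as a placeholder model name):

```clojure
(ns example.hf-inference
  (:require [clj-http.client :as http]
            [cheshire.core :as json]))

;; Sketch of the Hosted Inference API text-generation task.
;; The token variable and model name are assumptions for illustration.
(defn hf-generate [model prompt]
  (-> (http/post (str "https://api-inference.huggingface.co/models/" model)
                 {:headers      {"Authorization" (str "Bearer " (System/getenv "HF_API_TOKEN"))}
                  :body         (json/generate-string {:inputs     prompt
                                                       :parameters {:max_new_tokens 50}})
                  :content-type :json
                  :as           :json})
      :body
      first
      :generated_text))

(comment
  (hf-generate "gpt2" "Bosquet is a Clojure library for"))
```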

behrica commented 1 year ago

This is adding a simple HTTP GET call. Once PR #15 is merged and if this gets addressed: https://github.com/zmedelis/bosquet/issues/8#issuecomment-1605397954 (I will do another PR when PR #15 is in), I can do a PR for adding Hugging Face model support.

zmedelis commented 1 year ago

Please do, having HF would be great!

zmedelis commented 1 year ago

HF model support should be done via Text Generation Inference (TGI)

groundedsage commented 1 year ago

+1 for Huggingface models support

behrica commented 1 year ago

I think that Hugging Face supports two APIs now. The classical one is the "Hosted Inference API", documented here: https://huggingface.co/docs/api-inference/index#hosted-inference-api

There is also TGI, documented here: https://huggingface.co/docs/text-generation-inference/index

I could not see that TGI is hosted at Hugging Face; they document self-hosting, via Docker for example. The Inference API, on the other hand, is hosted (rate-limited in the free tier) at: https://api-inference.huggingface.co/models/bert-base-uncased
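
For comparison, a self-hosted TGI instance exposes a `/generate` endpoint; a minimal sketch (assuming the documented Docker image is running and published on port 8080, as in the TGI docs' example) could be:

```clojure
(ns example.tgi
  (:require [clj-http.client :as http]
            [cheshire.core :as json]))

;; Sketch of calling a self-hosted TGI server; the localhost:8080 address
;; assumes the Docker example from the TGI docs (-p 8080:80).
(defn tgi-generate [prompt]
  (-> (http/post "http://localhost:8080/generate"
                 {:body         (json/generate-string {:inputs     prompt
                                                       :parameters {:max_new_tokens 50}})
                  :content-type :json
                  :as           :json})
      :body
      :generated_text))
```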

Invertisment commented 1 year ago

> I think that Hugging Face supports two APIs now

What about later? Maybe they'll have 5 APIs...?

> having HF would be great!

Hardcoding models and APIs is not great. I think the better way to approach this is to not load API keys from files (I suggest deleting/moving basically one fifth of this library). See issue #38

https://github.com/zmedelis/bosquet#llmops-for-large-language-model-based-applications

Basically: remove Integrant and aero from the main library scope, and think about how to support different models in different steps of the Role, Synapsis and Review prompts. For instance, a different model could be used during the Review prompt.
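
Purely as an illustration of the idea (a hypothetical data shape, not bosquet's actual configuration format), per-step model selection could be expressed as plain data handed to the generation call:

```clojure
;; Hypothetical per-step model selection (illustrative only, not bosquet's
;; real config): each prompt step names the service and model it should use,
;; so e.g. Review can run against a different, cheaper or local model.
{:role     {:service :openai      :model "gpt-4"}
 :synapsis {:service :huggingface :model "tiiuae/falcon-7b-instruct"}
 :review   {:service :local-tgi   :model "mistralai/Mistral-7B-Instruct-v0.1"}}
```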

I think this library has too many concerns: loading, generating, evaluating, generating again, maybe even retrying the generation. I think the loading step is not required.

zmedelis commented 11 months ago

With the latest commits, HF models hosted via 'LM Studio' are supported. See https://zmedelis.github.io/bosquet/notebook/using_llms/index.html. Also, the 'Mixing models' section at the bottom shows how different models can be used for different steps.