zou-group / textgrad

Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.
http://textgrad.com/
MIT License

Using local models that follow the OpenAI API. #20

Closed kgourgou closed 1 week ago

kgourgou commented 1 week ago

Hi all,

Thanks for the great library. I wanted to be able to play with local models, and I understand you want to keep your library dependencies clean.

So instead, I just made a copy of your ChatOpenAI class, and I'm passing a client by hand. The client can be initialised with the OpenAI API pointing to a local server created via LM Studio (or any other way, really, as long as the OpenAI API is followed), so no additional in-library dependencies are needed.

That has already been useful for my experiments, but I thought I would share here too.
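The approach described above can be sketched as follows. This is an illustrative, minimal version of the idea, not TextGrad's actual class: the engine accepts any externally constructed client that follows the OpenAI chat-completions interface, and a dummy client stands in for an `OpenAI()` instance pointed at a local server so the sketch runs without a live endpoint.

```python
from types import SimpleNamespace


class ChatExternalClientSketch:
    """Illustrative engine that wraps any OpenAI-API-compatible client."""

    def __init__(self, client, model_string):
        # `client` is any object exposing chat.completions.create(...)
        self.client = client
        self.model_string = model_string

    def generate(self, prompt, max_tokens=100):
        response = self.client.chat.completions.create(
            model=self.model_string,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=max_tokens,
        )
        return response.choices[0].message.content


class DummyClient:
    """Stand-in for openai.OpenAI() pointed at a local server."""

    class chat:
        class completions:
            @staticmethod
            def create(model, messages, max_tokens):
                # Echo the last user message, tagged with the model name.
                reply = f"[{model}] echo: {messages[-1]['content']}"
                return SimpleNamespace(
                    choices=[SimpleNamespace(message=SimpleNamespace(content=reply))]
                )


engine = ChatExternalClientSketch(DummyClient(), model_string="local-model")
print(engine.generate("hello"))  # [local-model] echo: hello
```

The point of the pattern is that the library never constructs the client itself, so it needs no extra dependencies: anything speaking the OpenAI API (LM Studio, a corporate proxy, etc.) can be plugged in.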

vinid commented 1 week ago

Hello and thanks so much for this!!

Do you think we could achieve a similar result by having an env variable that sets the HOST for the OpenAI Client?

So that you could switch to LM Studio, AnyScale, or Predibase with a single update of an env variable? (This assumes my understanding is correct that LM Studio supports the OpenAI request format.)

kgourgou commented 1 week ago

Of course! The user could do this instead, e.g.,

  import os
  from openai import OpenAI
  # import paths may vary depending on the textgrad version
  from textgrad.engine.local_model_openai_api import ChatExternalClient

  os.environ['OPENAI_API_KEY'] = "lm-studio"
  os.environ['OPENAI_BASE_URL'] = "http://localhost:1234/v1"

  client = OpenAI()
  engine = ChatExternalClient(client=client, model_string='mlabonne/NeuralBeagle14-7B-GGUF')
  print(engine.generate(max_tokens=40, prompt="What is the meaning of life?"))

works fine.

Or, even simpler, without worrying about the external client:

  import os
  # import path may vary depending on the textgrad version
  from textgrad.engine.openai import ChatOpenAI

  os.environ['OPENAI_API_KEY'] = "lm-studio"
  os.environ['OPENAI_BASE_URL'] = "http://localhost:1234/v1"

  engine = ChatOpenAI(model_string='mlabonne/NeuralBeagle14-7B-GGUF')
  print(engine.generate(max_tokens=40, prompt="What is the meaning of life?"))

If one does not have access to LM Studio or an equivalent, and is accessing GPT from an environment that does not expose fixed API keys (e.g., in a business setting), then passing an external client may still be useful; in fact, that's how I'm using TextGrad on my work network.

kgourgou commented 1 week ago

Removed the duplicated code from the small class by inheriting from ChatOpenAI. The only thing I don't like about it is that it's still using the "cache_openai_{model_string}.db" name for the cache.
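A minimal sketch of the inheritance pattern and the cache-name issue being discussed. The class names here are illustrative stand-ins, not TextGrad's actual implementation: the base class derives the cache filename from the model string with a hard-coded "openai" prefix, which the subclass inherits unchanged.

```python
class ChatOpenAISketch:
    """Stand-in for the base engine class."""

    def __init__(self, model_string):
        self.model_string = model_string
        # The "openai" prefix is baked in here, so every subclass
        # inherits it regardless of which backend actually serves the model.
        self.cache_path = f"cache_openai_{model_string}.db"


class ExternalClientSketch(ChatOpenAISketch):
    """Subclass that only swaps in an externally constructed client."""

    def __init__(self, client, model_string):
        super().__init__(model_string)
        self.client = client  # OpenAI-API-compatible client, built by the user


engine = ExternalClientSketch(client=None, model_string="local-model")
print(engine.cache_path)  # cache_openai_local-model.db
```

Inheriting removes the duplicated request logic, at the cost of also inheriting the cache naming; fixing that in the base class (e.g., making the prefix a parameter) would resolve it for all subclasses at once.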

vinid commented 1 week ago

I think that is fine; we should probably fix this in the parent class to handle engine caching better. If this looks good to you, I can merge!

kgourgou commented 1 week ago

Cheers, sounds good to me. I just added a bit more documentation to the class.

Looks like I can't merge it myself.

vinid commented 1 week ago

yea main is blocked, doing it now. Thanks for the contribution!

kgourgou commented 1 week ago

Thanks for checking!