brainlid / langchain

Elixir implementation of a LangChain style framework.
https://hexdocs.pm/langchain/

add Replicate API option #4

Open · benswift opened this issue 9 months ago

benswift commented 9 months ago

Even though it's just OpenAI for now the code is nice and modular and obviously extensible to other hosted LLM providers (🙌🏻).

I'm not sure if there's a roadmap somewhere that I've missed, but Replicate might be a good option for the next "platform" to be added. It's one place that Meta are putting up their various Llama models. However, I think it'd only support the LangChain.Message stuff - there's no function call support in those models as yet.

I'd be open to putting together a PR to add replicate support (their official Elixir client lib uses httpoison, so I guess it'd be better to just call the Replicate API directly using Req).
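
For concreteness, hitting their predictions endpoint directly with Req would look something like this - just a sketch, with the version hash and input shape as placeholders based on Replicate's HTTP docs, not anything in this lib:

```elixir
# Hypothetical sketch: create a prediction against Replicate's REST API with Req.
# The version hash and input map are placeholders; see Replicate's HTTP API docs.
token = System.fetch_env!("REPLICATE_API_TOKEN")

Req.post!("https://api.replicate.com/v1/predictions",
  headers: [{"authorization", "Token #{token}"}],
  json: %{
    version: "<model-version-hash>",
    input: %{prompt: "Why is the sky blue?"}
  }
)
# The response body includes URLs to poll, since predictions run asynchronously.
```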

Would you be interested in accepting it? Happy to discuss implementation strategies, because I know the move from single -> multiple platform options introduces some decisions & tradeoffs.

brainlid commented 9 months ago

Good question! My short-list right now is to focus on the LLMs. I want to add support for Llama 2 and Bard.

My next focus (we'll see how it goes) is to add local Llama 2 support through Nx/Bumblebee.

I'm not opposed to Replicate support. I'm not familiar with their API either.

Also, I'd like to upgrade to the latest Req, which changes the internal API for streaming responses back.

warnero commented 9 months ago

I'm looking to do some agent work, so I'm digging into the text processing side of things for the moment.

benswift commented 9 months ago

Hey @brainlid

> Good question! My short-list right now is to focus on the LLMs. I want to add support for Llama 2 and Bard.

Yep, sounds great. I guess the challenge from a lib design perspective is that there's a difference between models and (hosting) platforms - although it's a challenge that's currently masked by the fact that OpenAI is sort of both.

I think I favour the way you've got it currently, organising the code around platforms (because that's what determines the API interface code). Then, between-models-on-the-same-platform differences can be handled within each platform's module (e.g. ChatOpenAI.new! handling things differently based on the provided :model key).
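
To make that concrete, here's a rough sketch of the kind of per-model branching I mean - module, option names and numbers are made up for illustration, not the lib's actual code:

```elixir
# Illustrative only: the platform module owns the API plumbing, and per-model
# quirks are handled internally by branching on the :model key.
defmodule MyApp.ChatPlatformExample do
  @default_attrs %{temperature: 1.0, stream: false}

  def new!(attrs) when is_map(attrs) do
    @default_attrs
    |> Map.merge(attrs)
    |> apply_model_defaults()
  end

  # Hypothetical per-model tweaks, kept private behind the platform's public interface.
  defp apply_model_defaults(%{model: "gpt-4"} = attrs),
    do: Map.put_new(attrs, :max_tokens, 8_192)

  defp apply_model_defaults(%{model: "gpt-3.5-turbo"} = attrs),
    do: Map.put_new(attrs, :max_tokens, 4_096)

  defp apply_model_defaults(attrs), do: attrs
end
```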

> My next focus (we'll see how it goes) is to add local Llama 2 support through Nx/Bumblebee.

Yep, agreed, Bumblebee support is a great way to go.
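
For reference, the Bumblebee side might look roughly like this - repo name and options are placeholders, and the Llama 2 repos on Hugging Face are gated, so an auth token is needed:

```elixir
# Rough sketch of serving a local Llama 2 chat model via Nx/Bumblebee.
# Assumes EXLA is available as the compiler; names and repo are placeholders.
repo = {:hf, "meta-llama/Llama-2-7b-chat-hf", auth_token: System.fetch_env!("HF_TOKEN")}

{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)

serving =
  Bumblebee.Text.generation(model_info, tokenizer, generation_config,
    defn_options: [compiler: EXLA]
  )

Nx.Serving.run(serving, "Why is the sky blue?")
```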

> I'm not opposed to Replicate support. I'm not familiar with their API either.

It's pretty standard; from a user (of this lib) perspective you could set up LangChain.ChatModels.ChatReplicate so that it works just the same as LangChain.ChatModels.ChatOpenAI. You'd just have to limit the supported :models to the Replicate models where it makes sense (e.g. Llama 2 13B chat, or even the Mistral ones that went up yesterday).
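
Something like this, purely as a sketch - the struct fields and validation approach are guesses, not what ChatOpenAI actually does internally:

```elixir
# Hypothetical sketch of a ChatReplicate that mirrors ChatOpenAI's shape but
# only accepts models known to work with chat-style messages on Replicate.
defmodule LangChain.ChatModels.ChatReplicate do
  @supported_models [
    "meta/llama-2-13b-chat",
    "mistralai/mistral-7b-instruct-v0.1"
  ]

  defstruct model: nil, temperature: 0.7, stream: false

  def new!(%{model: model} = attrs) when model in @supported_models do
    struct!(__MODULE__, attrs)
  end

  def new!(%{model: model}) do
    raise ArgumentError,
          "unsupported Replicate model #{inspect(model)}; expected one of #{inspect(@supported_models)}"
  end
end
```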

> Also, I'd like to upgrade to the latest Req, which changes the internal API for streaming responses back.

Also 👍🏻

Anyway, I know you don't need me internet quarterbacking this whole thing, and I'm sure you're aware of all the above challenges. Just wanted to see what the plans were in case there was an overlap between the way you wanted to take things and my ability to contribute 😄

brainlid commented 9 months ago

> from a user (of this lib) perspective you could set up LangChain.ChatModels.ChatReplicate so that it works just the same as LangChain.ChatModels.ChatOpenAI. You'd just have to limit the supported :models to the Replicate models where it makes sense (e.g. Llama 2 13B chat, or even the Mistral ones that went up yesterday).

Yes, that's the idea. That one module, plus the protocols, are used to adapt a specific service like Replicate to the rest of the library. That way nothing else in the library needs to know about how different services work or what they support.
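
Roughly, I think of the boundary like this - the protocol name and callback here are made up for illustration, and it assumes a ChatReplicate struct along the lines you sketched above:

```elixir
# Purely illustrative: a service-specific module implements a shared contract,
# so the rest of the library never deals with Replicate/OpenAI specifics.
defprotocol MyApp.ChatAdapter do
  @doc "Send a list of messages to the backing service and return the reply."
  def call(chat_model, messages, opts)
end

defimpl MyApp.ChatAdapter, for: LangChain.ChatModels.ChatReplicate do
  def call(%{model: model} = _chat_model, messages, _opts) do
    # Translate library messages into the service's request format here.
    {:ok, "stubbed response from #{model} given #{length(messages)} messages"}
  end
end
```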

Thanks for the Replicate API docs link.

> Just wanted to see what the plans were in case there was an overlap between the way you wanted to take things and my ability to contribute 😄

So is there an overlap? 🙂

benswift commented 9 months ago

> So is there an overlap? 🙂

Yep, there is 😉 I'm on parental leave atm so finding time is a bit tricky (it might be a week or two) but it's a nice bite-sized chunk of work that I'd be happy to contribute.

benswift commented 8 months ago

Hey @brainlid, have a look at the Replicate stuff I pushed up here. tl;dr: it works for a limited subset of features (no streaming, no functions) for now. My replicate branch includes basic tests for the new Replicate code, although I haven't obsessed over covering all the corner cases (the origin/main test suite isn't green atm either, and I figured the project is still moving pretty fast).