
Request for Comment: Aviary <-> LangChain Integration #4

Closed: waleedkadous closed this issue 1 year ago

waleedkadous commented 1 year ago

Purpose

Aviary is an open source LLM management toolkit built on top of Ray Serve and Ray. LangChain is an incredibly popular open source toolkit for building LLM applications. The question is: how do these two fit together?

Possible integrations

This first integration focuses on LLMs (non-chat). When we add streaming to Aviary, we will integrate that for the chat application too.

There are 3 possible integration points with LangChain.

1. Aviary as an LLM provider for LangChain

Make Aviary a model backend for LangChain (the same way that OpenAI is supported today).

This would enable you to do things like:

import os

from langchain.llms import Aviary

aviary_url = os.environ["AVIARY_URL"]
# The token is optional; .get() returns None if it is unset
aviary_token = os.environ.get("AVIARY_TOKEN")

llm = Aviary(
    model_name="amazon/LightGPT",
    aviary_url=aviary_url,
    aviary_token=aviary_token,
)

# Single query
llm.predict("How do you make fried rice?")

# Uses Aviary's batch interface for greater efficiency
llm.generate(["How do you make fried rice?", "What are the most influential punk bands?"])

The only real decision here is whether we use our SDK or allow direct connection to our endpoints. Since our web API is currently so simple, it might be easier to code against it directly in the short term, and adopt the SDK when that is justified.
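
For illustration, here is a minimal sketch of the direct-connection option. The /query/<model> route and JSON payload shape are assumptions made up for this example, not the documented Aviary wire format:

import os

import requests

aviary_url = os.environ["AVIARY_URL"]
aviary_token = os.environ.get("AVIARY_TOKEN")

# Send bearer auth only if a token is configured
headers = {"Authorization": f"Bearer {aviary_token}"} if aviary_token else {}

# NOTE: the /query/<model> route and {"prompt": ...} body are illustrative
# assumptions, not the actual Aviary API.
response = requests.post(
    f"{aviary_url}/query/amazon/LightGPT",
    headers=headers,
    json={"prompt": "How do you make fried rice?"},
    timeout=60,
)
response.raise_for_status()
print(response.json())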

2. Aviary “wraps” LLMs provided by LangChain

Allow any model supported by LangChain to be wired up through Aviary (the same way that Aviary currently “wires up” Hugging Face). This would give a way for centrally managed Aviaries to control access to models from OpenAI and to impose additional limits on length.

For every model you want to wrap, you would add a config file to the models/ directory at https://github.com/ray-project/aviary/tree/master/models. We would expand that file format to support LangChain LLMs as well, as sketched below.
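
As a rough sketch, an expanded entry might look like the following. Every field name here is a hypothetical extension of the existing format, not the actual Aviary schema:

# Hypothetical models/ entry wiring a LangChain LLM through Aviary.
# All field names are illustrative assumptions, not the real config schema.
model_id: openai/gpt-3.5-turbo
provider: langchain                 # assumed new field selecting a LangChain backend
langchain_llm_class: langchain.llms.OpenAI
generation:
  max_tokens: 512                   # example of a centrally imposed length limit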

3. Integrate LangChain LLM support directly into Aviary Explorer and Aviary CLI

Allow users to query any model supported by LangChain directly. This would be useful, for example, for cross OSS <-> commercial comparisons, e.g. against GPT-3.5-turbo.

What we would do there is allow Aviary CLI to do something like this:

aviary query --model amazon/LightGPT --model model-configs/langchain-openai-gpt-35.yaml examples/qa-prompts.txt

In the aviary command, we would recognize identifiers like openai://gpt-3.5-turbo and use the LangChain OpenAI LLM, allowing for cross evaluation.
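
A minimal sketch of that dispatch logic, assuming a hypothetical resolve_llm helper (the function and its name are illustrative, not actual Aviary code):

from langchain.llms import OpenAI

# Hypothetical helper: route "openai://" identifiers to LangChain's OpenAI
# wrapper; anything else would fall through to Aviary-hosted model lookup.
def resolve_llm(model_id: str):
    prefix = "openai://"
    if model_id.startswith(prefix):
        # e.g. "openai://gpt-3.5-turbo" -> LangChain OpenAI LLM
        return OpenAI(model_name=model_id[len(prefix):])
    raise NotImplementedError(f"Aviary-hosted lookup for {model_id}")

llm = resolve_llm("openai://gpt-3.5-turbo")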

We would have to add new functionality to Aviary Explorer to support adding arbitrarily configured LangChain LLMs.

In essence, the difference between proposals #2 and #3 is where the config files for specifying LLM properties live.

Decision

We are not limited to doing one of these.

The most immediate need and highest impact is perhaps #1.

#2 and #3 are similar in many ways. Perhaps #3 is more impactful, but the Aviary Explorer changes are more complicated. It's slightly ugly in the sense that we would then have YAML files both on the Aviary backend and in the Aviary CLI and Explorer.

lpfhs commented 1 year ago

The approach FastChat has taken for LangChain integration is to provide an OpenAI-compatible API for models hosted on FastChat. I don't know if that's a good approach since it does introduce an extra layer of API, but it avoids having to add support for Aviary in LangChain. You also get ready support for Aviary models in any applications currently using the OpenAI API or LangChain.
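
For reference, that pattern looks roughly like this with the pre-1.0 openai-python client; the server URL and model name are placeholders:

import openai

openai.api_key = "EMPTY"  # OpenAI-compatible local servers typically ignore the key
openai.api_base = "http://localhost:8000/v1"  # placeholder server URL

completion = openai.Completion.create(
    model="amazon/LightGPT",  # placeholder model name
    prompt="How do you make fried rice?",
)
print(completion.choices[0].text)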

waleedkadous commented 1 year ago

Great suggestion!

We looked at this, but it turned out to be too limiting (e.g. Aviary supports optimized batching, whereas OpenAI's GPT-3.5-Turbo interface does not).

But we are planning on supporting the OpenAI wire format with our endpoints. We just need some time.

hwchase17 commented 1 year ago

@waleedkadous Agreed that #1 seems the easiest/highest priority. Happy to help with that in any way.

waleedkadous commented 1 year ago

PR for option 1 at https://github.com/hwchase17/langchain/pull/5661 (@hwchase17 jfyi). Option 3 merged at https://github.com/ray-project/aviary/commit/8e4e965bb19e7944f9687d9b89b4e47d4aa069d0

XBeg9 commented 9 months ago

Just a sample like this:

from langchain.llms import Aviary

llm = Aviary(model='TheBloke/Llama-2-70B-chat-AWQ', aviary_url="http://localhost:8000/v1", aviary_token="EMPTY")
output = llm('How do you make fried rice?')

Gives me an error: "http://localhost:8000/v1 does not support model TheBloke/Llama-2-70B-chat-AWQ" (type=value_error). Any ideas what I am doing wrong? @waleedkadous