
vLLM Provider #2231

Open · PeytonCleveland opened 1 month ago

PeytonCleveland commented 1 month ago

Feature Description

Overview

The AI SDK supports numerous commercial providers such as OpenAI and Anthropic. However, there are many instances where a self-hosted inference server is more appropriate or necessary due to security or data privacy concerns. vLLM is a popular choice for these use cases, and SDK support would greatly simplify building generative AI applications that combine RSCs with self-hosted inference servers.

vLLM OpenAI-compatible API server

vLLM exposes an OpenAI-compatible API, meaning the implementation here should look very similar to the existing OpenAI provider: vLLM Docs
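For illustration, here's a minimal sketch of calling that endpoint directly, assuming a vLLM server running locally on the default port 8000 and serving a placeholder model ID (adjust both for your deployment):

```ts
// Minimal sketch: call vLLM's OpenAI-compatible chat completions endpoint directly.
// The URL and model ID are assumptions; substitute your own server and model.
const response = await fetch('http://localhost:8000/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'mistralai/Mistral-7B-Instruct-v0.2', // whatever model vLLM is serving
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});

const completion = await response.json();
console.log(completion.choices[0].message.content);
```

Since the request and response shapes match OpenAI's, an SDK provider should be able to reuse most of the existing OpenAI provider's plumbing.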

Use Case

Self-Hosted Models

SDK users wishing to integrate a self-hosted inference server into their applications would benefit greatly from support for vLLM. Many domains that may wish to make use of generative AI, such as government, military, healthcare, and finance, have strict requirements around protecting data and thus necessitate the use of self-hosted solutions.

RSCs + GenAI

RSCs and the AI SDK can greatly simplify and reduce the effort needed to build generative AI applications. However, without support for self-hosted models, the number of teams able to make use of this SDK will be limited. Adding this support would make the AI SDK an ideal solution for those using Next.js, RSCs, and self-hosted infrastructure.

Additional context

Background

I've implemented generative AI features in a number of applications within the public sector. Currently, getting all the pieces working well together is a major headache, and the architecture can get quite convoluted. All my projects use Next.js, and while there are a number of great projects out there like LangChain and LlamaIndex, the JS versions always lag behind the Python versions and are missing support for things like vLLM. This project seems like an ideal solution for those using Next.js and exactly what I've been wishing for; it just needs vLLM support for me to be able to use it 😄

lgrammel commented 1 month ago

If vLLM is OpenAI-compatible, you can use the OpenAI provider with custom settings: https://sdk.vercel.ai/providers/ai-sdk-providers/openai#provider-instance
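For anyone finding this later, here's a minimal sketch of that approach, assuming a local vLLM server on port 8000 and a placeholder model ID (vLLM doesn't require an API key unless started with its `--api-key` flag, but the provider setting expects a string):

```ts
import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

// Point the OpenAI provider at the vLLM server instead of api.openai.com.
// The baseURL and model ID below are assumptions for a local deployment.
const vllm = createOpenAI({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'not-needed', // placeholder; vLLM ignores it unless --api-key is set
});

const { text } = await generateText({
  model: vllm('mistralai/Mistral-7B-Instruct-v0.2'),
  prompt: 'Hello!',
});

console.log(text);
```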