alexzfan opened 9 months ago
@alexzfan I'm okay with the optional max_token validation.
I haven't heard of vLLM before. What kind of integration were you thinking of?
@andreibondarev I was thinking of a wrapper around the OpenAI class that's currently in the library. It seems the Python package does this as well: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/vllm.py
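A rough sketch of what that wrapper could look like, assuming it mirrors the Python integration by subclassing the gem's existing OpenAI class and pointing the underlying client at a vLLM server's OpenAI-compatible endpoint. The class name, the `url:` parameter, and the use of `llm_options: { uri_base: ... }` (forwarded to the ruby-openai client) are all my assumptions, not an existing API:

```ruby
require "langchain"

module Langchain
  module LLM
    # Hypothetical vLLM wrapper: reuses the OpenAI class but targets
    # a self-hosted vLLM server instead of api.openai.com.
    class VLLM < OpenAI
      def initialize(url:, api_key: "EMPTY", default_options: {})
        # vLLM's server does not validate the API key, so any
        # placeholder value works.
        super(
          api_key: api_key,
          llm_options: { uri_base: url },
          default_options: default_options
        )
      end
    end
  end
end
```

The main thing such a wrapper would still need to handle is skipping (or making optional) the token-limit validation, since vLLM-served model names are not in the gem's table of known OpenAI models.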
vLLM offers an OpenAI-compatible inference server endpoint, which lets it be dropped into applications that currently speak the OpenAI protocol. This has been integrated into the Python LangChain package (https://python.langchain.com/docs/integrations/llms/vllm), but not the Ruby gem. Hacking around it by using the base OpenAI constructor, such as below, leads to token length/limit validation errors.
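For reference, a minimal sketch of that workaround, assuming the `uri_base` option of the underlying ruby-openai gem is used to redirect the client; the URL and model name are placeholders:

```ruby
require "langchain"

# Point the stock OpenAI client at a local vLLM server.
llm = Langchain::LLM::OpenAI.new(
  api_key: "EMPTY", # vLLM does not check the key
  llm_options: { uri_base: "http://localhost:8000/v1" },
  default_options: { completion_model_name: "meta-llama/Llama-2-7b-hf" }
)

# This is where the validation errors occur: the gem looks up the
# model name in its table of known OpenAI models to validate token
# limits, and vLLM-served model names are not in that table.
llm.complete(prompt: "Hello")
```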
Are there any plans to integrate this fully, or has anyone found a different solution to this?