Closed: authurlord closed this issue 6 months ago
Hi @authurlord,
Thank you for your suggestion. To be honest, I am not too familiar with FastChat, so I will have to investigate it further. Regarding vLLM, it will most likely be supported in some form eventually, but so far we have not done any development in this direction.
Just add the possibility to change the openai_base and it will work
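For reference, a minimal sketch of what that looks like with the plain `openai` client pointed at FastChat's OpenAI-compatible server (assuming the pre-1.0 `openai` package; the port, model name, and dummy key are only illustrative defaults):

```python
import openai

# FastChat's OpenAI-compatible server (fastchat.serve.openai_api_server)
# listens on http://localhost:8000/v1 by default; no real API key is needed.
openai.api_base = "http://localhost:8000/v1"
openai.api_key = "EMPTY"

response = openai.ChatCompletion.create(
    model="vicuna-7b-v1.5",  # whatever model the local worker is serving
    messages=[
        {"role": "user", "content": "Classify the sentiment of: 'great product'"}
    ],
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```

If scikit-llm exposed the base URL as a setting, the same request path it already uses for OpenAI would just be redirected to the local server.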
@bacoco yes, this is the most straightforward solution, which we will do for sure.
We were also thinking about some sort of deeper integration, but have not made much progress on that yet.
Resolved with https://github.com/iryna-kondr/scikit-llm/pull/94
Thanks for your great work! Since https://github.com/lm-sys/FastChat can start a local server for llama2/vicuna whose API is quite similar to OpenAI's, would it be possible to support the FastChat API server so that we can run inference against a local API server?
Besides, is there any plan to support batch inference with https://github.com/vllm-project/vllm? Since the examples in tabular data are similar prompts, batch inference with vLLM could speed up the whole process compared to gpt4all.
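For context, this is roughly the kind of offline batch inference vLLM exposes (a sketch only; the model name and prompts below are placeholders):

```python
from vllm import LLM, SamplingParams

# One prompt per table row; vLLM batches them in a single generate() call
# instead of sending one request per row.
prompts = [
    "Classify the sentiment of: 'great product'",
    "Classify the sentiment of: 'arrived broken'",
]
sampling_params = SamplingParams(temperature=0.0, max_tokens=16)

llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # example model
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text.strip())
```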