machinewrapped / gpt-subtrans

Open Source project using LLMs to translate SRT subtitles

LM Studio's Local Inference Server support request #103

Closed likangkao closed 3 months ago

likangkao commented 5 months ago

Please add LM Studio's Local Inference Server support, so we can use our own computer to simulate the same function of GPT in order to save money. Thanks! You may get some idea from this link: https://www.youtube.com/watch?v=IgcBuXFE6QE

machinewrapped commented 5 months ago

Hi, I've started a branch to extend the program to support different providers. I'm using OpenAI Azure as a test case, but the plan is to then support https://github.com/BerriAI/litellm, which will mean you can use any of the endpoints they support - including locally hosted models.

It's quite a big job, I was hoping to make progress over Christmas but I didn't get very far so it might be a while before it's ready.

It will be interesting to see how good any of the models that are small enough to run locally are for translation - one of the reasons GPT is so good at translation is that it knows a lot of cultural context so it can translate idiomatic language quite well. I think the smaller models are more focused on general reasoning, so they might be less able to translate semantics. Definitely worth finding out though!

likangkao commented 5 months ago

Thanks for your effort. Do you have any plan to add Google Gemini Pro API support?

machinewrapped commented 5 months ago

Yes, definitely - translation is the one area in which even Bard with PALM2 is competitive with ChatGPT, so Gemini should be a good fit.

likangkao commented 5 months ago

Nice!

xpufx commented 4 months ago

Doesn't the configurable OpenAI base URL already enable using other OpenAI-compatible APIs (LM Studio, koboldcpp, text-generation-webui, etc.)? I was meaning to test this, but the models I have access to are really bad at translation (7B, CPU inference).

xpufx commented 3 months ago

```
python gpt-subtrans.py --apibase http://10.0.10.165:5000/v1 ....

INFO: Translating with OpenAI model gpt-3.5-turbo-0125, Using API Base: http://10.0.10.165:5000/v1
INFO: Translating 869 lines in 6 scenes
INFO: HTTP Request: POST http://10.0.10.165:5000/v1/chat/completions "HTTP/1.1 200 OK"
```

This seems to work. The URL is text-generation-webui's OpenAI-compatible API server, which is enabled with the --api flag. (I didn't check whether it actually produces a translation file, or whether the quality of said translation is acceptable. I don't have a good model for it.)
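Any OpenAI-compatible server exposes the same `/v1/chat/completions` route shown in the log above, so it can be queried with nothing but the standard library. A minimal sketch (not gpt-subtrans's actual code; the helper names are hypothetical, and the endpoint/model are just the values from this thread):

```python
import json
from urllib import request


def build_translation_request(api_base, model, lines, target_language="English"):
    """Build the JSON payload for a chat/completions call (hypothetical helper)."""
    prompt = "Translate these subtitles into {}:\n{}".format(
        target_language, "\n".join(lines))
    return {
        "url": api_base.rstrip("/") + "/chat/completions",
        "payload": {
            "model": model,  # many local servers ignore this field
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.0,
        },
    }


def send(req):
    """POST the payload - only works if a compatible server is actually running."""
    http_req = request.Request(
        req["url"],
        data=json.dumps(req["payload"]).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(http_req) as resp:
        return json.load(resp)


req = build_translation_request(
    "http://10.0.10.165:5000/v1", "gpt-3.5-turbo-0125",
    ["Bonjour.", "Comment ça va ?"])
print(req["url"])  # http://10.0.10.165:5000/v1/chat/completions
```

Since the request shape is identical regardless of backend, swapping LM Studio for text-generation-webui is just a matter of changing the base URL.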

machinewrapped commented 3 months ago

Great - I'll add a note to the documentation.

machinewrapped commented 3 months ago

Finally added native support for LM Studio, or any service with an OpenAI-compatible API. It's not much different from using OpenAI with a custom endpoint, but it is a bit more configurable, and doesn't require specifying an API key or model name.

https://github.com/machinewrapped/gpt-subtrans/releases/tag/v0.6.7

My experiences so far have not been encouraging. I've tried a number of models that can run on my RTX 3080 and found they tend to get confused and keep generating spurious tokens indefinitely. Keeping the batch size very small (around 10 lines) seems to be key, but even then it's very hit or miss whether the response is usable. Maybe somebody with more experience working with these models can suggest a prompt strategy to encourage them to behave.

Let me know in the discussions if you find a setup that works well! https://github.com/machinewrapped/gpt-subtrans/discussions/161