SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

Alternate Translation using LM studio api #7704

Closed: fznx922 closed this issue 7 months ago

fznx922 commented 11 months ago

Hey mate, love the work you put out; I really love using this application with Whisper.

I have been having good luck generating subs, and I'd love to be able to point the subtitle translation at the LM Studio API, as I've had good luck with it producing natural translations. I have tried to modify the ChatGPT endpoint, since this program uses a "clone" of that API, but haven't had much luck getting it to work.

If that's something that could be done easily, that would be awesome; it would give the ability to try out all kinds of different LLM models and expand local capabilities.

An example curl request follows:

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "Introduce yourself." }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": false
  }'
```

Again, thanks so much for what you do, it's much appreciated 👍

niksedk commented 11 months ago

Do you have some links to the API and how to install locally?

fznx922 commented 11 months ago

Hey bud, thanks for the reply,

So the application itself has the ability to run LLMs like Llama etc.; it has a built-in local server API that is similar to OpenAI's. I have been able to write some basic Python programs that digest a subtitle and translate it line by line, but having something like that built into your application via the auto-translate feature would be amazing.
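Something along these lines (a minimal sketch, not my exact script; the port, prompt wording and language pair are just examples, and it needs the `requests` package):

```python
# Minimal sketch: translate one subtitle line at a time via LM Studio's
# OpenAI-compatible endpoint. Port, prompt wording and languages are
# illustrative assumptions, not the exact script mentioned above.
import requests

API_URL = "http://localhost:1234/v1/chat/completions"

def translate_line(text, source="Japanese", target="English"):
    payload = {
        "messages": [
            {"role": "system", "content": "You are a subtitle translator."},
            {
                "role": "user",
                "content": f"Translate from {source} to {target}, "
                           f"reply with the translation only:\n{text}",
            },
        ],
        "temperature": 0.7,
        "max_tokens": -1,
        "stream": False,
    }
    response = requests.post(API_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"].strip()

if __name__ == "__main__":
    print(translate_line("猫が好きです。"))
```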

The application is linked here; it's a basic Windows installer: https://github.com/lmstudio-ai

As for the API interface, they have example use cases here: https://github.com/lmstudio-ai/examples

Not sure if that helps at all; hopefully it does.

Also, here's a screenshot of the local server API in the app:

[screenshot]

thanks :)

niksedk commented 11 months ago

Hi,

You might be able to use the latest beta: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.2/SubtitleEditBeta.zip
Go to Options - Settings - Auto-translate and point the ChatGPT URL to your local machine.

The examples are archived...

fznx922 commented 11 months ago

Hey,

So I've tried pointing SE at the server, but it seems that once I do, I get this error:

```
Response status code does not indicate success: 401 (Unauthorized).
   at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
   at Nikse.SubtitleEdit.Core.AutoTranslate.ChatGptTranslate.<Translate>d__18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Nikse.SubtitleEdit.Forms.Translate.MergeAndSplitHelper.<MergeAndTranslateIfPossible>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Nikse.SubtitleEdit.Forms.Translate.AutoTranslate.d__21.MoveNext()
```

When going back into the options to check, it seems like it defaults the URL back to OpenAI's address?

Looking at the LM Studio console, it doesn't seem to receive anything, so I'm not sure if it's still trying to call OpenAI? I get the same error if I just try to run it as standard, if that makes sense.

If I can give any other information or logs etc., please let me know. Thanks so much 👍

niksedk commented 11 months ago

Thx, you're right - SE did not save the ChatGPT URL in the settings window... should be fixed now: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.2/SubtitleEditBeta.zip

fznx922 commented 11 months ago

Awesome! Yeah, that works just as expected now. Updating it to the localhost URL now spits out translations via the API; I just had to put a "." in the API key section, as it expects something to be in there, but it's generating and it's awesome.

The way language models can figure out context in subtitles produces more natural-sounding translations when translating from Japanese, and it can all be leveraged on a local PC instead of an online service.

Thank you so much for your efforts!

niksedk commented 11 months ago

Latest beta no longer requires an API key for ChatGPT to start: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.2/SubtitleEditBeta.zip

How did you get the API up and running? And is this the API? https://github.com/lmstudio-ai/examples/tree/main/interstitial_API

fznx922 commented 11 months ago

Hey mate,

So the API functionality is built into the program; it states in the program that it's an HTTP server that mimics the OpenAI API.

After going to the server tab and pressing Start, and pointing the URL to http://localhost:1234/v1/chat/completions, that's all Subtitle Edit needed to start plugging away.
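To double-check the server is reachable before pointing SE at it, something like this works (a sketch assuming the default port; /v1/models is the endpoint LM Studio lists for loaded models):

```python
# Quick reachability check before pointing Subtitle Edit at the server.
# Assumes LM Studio's default port 1234; /v1/models lists loaded models.
import requests

resp = requests.get("http://localhost:1234/v1/models", timeout=10)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model.get("id"))
```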

I've included a screenshot of it working; so far I've experimented with a few different models to see how the quality is.

I can send you a server log too if you're curious. When you send a request via the translate function, do you include a default message like "please translate xxx into xxx" or something in the API call?

thank you :)

[screenshot]

niksedk commented 10 months ago

Do you know of a model that gives good results?

I was able to translate one time via SE, but then I (probably) changed something, and it's not working now...

TheNeObr commented 9 months ago

Guys, any progress on LM Studio? I can download the beta version to run some tests here...

fznx922 commented 9 months ago

So I was having varied results using the LM Studio API. Recently I switched to a program called JAN, which provides a similar OpenAI-compatible API, and it's been running solid. I've been testing all kinds of different models for my use case (Japanese) and running the translations past the missus to check which one works better.

TheNeObr commented 8 months ago

> So I was having varied results using the LM Studio API. Recently I switched to a program called JAN, which provides a similar OpenAI-compatible API, and it's been running solid. I've been testing all kinds of different models for my use case (Japanese) and running the translations past the missus to check which one works better.

What version do you use? 4.0.3 does not work.

gericho commented 7 months ago

> Hey mate,
>
> So the API functionality is built into the program; it states in the program that it's an HTTP server that mimics the OpenAI API.
>
> After going to the server tab and pressing Start, and pointing the URL to http://localhost:1234/v1/chat/completions, that's all Subtitle Edit needed to start plugging away.
>
> I've included a screenshot of it working; so far I've experimented with a few different models to see how the quality is.
>
> I can send you a server log too if you're curious. When you send a request via the translate function, do you include a default message like "please translate xxx into xxx" or something in the API call?
>
> thank you :)
>
> [screenshot]

I'm not able to reproduce your success with 4.0.4. I started the server, tested with the browser, and got this in the log:

```
[2024-03-31 10:46:24.061] [INFO] [LM STUDIO SERVER] Verbose server logs are ENABLED
[2024-03-31 10:46:24.066] [INFO] [LM STUDIO SERVER] Success! HTTP server listening on port 1234
[2024-03-31 10:46:24.066] [INFO] [LM STUDIO SERVER] Supported endpoints:
[2024-03-31 10:46:24.066] [INFO] [LM STUDIO SERVER] -> GET  http://localhost:1234/v1/models
[2024-03-31 10:46:24.067] [INFO] [LM STUDIO SERVER] -> POST http://localhost:1234/v1/chat/completions
[2024-03-31 10:46:24.067] [INFO] [LM STUDIO SERVER] -> POST http://localhost:1234/v1/completions
[2024-03-31 10:46:24.067] [INFO] [LM STUDIO SERVER] Model loaded: TheBloke/dolphin-2.7-mixtral-8x7b-GGUF/dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf
[2024-03-31 10:46:24.067] [INFO] [LM STUDIO SERVER] Logs are saved into C:\tmp\lmstudio-server-log.txt
[2024-03-31 10:57:52.796] [ERROR] Unexpected endpoint or method. (GET /v1). Returning 200 anyway
[2024-03-31 10:57:52.931] [ERROR] Unexpected endpoint or method. (GET /favicon.ico). Returning 200 anyway
```

4.0.4 is configured to use ChatGPT with the local server specified as http://localhost:1234/v1/chat/completions and lm-studio as the API key:

```
Response status code does not indicate success: 401 (Unauthorized).
   at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
   at Nikse.SubtitleEdit.Core.AutoTranslate.ChatGptTranslate.<Translate>d__18.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Nikse.SubtitleEdit.Forms.Translate.MergeAndSplitHelper.<MergeAndTranslateIfPossible>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Nikse.SubtitleEdit.Forms.Translate.AutoTranslate.<buttonTranslate_Click>d__32.MoveNext()

{
  "error": {
    "message": "Incorrect API key provided: lm-studio. You can find your API key at https://platform.openai.com/account/api-keys.",
    "type": "invalid_request_error",
    "param": null,
    "code": "invalid_api_key"
  }
}
```

Contrary to what is shown in the log when connecting from the browser, there is no trace of a connection to the server from Subtitle Edit 4.0.4... can someone kindly help solve this?

niksedk commented 7 months ago

@gericho: Ah, found a bug, SE did not use the URL from the translate window... should work now: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.4/SubtitleEditBeta.zip

Download a model (like "Open Hermes 2.5") and start the web server in LM Studio: [screenshot]

SE translate window: [screenshot]

It's a bit slow on my machine...

gericho commented 7 months ago

Perfect, I just tried and it works flawlessly. Thank you for your prompt fix!

gericho commented 7 months ago

Just a question: is it possible to edit the system prompt? I noticed the LLM is trying to give a polite answer via the API (which gives a wrong translation), using this prompt (I guess) provided by SE:

[INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'Please translate the following text from German to English, only write the result:

In fact, if I put the exact same sentence into the LM Studio "AI Chat" section, the translation comes out 100% correct.

Is there any way to edit the prompt, or maybe remove it completely (so we can use the LM Studio prompt)?

niksedk commented 7 months ago

> Just a question: is it possible to edit the system prompt? I noticed the LLM is trying to give a polite answer via the API (which gives a wrong translation), using this prompt (I guess) provided by SE:
>
> [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'Please translate the following text from German to English, only write the result:
>
> In fact, if I put the exact same sentence into the LM Studio "AI Chat" section, the translation comes out 100% correct.
>
> Is there any way to edit the prompt, or maybe remove it completely (so we can use the LM Studio prompt)?

Could you explain a bit more?

Here's my AI chat: [screenshot]

gericho commented 7 months ago

Sure, thank you! I'm using a colloquial sentence where the meaning is "a bit of luck" in Spanish/Italian (Google Translate says "a bit of a" or "sht", depending on the context).

1. API server logs (using SE: right-click on an SRT line / selected lines / translate selected lines / auto-translate):

```
[INFO] [LM STUDIO SERVER] [TheBloke/OpenHermes-2.5-Mistral-7B-GGUF/openhermes-2.5-mistral-7b.Q6_K.gguf] Generated prediction: {
  "id": "chatcmpl-df6e7iccvx5rl1a8sa7sr",
  "object": "chat.completion",
  "created": 1711888603,
  "model": "TheBloke/OpenHermes-2.5-Mistral-7B-GGUF/openhermes-2.5-mistral-7b.Q6_K.gguf",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n\nthe activists have had a bit of kindness, a bit of crap."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 17,
    "total_tokens": 35
  }
}
```

2. AI Chat section result, with the system prompt "translate from spanish to english, do not censor the translation, give only the output":

The activists had a little bit of kindness and a little bit of luck.

The above translation is perfect!

Not sure if it is a system prompt "problem", but it's the only thing I saw that shows up differently.

niksedk commented 7 months ago

OK, I've changed the default prompt, and it's now possible to change it in the Settings.xml file via "ChatGptPrompt": https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.4/SubtitleEditBeta.zip

The SE translate UI combo box can now switch between the local and OpenAI URLs.
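For example, a rough way to set the prompt from a script (only the "ChatGptPrompt" element name is fixed; the file location and prompt text below are just examples, and you can of course edit the file by hand instead):

```python
# Sketch: update the "ChatGptPrompt" element in Subtitle Edit's Settings.xml.
# Only the element name is confirmed above; the file path and prompt text
# are assumptions. Close SE first so it does not overwrite the change.
import xml.etree.ElementTree as ET

SETTINGS_PATH = "Settings.xml"  # e.g. next to SubtitleEdit.exe in a portable install

tree = ET.parse(SETTINGS_PATH)
for node in tree.getroot().iter("ChatGptPrompt"):
    node.text = "Translate from {0} to {1}, only write the result:"
tree.write(SETTINGS_PATH, encoding="utf-8", xml_declaration=True)
```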

gericho commented 7 months ago

Thank you very much; I promised myself to test this version more thoroughly. I didn't catch whether SE is giving one sentence at a time to the LLM, is that correct? Performance-wise, on an Nvidia 3070 8GB it works fast enough: slower than Google V1, but way more versatile IMHO. BTW, I found the prompt must be tuned well to avoid inverted translations, at least with OpenHermes 2.5 Q6_K. Here is my prompt, feel free to improve it and update the post:

Translate from {0} to {1}, keep sentences in {1} as they are, do not censor the translation, give only the output without commenting on what you read:

darnn commented 7 months ago

Nothing I can run locally is helpful to me, unfortunately, but in case this is useful to anyone, this is the prompt I use with ChatGPT to translate:

The following is text from subtitles. Translate it into text for English subtitles. Don't use text from previous prompts or say "here's the translation". Provide just the translation itself. Keep the formatting of the text the same, that is to say, the line breaks and empty lines between subtitles. If a line starts mid-sentence, don't capitalize the start of it:

(I just use Export text without altering the formatting and with blank lines between each subtitle, and then split it into small enough chunks that ChatGPT can translate them in a single response.)
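Roughly, that splitting step looks like this (a sketch; blank lines between subtitles are the separator, and the character limit is an arbitrary guess rather than an exact token budget):

```python
# Rough sketch of splitting exported subtitle text into chunks small enough
# for one chat response. Blank lines separate subtitles; max_chars is an
# arbitrary guess, not a measured token limit.
def split_into_chunks(exported_text, max_chars=3000):
    chunks, current = [], ""
    for block in exported_text.split("\n\n"):
        candidate = (current + "\n\n" + block).strip() if current else block
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```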

gericho commented 7 months ago

What I can say is, the key to achieving good translations is undoubtedly providing better SRT input transcriptions (if transcription is required, of course). Currently, I'm using the impressive Faster-Whisper-XXL, which includes an MDX filter in the chain to remove music and background noise from the original track. It works almost flawlessly with Spanish, Italian, and English speech.

niksedk commented 7 months ago

I've updated the beta with the prompt from https://github.com/SubtitleEdit/subtitleedit/issues/7704#issuecomment-2028835822 https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.4/SubtitleEditBeta.zip

Also note that SE includes "Gemini" and the new "Anthropic Claude 3" (I've not tested them though)

niksedk commented 7 months ago

> Nothing I can run locally is helpful to me

@darnn: because of the speed?

darnn commented 7 months ago

@gericho: Unfortunately, the only thing I can run locally is ConstMe. But regardless, I'm taking it as a given that you already have an accurate transcription, otherwise of course the translation wouldn't be as accurate. It's just that from what I've seen playing around with several popular models (e.g. here and here), nothing works well at all when translating into Hebrew, except the biggest ones: ChatGPT, Gemini and Claude. And even they could be much better. But that's just me.

@niksedk: Oh, no, of course they're all very slow when I run them locally. The problem is there are very few I can even get to run at all, I don't have enough VRAM (or possibly even just regular RAM).

gericho commented 7 months ago

@darnn just for the sake of curiosity, can you provide a sentence you have trouble with?

niksedk commented 7 months ago

@darnn: Yeah, a GeForce RTX 4070 8GB would be nice (not to mention NVIDIA A100)

darnn commented 7 months ago

> @darnn just for the sake of curiosity, can you provide a sentence you have trouble with?

Sure. I mean, the ones that have very little support for Hebrew will fail with a simple sentence like: I went to the store and bought an apple, I paid in cash.

Here's some stuff I throw at the ones that seem passable at first, from an episode of The Good Doctor: At lunch, you had a Cobb salad with balsamic vinaigrette, so that's 450 calories ingested, assuming you kept it down. I've just put the finishing touches on a surprise. She's been sleep- and oxygen-deprived for decades. Her apnea really did a number on her pulmonary vessels.

And this is a tall order, but also: More than the fuchsia funnels breaking out of the crabapple tree, more than the neighbor's almost obscene display of cherry limbs shoving their cotton candy-colored blossoms to the slate sky of Spring rains, it's the greening of the trees that really gets to me.

Everything but the three I've mentioned fails to understand what "kept it down" means, for instance; a lot of them have trouble with "really did a number", and so on.

gericho commented 7 months ago

@darnn FYI but unfortunately I don't understand Hebrew so... :)

NLLB

הלכתי לחנות וקניתי תפוח, שילמתי במזומן.

בארוחת הצהריים, שתיהן סלט קוב עם ויניגרט בלטסמי, אז זה 450 קלוריות שנזנו, בהנחה שמרת על זה.

הרגע השמתי את המגעים הסופיים על הפתעה.

היא חסרת שינה וחמצן במשך עשרות שנים.

האפנואה שלה באמת פגעה בכלי הריאות שלה.

יותר מאשר מסלול הפוקסיה שברו מהעץ של קרבפל, יותר מאשר הצגת המנודפת של השכן של קצוות דובדבן דוחפים את פרחי צבעי הקוטון שלהם לשמיים הקסומים של גשם האביב, זה הירוקות של העצים באמת מגיעה לי.

HERMES_Q6

לקחתי את המגש ורחיטתי פירות, הצלחתי לשלם בעדת.

בשעבר מזון, אתה אכל סלדה קוב עם סלט בלסמיק וינגריט, וזה 450 קלוריאות שנלקחו, בנין שאתה רוצה להניח שהם עברו.

הגעתי כמעט לסוף בשיקום נהדר על ספק.

זה צילמת לפני עשורים של אופניה וכוח לאוויר.

הבריכה שלה נפטרה ברקע התפוצזות של סלימים השולחן שלה.

יותר מאוטומאטיות הפוכה שבנהרדל החרבן שבעצמים של זית של האבן, יותר מהציפורים הקרובים של השמאלי בכדורגל של דמעות על האבן של החושפנים של הגשם, זה הצמיחה של העץ שמרגיש לעצוב לי.

DOLPHIN27_Q4

הלכתי לחנות וקניתי תפוח, שכר במזומן.

לאור הארוך מהחציות, אתה אכלת סלט קוב עם בלסמיק וינגר, לכן זה 450 קלוריות שאכלת, משום האתה הוא שאכלתם לעולם.

זה בתוך דקות שעברתי הסתיימתי לעצור המגנונית הלפויה.

היא בעלת חוסים ומטייחים בשכבה למעלה מהדעות זמנית.

התרק היא באופן בהדרגה על הכליות הלביות שלה.

יותר מהשפפות הבוזיות שהוצאו מתוך העץ המדבק, יותר מהתנאים המשוכלים של הגוף השואה שמשאירות את הזכוכיות השכפולות-לבן של הפירות החרי, זה הקדור היותר שאני מתעלם מזה.

gericho commented 7 months ago

@niksedk Kindly correct me if I'm wrong: I just checked the LLM server logs, and they show a very long paragraph being fed in; I thought SE was asking sentence by sentence... This way I'm afraid the LLM can hit the token limit.

CONTENT REMOVED

darnn commented 7 months ago

@gericho I'm afraid the only one of those that's even legible is NLLB, but that too is not exactly sterling. I can go into detail if you'd like, but we'd probably better do it over email or something.

niksedk commented 7 months ago

> @niksedk Kindly correct me if I'm wrong: I just checked the LLM server logs, and they show a very long paragraph being fed in; I thought SE was asking sentence by sentence... This way I'm afraid the LLM can hit the token limit.

ChatGPT supports 2,048 tokens by default. SE adds text of up to 1,500 tokens + the prompt.

Anthropic uses a lower max token count (1,200, I think), so SE allows up to about 900 characters for Anthropic.
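As a rough rule of thumb, one token is about four characters of English text, so 1,500 tokens is roughly 6,000 characters. A quick sketch of estimating whether a batch of lines stays under such a budget (the 4-characters-per-token ratio is only an approximation, not how SE actually counts tokens):

```python
# Back-of-the-envelope check of whether a batch of subtitle lines fits a
# token budget. The 4-characters-per-token ratio is a rough heuristic for
# English text, not how Subtitle Edit actually counts tokens.
def estimated_tokens(lines, chars_per_token=4):
    return sum(len(line) for line in lines) // chars_per_token

batch = ["At lunch, you had a Cobb salad with balsamic vinaigrette.",
         "I've just put the finishing touches on a surprise."]
print(estimated_tokens(batch), "tokens (rough estimate, budget ~1,500)")
```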

gericho commented 7 months ago

Wouldn't it be faster to split them into smaller chunks?

niksedk commented 7 months ago

"LM Studio" now has it's own option, so it's easier to locate + switch between "LM Studio" and "Chat GTP"

[screenshot]