McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License
570 stars · 49 forks

arm64 support #81

Closed · otomay closed this issue 5 months ago

otomay commented 5 months ago

Hey o/

Amazing project! Do you plan to add arm64 support? A lot of media servers are hosted on RPis.

I tried `docker compose up` with the standard image and with the :cpu one:

 ! subgen The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested 0.0s 
Attaching to subgen
subgen  | exec /bin/bash: exec format error
subgen exited with code 1
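For what it's worth, explicitly requesting a platform in compose, roughly as below, doesn't fix this on its own; the image still needs an actual arm64 build to run:

```yaml
services:
  subgen:
    image: mccloud/subgen:cpu
    # explicitly request the arm64 variant; only useful once an arm64 image is actually published
    platform: linux/arm64
```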
McCloudS commented 5 months ago

It wouldn't be worth the effort; you would be CPU- and RAM-limited. Running something like this on an RPi would in all likelihood take tens of hours on a 20-minute file. If you want to give it a shot though, I can try pushing an image in a bit.

McCloudS commented 5 months ago

Docker image should be up soon, you'll see it @ https://hub.docker.com/r/mccloud/subgen/tags. You should just be able to re-pull with :cpu again and it should grab the correct ARM one. I don't have an ARM machine to test, so it's all on you. I'm not sure whether all the packages have ARM versions or not.
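Roughly, once the tag is rebuilt, a re-pull should pick up the right architecture automatically (the service name depends on your compose file):

```bash
# re-pull the :cpu tag so Docker grabs the arm64 variant, then recreate the container
docker compose pull subgen
docker compose up -d subgen
```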

Edit: Investigating some issues with matrixed builds.

McCloudS commented 5 months ago

The arm64 image built, but I'm skeptical of the build size. Give it a shot. armv6 and armv7 failed, probably due to missing packages for those architectures.

otomay commented 5 months ago

Wow, that was fast! Thanks, man!

I got no errors in the logs, but I can't access the UI at localhost:9001. I changed `- "WEBHOOKPORT=9001"` and the `ports:` mapping. Container logs:

subgen.env file not found. Please run prompt_and_save_env_variables() first.
subgen.py exists and UPDATE is set to False, skipping download.
Launching subgen.py
File subgen.env not found. Environment variables not set.
File subgen.env not found. Environment variables not set.
INFO:root:Subgen v2024.4.10.72
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 2 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cpu to encode
INFO:root:Using faster-whisper
INFO:     Started server process [8]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9001 (Press CTRL+C to quit)

Anyways, I set the variables in the docker compose file. I set it up in Bazarr as a provider, but I don't know how to check if it's working. Will it show some logs when Bazarr sends a request to transcribe?

McCloudS commented 5 months ago

You should be good to go if you set: "9001:9001"
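For reference, something roughly like this in your compose file should expose the UI (service name and layout assumed from your logs; WEBHOOKPORT is the value you already set):

```yaml
services:
  subgen:
    image: mccloud/subgen:cpu
    environment:
      # port the subgen web server listens on inside the container
      - WEBHOOKPORT=9001
    ports:
      # host:container - must match WEBHOOKPORT for localhost:9001 to reach the UI
      - "9001:9001"
```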

It will log anything from Bazarr; your best bet is trying a manual search and selecting whisper (always at the bottom of the list).

otomay commented 5 months ago

> You should be good to go if you set: "9001:9001"

Oh, cool, it worked. I started a manual download and it seems to be working:

Transcribe:   9% 112.52/1298.05 [05:57<1:01:48,  3.13s/sec]INFO:root:Force Update...

I think it would take about an hour for a 20-minute episode. But I think it's transcribing for every desired language, because I only requested 1 manual episode and I see more than one transcription running in the container logs.

Bazarr desired languages: [image]

Bazarr logs show just one request to whisper. Is that expected?

McCloudS commented 5 months ago

The queuing and threading are a little wonky, so what gets displayed can vary. What probably happened is that Bazarr auto-queued other stuff that met its criteria.

Any other updates? I'm a little stunned it's working. Was the image really only ~800mb?

otomay commented 5 months ago

Oh, I see. IDK though. In my tests, Bazarr rated whisper at 66% and my minimum score is 70%. [image]

I reduced the concurrency to 1 and the threads to 2, and the logs are easier to read:

INFO:root:Transcribing file from Bazarr/ASR webhook
INFO:faster_whisper:Processing audio with duration 21:32.245
Detected Language: english
Transcribing with faster-whisper (medium)...

Transcribe:   0% 0/1292.25 [00:00<?, ?sec/s]
Transcribe:   0% 4.64/1292.25 [00:22<1:43:30,  4.82s/sec]INFO:root:Force Update...
Transcribe:   2% 30.02/1292.25 [00:50<32:02,  1.52s/sec] INFO:root:Force Update...
Transcribe:   4% 54.7/1292.25 [01:31<32:52,  1.59s/sec] INFO:root:Force Update...
Transcribe:   6% 81.84/1292.25 [02:16<32:59,  1.64s/sec]INFO:root:Force Update...
Transcribe:   9% 110.54/1292.25 [03:10<34:15,  1.74s/sec]INFO:root:Force Update...
[...]
Transcribe: 100% 1290.06/1292.25 [31:09<00:03,  1.75s/sec]INFO:root:Force Update...
INFO:root:Bazarr transcription is completed, it took 31 minutes and 47 seconds to complete.

(Better results with less concurrency: roughly 35 minutes to transcribe a 20-minute file; my settings are in the snippet below.) After this manual one finished, no other transcriptions started. IDK what happened before.
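For reference, my compose settings now look roughly like this (the concurrency and thread variable names are the ones I took from subgen's README, so double-check them there):

```yaml
services:
  subgen:
    image: mccloud/subgen:cpu
    environment:
      - WEBHOOKPORT=9001
      # variable names assumed from subgen's README; verify them there
      - CONCURRENT_TRANSCRIPTIONS=1   # run one transcription at a time
      - WHISPER_THREADS=2             # two CPU threads per transcription
    ports:
      - "9001:9001"
```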

> Any other updates? I'm a little stunned it's working. Was the image really only ~800mb?

Hehehe, I'm glad it is! Thanks for helping! The image is about 2 GB:

> docker image ls | grep subgen
mccloud/subgen                                         cpu            cb67f7be3fdf   6 hours ago    1.98GB

I tested on Jellyfin. It shows up as "SUBRIP - External". Perfect syncing and transcription.

McCloudS commented 5 months ago

From https://wiki.bazarr.media/Additional-Configuration/Whisper-Provider/: the minimum score must be lowered if you want whisper-generated subtitles to be automatically "downloaded", because they have a fixed score, which is 241/360 (~67%) for episodes and 61/120 (~51%) for movies. That ~67% is the 66% you saw, and it's below your 70% minimum, so you shouldn't be getting anything sent to Whisper beyond your manual searches.

Are you running an RPi or some other NAS?

otomay commented 5 months ago

That explains it, thanks!

I'm running on an Ampere A1 Compute instance from OCI. IDK how much CPU or RAM Subgen uses, because I have a lot of things self-hosted on it. But I can check tomorrow if you want :P

Also, I think this issue can be closed if you wish. TYSM!