Open kslen opened 11 months ago
Hi, thanks for raising this!
Do you see log entries like the following from the STT container after speaking?
whisper_init_state: kv self size = 15.75 MB
whisper_init_state: kv cross size = 52.73 MB
Do you see a spike in CPU usage after speaking (for example through htop)?
What are your hardware specs? Is it virtualized or bare metal?
Redis not finding a config is to be expected, but I am surprised by the connection, given that the bot container is instructed to wait for Redis to start first. Can you try pinging Redis by running the following command docker exec discord-live-translator-redis-1 redis-cli PING
? It should return "PONG".
Hello again and thanks for a quick response! :D
I'll start off with your initial question, and no, no such entries can be seen in the log.
Regarding specs; it's a KVM with 1 core and initially 2 GB of memory, but increased to 8 GB see if it was a resource issue, but unfortunately with no change in outcome.
The host hardware is a Ryzen 7 3900x with 128 GB 3200 MHz DDR4 ECC. Memory is well managed and nowhere near capacity. Storage consists of 8x1 TB 7200 RPM 64MB Cache drives using raidz2 with lz4 compression enabled. The guest storage is currently qcow2, so all in all I would like to think that it is close to optimal in its configuration when aiming for performance and based on what it is. :D
I've reviewed the host performance in the period leading up to and including the crash which I just provoked. There's no iodelay of note and as it's my home server, there's not much load due to me and my machines being its only users. During the test i just performed, the hosts CPU utilization peaked at 4%.
Here's a screenshot of the KVMs performance graphs where the tiny spike at the end is when I invited the bot into a voice channel and started talking:
The spike and over all increase in resource utilization in the screenshot above is due to me installing and then starting glances on the KVM to see if it would produce any warnings, which unfortunately it did not as seen here after the crash:
I can't see anything else in journalctl on the host nor guest that strikes me as relevant or would fall into the error category, but will provide output from the same time period if requested. :)
docker container ls after crash:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b3587503758b ghcr.io/wotschofsky/discord-live-translator "docker-entrypoint.s…" 18 hours ago Up 3 minutes discord-live-translator-bot-1
7e414ea3e796 libretranslate/libretranslate:v1.3.11 "./venv/bin/libretra…" 18 hours ago Up 17 hours 5000/tcp discord-live-translator-libretranslate-1
3bc2770124ae redis:alpine "docker-entrypoint.s…" 18 hours ago Up 17 hours 6379/tcp discord-live-translator-redis-1
34fdc03d4511 ghcr.io/wotschofsky/discord-live-translator/stt "./stt" 18 hours ago Up 3 minutes 8000/tcp discord-live-translator-stt-1
Unfortunately, I forgot to ping redis before provoking the crash, but it replied after so I would think it would have before as well.
dlt@bot:~$ docker exec discord-live-translator-redis-1 redis-cli PING
PONG
ghcr.io/wotschofsky/discord-live-translator log:
[2023-08-30T00:44:37.225Z] Logged in as Super Secret Bot#1337!
[2023-08-30T17:22:11.020Z] Started recording 301695752688893952 to /app/cache/rec/301695752688893952_1693416131018.wav
[2023-08-30T17:22:12.078Z] Successfully finished recording 301695752688893952 to /app/cache/rec/301695752688893952_1693416131018.wav
[2023-08-30T17:22:12.079Z] Analyzing "/app/cache/rec/301695752688893952_1693416131018.wav"...
[2023-08-30T17:22:13.429Z] Started recording 301695752688893952 to /app/cache/rec/301695752688893952_1693416133428.wav
TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11576:11)
at recognizeRecording (/app/src/processors/recognizeRecording.ts:11:20)
at _SpeakingMap.<anonymous> (/app/src/commands/join.ts:99:30)
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
yarn run v1.22.19
$ node -r source-map-support/register dist/app.js
[2023-08-30T17:22:17.150Z] Logged in as Super Secret Bot#1337!
libretranslate/libretranslate:v1.3.11 log:
Running on http://0.0.0.0:5000
No other information outside of this line is produced when restarting the container, and no errors related to the crash present.
redis:alpine log:
1:M 30 Aug 2023 17:21:15.309 * 1 changes in 3600 seconds. Saving...
1:M 30 Aug 2023 17:21:15.310 * Background saving started by pid 15
15:C 30 Aug 2023 17:21:15.377 * DB saved on disk
15:C 30 Aug 2023 17:21:15.377 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
1:M 30 Aug 2023 17:21:15.410 * Background saving terminated with success
ghcr.io/wotschofsky/discord-live-translator/stt log:
whisper_init_from_file_no_state: loading model from 'models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3
whisper_model_load: mem required = 743.00 MB (+ 16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
2023-08-30T17:22:15.506Z INFO [stt] Starting server at http://127.0.0.1:8000
2023-08-30T17:22:15.506Z INFO [actix_server::builder] starting 1 workers
2023-08-30T17:22:15.506Z INFO [actix_server::server] Actix runtime found; starting in Actix runtime
Hope this is of use. :)
Can you please try resolving the internal ip for the stt container like follows:
docker exec -it discord-live-translator-bot-1 bash
apt install curl dnsutils iputils-ping
dig stt
, ping stt
and curl --head stt:8000
and post the resultsHey!
Pardon the late response. Been busy days.
Edit of shame: Brain started ticking over and here's the output as requested. :D
root@b3587503758b:/app# dig stt
; <<>> DiG 9.18.16-1~deb12u1-Debian <<>> stt
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30371
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;stt. IN A
;; ANSWER SECTION:
stt. 600 IN A 172.18.0.3
;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11) (UDP)
;; WHEN: Mon Sep 04 23:40:35 UTC 2023
;; MSG SIZE rcvd: 40
root@b3587503758b:/app# ping stt
PING stt (172.18.0.3) 56(84) bytes of data.
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=1 ttl=64 time=0.099 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=2 ttl=64 time=0.048 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=3 ttl=64 time=0.033 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=4 ttl=64 time=0.052 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=5 ttl=64 time=0.042 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=6 ttl=64 time=0.034 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=7 ttl=64 time=0.050 ms
64 bytes from discord-live-translator-stt-1.discord-live-translator_default (172.18.0.3): icmp_seq=8 ttl=64 time=0.036 ms
^C
--- stt ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7155ms
rtt min/avg/max/mdev = 0.033/0.049/0.099/0.020 ms
root@b3587503758b:/app# curl --head stt:8000
HTTP/1.1 405 Method Not Allowed
content-length: 0
allow: POST
date: Mon, 04 Sep 2023 23:42:20 GMT
Hi, thanks for your patience! From the command output, it doesn't seem like Docker is at fault here. I've heard from friends that Node.js 18 fetch is still a bit weird, so I have upgraded to Node.js 20.
Can you try updating the images and see if that fixes it?
docker compose -f docker-compose.prod.yml pull
docker compose -f docker-compose.prod.yml up -d
Hello again, and no problem. Take the time you need. :)
Unfortunately, no dice.
[2023-09-11T01:02:04.093Z] Logged in as Super Secret Bot#1337!
Successfully updated guild (/) commands.
Successfully initialized guild commands!
[2023-09-11T01:02:59.305Z] Started recording 301695752688893952 to /app/cache/rec/301695752688893952_1694394179304.wav
[2023-09-11T01:03:00.357Z] Successfully finished recording 301695752688893952 to /app/cache/rec/301695752688893952_1694394179304.wav
[2023-09-11T01:03:00.357Z] Analyzing "/app/cache/rec/301695752688893952_1694394179304.wav"...
[2023-09-11T01:03:00.575Z] Started recording 301695752688893952 to /app/cache/rec/301695752688893952_1694394180575.wav
[2023-09-11T01:03:00.893Z] Successfully finished recording 301695752688893952 to /app/cache/rec/301695752688893952_1694394180575.wav
[2023-09-11T01:03:00.894Z] Analyzing "/app/cache/rec/301695752688893952_1694394180575.wav"...
TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11600:11)
at recognizeRecording (/app/src/processors/recognizeRecording.ts:11:20)
at _SpeakingMap.<anonymous> (/app/src/commands/join.ts:99:30)
ELIFECYCLE Command failed with exit code 1.
> discord-live-translator@1.0.0 start /app
> node -r source-map-support/register dist/app.js
Started updating guild (/) commands.
Successfully updated guild (/) commands.
Successfully initialized guild commands!
[2023-09-11T01:03:04.669Z] Logged in as Super Secret Bot#1337!
1:C 11 Sep 2023 01:01:56.243 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 11 Sep 2023 01:01:56.243 * Redis version=7.2.1, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 11 Sep 2023 01:01:56.243 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 11 Sep 2023 01:01:56.244 * monotonic clock: POSIX clock_gettime
1:M 11 Sep 2023 01:01:56.244 * Running mode=standalone, port=6379.
1:M 11 Sep 2023 01:01:56.245 * Server initialized
1:M 11 Sep 2023 01:01:56.245 * Loading RDB produced by version 7.2.0
1:M 11 Sep 2023 01:01:56.245 * RDB age 15 seconds
1:M 11 Sep 2023 01:01:56.245 * RDB memory usage when created 0.85 Mb
1:M 11 Sep 2023 01:01:56.245 * Done loading RDB, keys loaded: 1, keys expired: 0.
1:M 11 Sep 2023 01:01:56.245 * DB loaded from disk: 0.000 seconds
1:M 11 Sep 2023 01:01:56.245 * Ready to accept connections tcp
2023-09-11T01:01:56.137Z INFO [stt] Starting server at http://127.0.0.1:8000
2023-09-11T01:01:56.217Z INFO [actix_server::builder] starting 1 workers
2023-09-11T01:01:56.217Z INFO [actix_server::server] Actix runtime found; starting in Actix runtime
whisper_init_from_file_no_state: loading model from 'models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 3
whisper_model_load: mem required = 743.00 MB (+ 16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
2023-09-11T01:03:02.779Z INFO [stt] Starting server at http://127.0.0.1:8000
2023-09-11T01:03:02.779Z INFO [actix_server::builder] starting 1 workers
2023-09-11T01:03:02.779Z INFO [actix_server::server] Actix runtime found; starting in Actix runtime
Updating language models
Found 60 models
Keep 8 models
Downloading English → French (1.0) ...
Downloading English → German (1.0) ...
Downloading English → Italian (1.0) ...
Downloading English → Spanish (1.0) ...
Downloading French → English (1.0) ...
Downloading German → English (1.0) ...
Downloading Italian → English (1.0) ...
Downloading Spanish → English (1.0) ...
Loaded support for 5 languages (8 models total)!
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Running on http://0.0.0.0:5000
Could this be some missing dependency thing? Again, my ignorance may shine through, but the Debian install is just base + ssh server, then I've installed as per procedure in README.md, or at least to the best of my recollection. #weeksof4hournights :D
Maybe it would be of use if you could provide me with a complete list of packages installed in your containers where you have it running successfully? At least then we are closer to being sure that we're not chasing ghosts of my making here.
Again, thanks for your ongoing support. Our Spanish team mate will absolutely love this thing if we can get it to start talking. :D
Hello again!
Sorry, I really don't want to nag, but would it be possible to get some more information about the OS and versions of packages you're able to get this running on?
I'm not married to running any configuration, I just really, really want to have this bot working as it would make our teams current issues regarding communication much less of a problem.
Thanks again!
Hi, sure:
Here are the two configurations that work for me:
In both cases running the latest versions of Docker. I hope this helps!
I've been considering moving this app to the Bun runtime. Maybe that would solve the issue or at least give some better errors.
Alrighty, put that plan on hold and I'll try to replicate your environment hopefully by the end of the weekend. Enjoy yours! :)
Ok, so we've lost the spanish guy, so our need for the bot seems to have evaporated.
If I find time (as I've surprisingly not been able to to date), I'll keep my promise to see if this is related to Debian 12, but yeah. May take a while.
This would be my suggestion if anyone else is struggling, and if you find that it does work - please relay. :)
Hello there!
I'm attempting to get your creation up and running on a Debian 12.1 install as per instructions in README.md. The install is up to date and the box has plenty of free storage and memory as the problem I'm about to delve into occur. vm.overcommit_memory = 1 has also been set and verified as part of my blind stab at resolving this issue.
I've successfully managed to get the bot onto our server, commands are being recognized and all seems well up to this point. The bot acknowledges our desired language, joins the voice channel when requested and plays its little fanfare. :)
However, a few errors are thrown during startup.
ghcr.io/wotschofsky/discord-live-translator log:
Otherwise the logs look alright besides of redis lusting for a config.
Redis log:
Libretranslate log:
ghcr.io/wotschofsky/discord-live-translator/stt log:
The problems begin as someone starts speaking. One can see in the logs that voice is being recorded, but the following error pops up in the ghcr.io/wotschofsky/discord-live-translator log:
Any thoughts on what might be going on and how to resolve it? I'm a complete newbie to docker, so I figured before I poke at things further; I'd consult the oracles. :D
Best regards, yours truely.