rhasspy / wyoming-addons

Docker builds for Home Assistant add-ons using Wyoming protocol
MIT License

[WIP] Add initial GPU support #4

Open edurenye opened 1 year ago

edurenye commented 1 year ago

This is a work in progress. I think it is working for whisper, but I'm not sure how to verify it. For piper it gives me the error "unrecognized arguments: --cuda", even though I followed the instructions from https://github.com/rhasspy/piper, which say at the end that it should work after just installing onnxruntime-gpu and running piper with the --cuda argument.

What am I missing?

I guess this will conflict with users who just want to use the CPU. How should we handle that? By making different images, e.g. piper and piper-gpu?
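
For example, a compose file could expose both variants side by side. This is only a rough sketch; keeping both services in one file is illustrative, and the piper-gpu service name is hypothetical:

services:
  piper:
    image: rhasspy/wyoming-piper:latest   # existing CPU image
    ports:
      - 10200:10200
  piper-gpu:
    build:
      context: ./piper
      dockerfile: GPU.Dockerfile          # hypothetical CUDA-enabled Dockerfile
    ports:
      - 10200:10200                       # enable one service or the other, not both
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]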

edurenye commented 1 year ago

Closes #3

DBaker85 commented 1 year ago

Just wanted to leave my 2 cents here: I tried your whisper changes locally and they are working perfectly on my 1080 Ti under Docker. VRAM is allocated and the container works as well. Home Assistant also recognised and used it perfectly. Nice one!

(Did not try Piper)

edurenye commented 1 year ago

Piper does not work because of this: https://github.com/rhasspy/rhasspy3/issues/49

wdunn001 commented 1 year ago

Whisper is still targeting Ubuntu 20.04; is there a reason for that?

wdunn001 commented 1 year ago

This may need to be its own image, since the majority of users would not want the CUDA version.

wdunn001 commented 1 year ago

Could this be split into two tickets, one for whisper and one for piper? The whisper portion is in reality the more useful of the two and benefits more from this feature, especially if piper is experiencing issues.

edurenye commented 1 year ago

@wdunn001 The documentation at https://github.com/guillaumekln/faster-whisper/ says it requires cuDNN 8 for CUDA 11, and for those versions of CUDA and cuDNN the highest version of Ubuntu available is 20.04. I had to look into it because it was not working with the image I set for the other containers, sadly. And updating to CUDA 12 is not planned in the very short term; see an explanation here: https://github.com/guillaumekln/faster-whisper/issues/47#issuecomment-1620086696.
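
Concretely, that pins the whisper GPU image to a base along these lines (the exact tag is an assumption on my side; check the published nvidia/cuda tags):

FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu20.04
# CUDA 11 + cuDNN 8 runtime images are only published up to Ubuntu 20.04,
# so this container cannot share the newer base used by the other containers.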

edurenye commented 1 year ago

Sorry, editing because I misunderstood your comment. Yes, it makes sense to make two different images; I can add that.

But I guess for better maintainability the solution we add for one should be the same as for the others, which is why I think it is better to have the conversation in a single issue and PR. If you need to use it right now, you can just add the changes to your local Dockerfile and build it. Or if you need to use CUDA 12, you could try the workarounds they mention here: https://github.com/guillaumekln/faster-whisper/issues/153#issuecomment-1510218906

edurenye commented 1 year ago

And I'll try to add porcupine1 too.

wdunn001 commented 1 year ago

Awesome! I am happy to help if you need anything. Would we want to add the Docker arguments for the CUDA image to the documentation here?

edurenye commented 1 year ago

I added the changes. I have not tested the new porcupine1 container, since that software does not support my language yet.

And yes, of course we should document this. Also, I was thinking: should we add a docker-compose.yml file? It made sense to me since I use Home Assistant and need the three services. But now that porcupine1 has been added I am not sure anymore, since as far as I know porcupine1 and openwakeword do the same thing, which is quite confusing to me.
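
For reference, a minimal sketch of such a docker-compose.yml for the three services (ports are the usual Wyoming defaults; the image tags, model, and voice are illustrative assumptions):

services:
  whisper:
    image: rhasspy/wyoming-whisper:latest
    command: --model tiny-int8 --language en
    ports:
      - 10300:10300
    volumes:
      - ./whisper-data:/data
  piper:
    image: rhasspy/wyoming-piper:latest
    command: --voice en_US-lessac-medium
    ports:
      - 10200:10200
    volumes:
      - ./piper-data:/data
  openwakeword:
    image: rhasspy/wyoming-openwakeword:latest
    ports:
      - 10400:10400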

edurenye commented 1 year ago

But right now the README.md file only documents pulling the images, not building them, so that will depend on the tags the maintainer might want to use. Should we add build instructions to the README.md file?

wdunn001 commented 1 year ago

I think so, for sure; we can create a contributors section. I'll work on it. I will be building it for the first time this weekend, so I'll try to document the process.

edurenye commented 1 year ago

I will give you the docker-compose files and a starting point.

edurenye commented 1 year ago

I just added it; tell me how it works for you. You can create your own docker-compose.x.yml file for your use case.

I have not added porcupine1 to the Docker Compose file because it uses the same port as openwakeword, so for that particular case it could be added in a custom extend file.
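
Such a custom extend file could look roughly like this (the image tag and the remapped host port are assumptions on my side):

# docker-compose.porcupine1.yml
services:
  porcupine1:
    image: rhasspy/wyoming-porcupine1:latest
    ports:
      - 10500:10400   # remap the host port so it can coexist with openwakeword

It would then be started together with the base file, e.g. docker compose -f docker-compose.yml -f docker-compose.porcupine1.yml up -d.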

wdunn001 commented 1 year ago

OK, so I am getting an error deploying this via compose or run:

usage: main.py [-h] --model {tiny,tiny-int8,base,base-int8,small,small-int8,medium,medium-int8} --uri URI --data-dir DATA_DIR [--download-dir DOWNLOAD_DIR] [--device DEVICE] [--language LANGUAGE] [--compute-type COMPUTE_TYPE] [--beam-size BEAM_SIZE] [--debug]
main.py: error: the following arguments are required: --model, --uri, --data-dir
/run.sh: line 3: --uri: command not found
/run.sh: line 4: --data-dir: command not found
/run.sh: line 5: --download-dir: command not found

It needs additional params compared with the other build.

These appear to be supplied by the run.sh file, and I see it is called in the Dockerfile.

I added commands to the GPU compose file identical to those in the non-GPU version and they work fine, and I made a PR. It is only the ones in run.sh that seem not to work.

I am on Ubuntu 22.04 with the latest Docker, if that matters.

edurenye commented 1 year ago

This is weird; according to the documentation, the only things not extended should be volumes_from and depends_on. We can follow this discussion in the PR that you created: https://github.com/edurenye/wyoming-addons-gpu/pull/1

AnkushMalaker commented 1 year ago

I needed to add --device cuda to actually load the whisper model onto my GPU. I second that we could split this into different branches to handle GPU support for whisper, piper, and wakeword. I made a branch for that; not sure if I should raise it as a PR.

New to contributing, happy to hear thoughts.

https://github.com/AnkushMalaker/wyoming-addons/tree/gpu

edurenye commented 1 year ago

I rebased with the latest changes from master and fixed the typos in the README file.

I don't think we need to create another branch; in the meantime you can just have an extend file where you use the GPU options for whisper and openwakeword and the non-GPU ones for piper.

And regarding /var/data, I am generally against storing user data in a system folder. Also, passing the whole folder to the Docker container might load a lot of data from other applications that is not needed.

wdunn001 commented 1 year ago

@edurenye Agreed, using the CPU for piper seems to be more than sufficient. I am still experiencing issues with openwakeword, but it may just be my environment. I'll pull down the changes here and try again. I'll push any fixes I find to the PR on your branch.

Maxcodesthings commented 1 year ago

I have tried applying the contents of this PR to my local instance. I do not see the faster-whisper implementation use the GPU over the CPU.

I have combined the Dockerfiles as follows and focused on using the GPU only for the whisper container:

  whisper:
    container_name: whisper
    build:
      context: /opt/wyoming-addons/whisper/
      dockerfile: GPU.Dockerfile
    # image: rhasspy/wyoming-whisper:latest
    restart: unless-stopped
    ports:
      - 10300:10300
    volumes:
      - /opt/homeassistant/whisper:/data
    command: 
      - --model
      - medium-int8
      - --language
      - en
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

I can tell my GPU is passed through because it appears in nvidia-smi inside the container (NVIDIA_Share_gj0fFevK7R).

However, when watching the GPU while it processes my speech, the usage does not increase, while the CPU usage clearly spikes, so it is the CPU processing my speech.

How have you all tested that this implementation of faster-whisper is working? I would like to do the same on my machine.

Edit:

Found the issue!

You are missing --device in your compose

command:
  - --model
  - small
  - --language
  - en
  - --device
  - cuda

edurenye commented 1 year ago

Good finding! It was not documented, but that parameter exists in https://github.com/rhasspy/wyoming-faster-whisper/blob/master/wyoming_faster_whisper/__main__.py

mreilaender commented 1 year ago

Can you resolve the conflicts? I would love to see the improvements from using the GPU directly :)

mreilaender commented 12 months ago

It doesn't work with piper, since wyoming-piper doesn't declare the --cuda argument. I created a PR.

hrfried commented 11 months ago

Can confirm it works after pulling the commit from @edurenye and replacing the two files from your (@mreilaender) commit in the Docker image, once it was already built and failing to run. Awesome job, y'all! Hope these get merged quickly before I forget about the Frankentainer setup I had to do to get it running. :)

hrfried commented 11 months ago

Can also confirm it's pretty much instantaneous on my endpoint: running from my Android Home Assistant app into my Home Assistant setup (RPi 4 with 8 GB RAM running DietPi), using my Gentoo desktop as the endpoint (8-core i7-9700 and the RTX 4060 Ti 16 GB version). Now I just need to set up listener endpoints to make the wake words work, and I'm one step closer to being fully local. It would also be really cool to integrate one of the open LLM models as a separate endpoint for some "real" voice assistant capabilities, haha. It doesn't seem too difficult in theory.

edurenye commented 11 months ago

Thanks for testing :heart: I'm going to be away for 2 weeks; I hope that in the meantime this issue gets fixed: https://github.com/rhasspy/wyoming-piper/pull/5. Then I'll resolve the conflicts and test again whether everything works.

bkbilly commented 10 months ago

I am using Ubuntu 22.04 with an NVIDIA GeForce GTX 660 and CUDA version 11.4.

I've been using the NVIDIA GPU with other Docker services like Frigate, and it works. I changed the docker-compose file for whisper like so:

version: "3"
services:
  whisper_en:
    container_name: whisper_en
    command: [ "--model", "base-int8", "--language", "en", "--device", "cuda" ]
    restart: unless-stopped
    ports:
      - 10300:10300
    environment:
      - TZ=Europe/Athens
    volumes:
      - ./wyoming/whisper_en:/data
    build:
      context: ./wyoming-addons-gpu/whisper/
      dockerfile: GPU.Dockerfile
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

I get these errors when I start talking:

[2023-12-20 13:04:54.605] [ctranslate2] [thread 7] [warning] The compute type inferred from the saved model is int8_float32, but the target device or backend do not support efficient int8_float32 computation. The model weights have been automatically converted to use the float32 compute type instead.
INFO:__main__:Ready
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-5' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.8/dist-packages/wyoming/server.py:28> exception=RuntimeError('cuDNN failed with status CUDNN_STATUS_NOT_INITIALIZED')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/wyoming/server.py", line 35, in run
    if not (await self.handle_event(event)):
  File "/usr/local/lib/python3.8/dist-packages/wyoming_faster_whisper/handler.py", line 75, in handle_event
    text = " ".join(segment.text for segment in segments)
  File "/usr/local/lib/python3.8/dist-packages/wyoming_faster_whisper/handler.py", line 75, in <genexpr>
    text = " ".join(segment.text for segment in segments)
  File "/usr/local/lib/python3.8/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 162, in generate_segments
    for start, end, tokens in tokenized_segments:
  File "/usr/local/lib/python3.8/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 186, in generate_tokenized_segments
    result, temperature = self.generate_with_fallback(segment, prompt, options)
  File "/usr/local/lib/python3.8/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 279, in generate_with_fallback
    result = self.model.generate(
RuntimeError: cuDNN failed with status CUDNN_STATUS_NOT_INITIALIZED

I've tried installing the cuDNN package on my host, but the error still persists. I would really like to be able to use my GPU for speech-to-text, because it would run so much better and faster.

baudneo commented 10 months ago

I put in a PR to @edurenye's gpu branch that adds the piper --cuda arg fix from @mreilaender's PR, to allow piper to use CUDA acceleration. This should only be needed until wyoming-piper adds the --cuda arg. All I did was put the two PRs together; all credit goes to @edurenye and @mreilaender.

https://github.com/baudneo/wyoming-addons-gpu/tree/gpu

PR: https://github.com/edurenye/wyoming-addons-gpu/pull/2

edurenye commented 10 months ago

Thank you @baudneo ! I merged your PR.

We should remove these changes once the piper args get added to the library.

baudneo commented 10 months ago

So, after testing this weekend, it seems to me piper is not using the GPU even with the custom Python code.

The --use_cuda flag isn't in any of the piper releases (the current Docker build uses v1.2.0; the latest release is 2023.11.14-2). It was added in the master branch (https://github.com/rhasspy/piper/commit/6c5e283439f8400aa7a2652aafbedfb77ca3fc22). The only way to get the new CUDA-accelerated piper is to build it yourself and then install the piper Python libs modified by @mreilaender's PR to expose the --cuda Python arg.

I have just whipped up a custom branch that builds piper using a multi-stage image build (build in cudnn-devel, copy to cudnn-runtime), roughly as sketched below. I have built piper locally, so I know it works; I just need to roll out the Docker builds, which I am about to test.
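
The shape of that multi-stage build, for illustration (the tags, package list, and paths here are assumptions; see the branch below for the actual file):

FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 AS build
# Build piper from the master branch, which has the CUDA support.
RUN apt-get update && apt-get install -y --no-install-recommends \
        git build-essential cmake ca-certificates
RUN git clone https://github.com/rhasspy/piper /piper \
    && cmake -B /piper/build -S /piper \
    && cmake --build /piper/build --config Release

FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
# Copy only the built artifacts into the slimmer runtime image.
COPY --from=build /piper/build /opt/piper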

I also added a closed PR in wyoming-faster-whisper that will allow using any compatible (read: CTranslate2-compatible) models instead of just the models available in the rhasspy/models repo. This will give us access to large-v2 and any other compatible ASR models on HF (Distil-Whisper, etc.).

wyoming-faster-whisper with the HF ASR model ability: https://github.com/baudneo/wyoming-faster-whisper/tree/hf_asr_models

Build CUDA-accelerated piper: https://github.com/baudneo/wyoming-addons-gpu/tree/build_piper

I haven't tested either build and deploy yet, so maybe wait until I test them. If things work out, piper should now be CUDA-accelerated and faster-whisper should allow downloading HF models for use.

Edit: I haven't drilled down into openwakeword, but from my cursory look it doesn't seem openwakeword is CUDA-accelerated. The Docker container has the GPU exposed and available, but I don't think openwakeword is using the GPU. Same thing for piper in the current Docker builds: the GPU is available, but piper isn't using it, because it is piper v1.2.0 and not a master-branch build of piper.

Edit 2: OK, I am wrong about openwakeword -> https://github.com/search?q=repo%3Adscripka%2FopenWakeWord%20cuda&type=code - it should be using the GPU as long as the GPU is available in the container. The only issue I can see is that the lib assumes you only want to use device 0, so it will only ever use GPU index 0.

baudneo commented 10 months ago

The GPU-accelerated piper works. Same build process as before: simply clone my wyoming-addons-gpu fork, check out the build_piper branch, and run docker compose -f docker-compose.gpu.yml up, and it should build it for you.

Build it

git clone https://github.com/baudneo/wyoming-addons-gpu.git -b build_piper
cd wyoming-addons-gpu
docker compose -f docker-compose.gpu.yml up -d
# Check logs
docker compose -f docker-compose.gpu.yml logs -f

GPU mem before

Tue Dec 26 18:23:33 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1660 Ti     Off | 00000000:3B:00.0 Off |                  N/A |
|  0%   41C    P2              24W / 130W |   1859MiB /  6144MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     81077      C   /usr/bin/zmc                                 70MiB |
|    0   N/A  N/A    193753      C   /opt/zomi/server/venv/bin/python3           474MiB |
|    0   N/A  N/A   2115758      C   python3                                     908MiB |
|    0   N/A  N/A   3022347      C   /usr/bin/zmc                                246MiB |
|    0   N/A  N/A   3022370      C   /usr/bin/zmc                                158MiB |
+---------------------------------------------------------------------------------------+

GPU mem after

Sun Jan  7 17:29:23 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1660 Ti     Off | 00000000:3B:00.0 Off |                  N/A |
| 23%   42C    P2              24W / 130W |   2291MiB /  6144MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    193753      C   /opt/zomi/server/venv/bin/python3           474MiB |
|    0   N/A  N/A    720284      C   /usr/bin/zmc                                246MiB |
|    0   N/A  N/A    720304      C   /usr/bin/zmc                                158MiB |
|    0   N/A  N/A    720344      C   /usr/bin/zmc                                 70MiB |
|    0   N/A  N/A   1507862      C   python3                                    1340MiB |
+---------------------------------------------------------------------------------------+

So, 908 MiB before with just whisper 'medium-int8', and 1340 MiB after with both piper and whisper 'medium-int8' loaded.

BlackBeltPanda commented 10 months ago

The GPU-accelerated piper works. Same build process as before: simply clone my wyoming-addons-gpu fork, check out the build_piper branch, and run docker compose -f docker-compose.gpu.yml up, and it should build it for you.

Trying your fork, I ended up with this error:

ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-6' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.10/dist-packages/wyoming/server.py:28> exception=FileNotFoundError(2, 'No such file or directory')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/wyoming/server.py", line 35, in run
    if not (await self.handle_event(event)):
  File "/usr/local/lib/python3.10/dist-packages/wyoming_piper/handler.py", line 98, in handle_event
    wav_file: wave.Wave_read = wave.open(output_path, "rb")
  File "/usr/lib/python3.10/wave.py", line 509, in open
    return Wave_read(f)
  File "/usr/lib/python3.10/wave.py", line 159, in __init__
    f = builtins.open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: ''

ser commented 9 months ago

If someone is interested in running whisper on a GPU without Docker, you need the CUDA 11.8 libraries at the moment:

  1. follow the installation process at https://github.com/SYSTRAN/faster-whisper
  2. install wyoming_faster_whisper from PyPI
  3. run this script:
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
exec python3 -m wyoming_faster_whisper \
        --uri 'tcp://0.0.0.0:10300' \
        --device cuda \
        --model medium-int8 \
        --beam-size 10 \
        --language 'en' \
        --data-dir /home/whiz/data \
        --download-dir /home/whiz/data \
        --debug

On an RTX 4070, recognition takes microseconds <3

dbolger commented 9 months ago

It'd be neat to get some OpenCL (AMD) support for this.

ser commented 9 months ago

It'd be neat to get some OpenCL (AMD) support for this.

It's impossible, unfortunately; OpenCL does not work with whisper.

dbolger commented 9 months ago

It'd be neat to get some OpenCL (AMD) support for this.

It's impossible, unfortunately; OpenCL does not work with whisper.

It looks like there's a derivative of whisper called whisper.cpp here, which mentions partial OpenCL support.

ser commented 9 months ago

Have you tried it on an AMD card, or are you just ChatGPT-ing here?

baudneo commented 9 months ago

have you tried it on AMD card or you just chatgpt here?

Usually AMD cards are supported for PyTorch and onnxruntime-gpu using ROCm 5.4. I don't see any code that uses the ROCm provider for onnxruntime; I can create a branch that adds that support for you to test later, if you want?
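
For context, pointing onnxruntime at ROCm is mostly a matter of the providers list. A minimal sketch, assuming a ROCm-enabled onnxruntime build is installed and using a placeholder model path:

import onnxruntime as ort

# Prefer the ROCm provider when available, falling back to CPU otherwise.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["ROCMExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # shows which providers actually loaded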

dbolger commented 9 months ago

Have you tried it on an AMD card, or are you just ChatGPT-ing here?

I haven't tried it, but given the number of comments in this issue mentioning CUDA, I suspected this would only be supported on NVIDIA hardware.

If you're referring to trying whisper.cpp, I have not either.

dbolger commented 9 months ago

Usually AMD cards are supported for PyTorch and onnxruntime-gpu using ROCm 5.4. I don't see any code that uses the ROCm provider for onnxruntime; I can create a branch that adds that support for you to test later, if you want?

I'd be interested in trying it out!

ser commented 9 months ago

If you haven't tried it, why do you propose that we use it?

spitfire commented 9 months ago

The GPU-accelerated piper works. Same build process as before: simply clone my wyoming-addons-gpu fork, check out the build_piper branch, and run docker compose -f docker-compose.gpu.yml up, and it should build it for you.

Trying your fork, I ended up with this error:

ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-6' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.10/dist-packages/wyoming/server.py:28> exception=FileNotFoundError(2, 'No such file or directory')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/wyoming/server.py", line 35, in run
    if not (await self.handle_event(event)):
  File "/usr/local/lib/python3.10/dist-packages/wyoming_piper/handler.py", line 98, in handle_event
    wav_file: wave.Wave_read = wave.open(output_path, "rb")
  File "/usr/lib/python3.10/wave.py", line 509, in open
    return Wave_read(f)
  File "/usr/lib/python3.10/wave.py", line 159, in __init__
    f = builtins.open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: ''

same here for piper

Additionally, whisper, which used to work fine, now spews stuff like the following instead of recognizing what I said ("zamknij wszystkie markizy", Polish for "close all the awnings"):

wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Dzięki za oglądanie.
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Nie zapomnijcie zasubskrybować oraz zafollowować mnie na Facebooku!
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Ziemia  Sl fick
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Dzięki za oglądanie!
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Nie zapomnijcie zasubskrybować oraz zafollowować mnie na Facebooku!
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Dzięki za oglądanie i zapraszam na mój kanał.
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Kupy, kupy, kupy, kupy, kupy, kupy, kupy.
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Napisy stworzone przez społeczność Amara.org
wyoming-whisper-gpu       | INFO:wyoming_faster_whisper.handler: Cześć!  Cześć!  Cześć!

Which is "don't forget to subscribe and follow me on FB" and "thanks for watching" (among others, which I won't even comment on) when using the medium-int8 model for Polish.

WTF, did they mess up the model, or is it some paywalled version?

juan11perez commented 9 months ago

Good day. The compose below, utilising this Docker image, works for me. It's using the GPU.

  whisper: 
    container_name: whisper
    image: ghcr.io/slackr31337/wyoming-whisper-gpu:latest
    restart: unless-stopped
    hostname: UNRAID
    network_mode: appdata_mynet
    ports:
    - 10300:10300
    volumes:
    - /mnt/cache/appdata/homeassistant/wyoming/whisper:/data
    env_file: secrets/.env
    command: --model medium-int8 --language en --device cuda
    depends_on:
    - homeassistant    
    runtime: nvidia    
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: ["gpu"]
            device_ids: ["GPU-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"]

edurenye commented 8 months ago

If you get a "Circular reference" error, you need to upgrade Docker Compose to version 2.24.6, since this has been addressed in https://github.com/docker/compose/pull/11470
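
You can check which version you are on with:

docker compose version

Versions from 2.24.6 on include the fix; versions below 2.24 predate the regression entirely.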

spitfire commented 8 months ago

If you get a "Circular reference" error, you need to upgrade Docker Compose to version 2.24.6, since this has been addressed in docker/compose#11470

Does that mean the error I'm seeing above? If so, I'll have to wait a moment for it to hit the repos; I just ran apt update && apt upgrade and it's not in Docker's Debian repos yet.

vekexasia commented 8 months ago

Hello guys, I just discovered this PR. I have an Intel-based server (13500T).

I guess it would be impossible for me to run this, as it is not an NVIDIA GPU, right? I'd love to add one to my server, but I don't think I can fit any GPU in my 1U mini-ITX server. Suggestions are welcome.

baudneo commented 8 months ago

Hello guys, I just discovered this PR. I have an Intel-based server (13500T).

I guess it would be impossible for me to run this, as it is not an NVIDIA GPU, right? I'd love to add one to my server, but I don't think I can fit any GPU in my 1U mini-ITX server. Suggestions are welcome.

Correct, it is CUDA-accelerated only at the moment, meaning NVIDIA GPUs.

If you have a PCIe slot and power for a GPU, you could use a PCIe extender and house the GPU outside the case. I do this with an R540 and a 1660 Ti.

edurenye commented 8 months ago

If you get a "Circular reference" error, you need to upgrade Docker Compose to version 2.24.6, since this has been addressed in docker/compose#11470

Does that mean the error I'm seeing above? If so, I'll have to wait a moment for it to hit the repos; I just ran apt update && apt upgrade and it's not in Docker's Debian repos yet.

No, not the same error. The error that you mention seems to happen only in that fork. I'm using the branch of this PR, and it was working for me until I updated Docker; but as I said, the Docker error should be fixed by the next Docker Compose release, 2.24.6. The other option right now is to downgrade to some version below 2.24, like 2.23.3.

spitfire commented 8 months ago

No, not the same error. The error that you mention seems to happen only in that fork. I'm using the branch of this PR, and it was working for me until I updated Docker; but as I said, the Docker error should be fixed by the next Docker Compose release, 2.24.6. The other option right now is to downgrade to some version below 2.24, like 2.23.3.

Thanks for the clarification.

(see my earlier comment above quoting the same piper FileNotFoundError and the whisper hallucinated transcriptions)

@baudneo any idea how to fix the issues I'm seeing?