collabora / WhisperFusion

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.

Version build for rtx3090 - not working #31

Closed rvsh2 closed 5 months ago

rvsh2 commented 9 months ago

I wanted to test WhisperFusion on an RTX 3090.

I went through build.sh, then ran docker run --gpus all --shm-size 64G -p 6006:6006 -p 8888:8888 -it ghcr.io/collabora/whisperfusion:latest and started the HTTP server as in the description.

After that I got this:

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/usr/local/lib/python3.10/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO:root:[LLM INFO:] Loaded LLM TensorRT Engine.
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
[2024-02-05 11:51:46,152] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
INFO:websockets.server:connection open
downloading ONNX model...
loading session
loading onnx model
reset states
INFO:root:[Whisper INFO:] New client connected

[2024-02-05 11:52:27,316] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-05 11:52:27,324] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-05 11:52:30,266] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-05 11:52:30,278] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
 |████████████████████████████████████████████████████████████████████████████████████████████████████| 100.00% [152/152 00:00<00:00]
rvsh@bob:/opt/WhisperFusion/examples/chatbot/html$ python3 -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
127.0.0.1 - - [05/Feb/2024 12:55:49] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:56:10] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:56:23] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:56:23] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:56:23] "GET / HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /css/style.css HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /css/all.min.css HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /img/Collabora_Logo.svg HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /img/microphone-white.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /img/stop.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /img/record.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /js/main.js HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:27] code 404, message File not found
127.0.0.1 - - [05/Feb/2024 12:56:27] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [05/Feb/2024 12:56:33] "GET /img/microphone-hover.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:56:33] "GET /img/microphone.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /css/all.min.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /css/style.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /js/main.js HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /img/microphone-white.png HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /img/Collabora_Logo.svg HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /img/stop.png HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:05] "GET /img/record.png HTTP/1.1" 304 -
127.0.0.1 - - [05/Feb/2024 12:57:07] "GET /img/microphone-hover.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Feb/2024 12:57:07] "GET /img/microphone.png HTTP/1.1" 200 -

The page is served at 127.0.0.1:8000, but clicking the microphone just starts the timer and nothing else happens.

Where is the problem?

zoq commented 9 months ago

The server side has a warmup stage, so it takes about 30 seconds until the server is available. From the output I can see that the client connected before the server was fully running:

INFO:root:[Whisper INFO:] New client connected

before:

 |████████████████████████████████████████████████████████████████████████████████████████████████████| 100.00% [152/152 00:00<00:00]

We are going to improve the console output so this is clearer. In the meantime, can you wait until you see the progress bar and then connect a new client?
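
A minimal sketch of that wait, assuming the progress-bar line reporting 100.00% is the last thing printed during warmup, as in the logs above. The function name warmup_finished and the marker string are illustrative choices, not part of WhisperFusion, and the sample progress bar is shortened.

```python
from typing import Iterable

# Assumption: warmup is done once a progress-bar line reporting 100.00% appears.
WARMUP_MARKER = "100.00%"


def warmup_finished(log_lines: Iterable[str], marker: str = WARMUP_MARKER) -> bool:
    """Return True as soon as any log line contains the warmup marker."""
    return any(marker in line for line in log_lines)


# Sample lines modelled on the container output in this thread.
sample = [
    "INFO:root:[LLM INFO:] Loaded LLM TensorRT Engine.",
    " |████| 100.00% [152/152 00:00<00:00]",
]
print(warmup_finished(sample))
```

The same function can be fed the container's log stream line by line, so the browser is only opened once it returns True.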

rchyf0516 commented 9 months ago

@zoq I encountered a similar situation. I ran docker run --gpus all --shm-size 64G -p 6006:6006 -p 8888:8888 -it ghcr.io/collabora/whisperfusion:latest and waited until the logs looked like this: screenshot-20240206-164800

Then I executed python -m http.server. The page is served at 127.0.0.1:8000, but clicking the microphone just starts the timer and nothing else happens. One difference is that there were no updates in the docker run logs.

What could be the issue? Thanks!
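
When the container logs show no activity at all, one thing worth ruling out is whether the published ports are reachable from the host. The sketch below is only an assumption-laden check: the ports 6006 and 8888 come from the docker run command above, and the helper names port_open and check_ports are hypothetical. An open port only shows that something accepts TCP connections there, not that warmup has finished.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "not open".
        return False


def check_ports(host: str = "127.0.0.1", ports=(6006, 8888)) -> dict:
    """Map each port to whether it currently accepts connections."""
    return {port: port_open(host, port) for port in ports}
```

Calling check_ports() on the host returns a dictionary such as {6006: ..., 8888: ...} with a boolean per port.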

rvsh2 commented 9 months ago
rvsh@bob:~$ docker run --gpus all --shm-size 64G -p 6006:6006 -p 8888:8888 -it ghcr.io/collabora/whisperfusion-3090:latest

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO:root:[LLM] loaded: True
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
[2024-02-06 08:39:43,304] [0/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-06 08:40:32,493] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-06 08:40:32,504] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-06 08:40:37,120] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[2024-02-06 08:40:37,144] [1/0] torch._dynamo.variables.torch: [WARNING] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
INFO:websockets.server:connection open████████████████████████████████████████████████████████████████| 100.00% [152/152 00:00<00:00]
INFO:websockets.server:connection open
downloading ONNX model...
loading session
loading onnx model
reset states
INFO:root:New client connected

ERROR:root:received 1001 (going away); then sent 1001 (going away)
INFO:root:Cleaning up.
INFO:root:Connection Closed.
INFO:root:{}
INFO:root:Exiting speech to text thread
rvsh@bob:/opt/WhisperFusion/examples/chatbot/html$ python3 -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /css/all.min.css HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /css/style.css HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /img/microphone-white.png HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /img/Collabora_Logo.svg HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /img/stop.png HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /img/record.png HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:18] "GET /js/main.js HTTP/1.1" 304 -
127.0.0.1 - - [06/Feb/2024 09:41:20] "GET /img/microphone-hover.png HTTP/1.1" 200 -
127.0.0.1 - - [06/Feb/2024 09:41:20] "GET /js/audio-processor.js HTTP/1.1" 200 -
127.0.0.1 - - [06/Feb/2024 09:41:25] "GET /img/microphone.png HTTP/1.1" 200 -
127.0.0.1 - - [06/Feb/2024 09:44:09] "GET / HTTP/1.1" 200 -
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 54538)
Traceback (most recent call last):
  File "/usr/lib/python3.10/socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python3.10/http/server.py", line 1304, in finish_request
    self.RequestHandlerClass(request, client_address, self,
  File "/usr/lib/python3.10/http/server.py", line 668, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/lib/python3.10/http/server.py", line 433, in handle
    self.handle_one_request()
  File "/usr/lib/python3.10/http/server.py", line 421, in handle_one_request
    method()
  File "/usr/lib/python3.10/http/server.py", line 675, in do_GET
    self.copyfile(f, self.wfile)
  File "/usr/lib/python3.10/http/server.py", line 875, in copyfile
    shutil.copyfileobj(source, outputfile)
  File "/usr/lib/python3.10/shutil.py", line 198, in copyfileobj
    fdst_write(buf)
  File "/usr/lib/python3.10/socketserver.py", line 826, in write
    self._sock.sendall(b)
BrokenPipeError: [Errno 32] Broken pipe
----------------------------------------
127.0.0.1 - - [06/Feb/2024 09:44:09] "GET /css/style.css HTTP/1.1" 200 -
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 54554)
Traceback (most recent call last):
  File "/usr/lib/python3.10/socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python3.10/http/server.py", line 1304, in finish_request
    self.RequestHandlerClass(request, client_address, self,
  File "/usr/lib/python3.10/http/server.py", line 668, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/lib/python3.10/http/server.py", line 433, in handle
    self.handle_one_request()
  File "/usr/lib/python3.10/http/server.py", line 421, in handle_one_request
    method()
  File "/usr/lib/python3.10/http/server.py", line 675, in do_GET
    self.copyfile(f, self.wfile)
  File "/usr/lib/python3.10/http/server.py", line 875, in copyfile
    shutil.copyfileobj(source, outputfile)
  File "/usr/lib/python3.10/shutil.py", line 198, in copyfileobj
    fdst_write(buf)
  File "/usr/lib/python3.10/socketserver.py", line 826, in write
    self._sock.sendall(b)
BrokenPipeError: [Errno 32] Broken pipe
----------------------------------------
127.0.0.1 - - [06/Feb/2024 09:44:10] "GET /img/microphone.png HTTP/1.1" 200 -
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 54560)
Traceback (most recent call last):
  File "/usr/lib/python3.10/socketserver.py", line 683, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python3.10/http/server.py", line 1304, in finish_request
    self.RequestHandlerClass(request, client_address, self,
  File "/usr/lib/python3.10/http/server.py", line 668, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/lib/python3.10/http/server.py", line 433, in handle
    self.handle_one_request()
  File "/usr/lib/python3.10/http/server.py", line 421, in handle_one_request
    method()
  File "/usr/lib/python3.10/http/server.py", line 675, in do_GET
    self.copyfile(f, self.wfile)
  File "/usr/lib/python3.10/http/server.py", line 875, in copyfile
    shutil.copyfileobj(source, outputfile)
  File "/usr/lib/python3.10/shutil.py", line 198, in copyfileobj
    fdst_write(buf)
  File "/usr/lib/python3.10/socketserver.py", line 826, in write
    self._sock.sendall(b)
BrokenPipeError: [Errno 32] Broken pipe
----------------------------------------
127.0.0.1 - - [06/Feb/2024 09:44:45] "GET / HTTP/1.1" 304 -

I also tried the prebuilt version. I waited for the progress bar but got the same result. It looks like the client is connected, but nothing works.

zoq commented 9 months ago

We are currently putting everything into one docker container, and will release it tomorrow.

zoq commented 9 months ago

We updated the Dockerfile, which now includes the web server:

docker run --gpus all --shm-size 64G -p 8080:80 -it ghcr.io/collabora/whisperfusion:latest

After that, point the browser to localhost:8080, or whatever the IP of the Docker container is.
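
A small sketch for checking that the bundled web server answers before opening the browser. The default URL follows the -p 8080:80 mapping above; the function name page_status is hypothetical, and a 200 response only shows that the page is being served, not that the speech pipeline behind it is ready.

```python
import urllib.error
import urllib.request


def page_status(url: str = "http://localhost:8080/", timeout: float = 5.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        # Nothing listening, connection refused, or timed out.
        return None
```

page_status() returning None would suggest the port mapping or the container itself is the problem rather than the page.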

rvsh2 commented 9 months ago

Thanks for the update. I tried this but got some errors:

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun nginx (no readiness notification)
services-up: info: copying legacy longrun whisperfusion (no readiness notification)
s6-rc: info: service legacy-services successfully started
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_common.py", line 58, in _init
    torch.classes.load_library(ft_decoder_lib)
  File "/usr/local/lib/python3.10/dist-packages/torch/_classes.py", line 51, in load_library
    torch.ops.load_library(path)
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 933, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/WhisperFusion/main.py", line 11, in <module>
    from whisper_live.trt_server import TranscriptionServer
  File "/root/WhisperFusion/whisper_live/trt_server.py", line 17, in <module>
    from whisper_live.trt_transcriber import WhisperTRTLLM
  File "/root/WhisperFusion/whisper_live/trt_transcriber.py", line 16, in <module>
    import tensorrt_llm
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/__init__.py", line 64, in <module>
    _init(log_level="error")
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_common.py", line 61, in _init
    raise ImportError(str(e) + msg)
ImportError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev
FATAL: Decoding operators failed to load. This may be caused by the incompatibility between PyTorch and TensorRT-LLM. Please rebuild and install TensorRT-LLM.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/_common.py", line 58, in _init
    torch.classes.load_library(ft_decoder_lib)
  File "/usr/local/lib/python3.10/dist-packages/torch/_classes.py", line 51, in load_library
    torch.ops.load_library(path)
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 933, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev

Please advise.

surajwiseyak commented 7 months ago

@rvsh2 @zoq I am also facing the same issue. There is no activity in the web UI or the logs after starting the microphone.

With the single Docker image, I also get the undefined symbol error, likely due to an incompatibility between PyTorch and TensorRT-LLM.

Any updates?
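
For reference, the missing symbol _ZN3c1017RegisterOperatorsD1Ev demangles to c10::RegisterOperators::~RegisterOperators(), which lives in PyTorch's c10 library, so it is consistent with libth_common.so having been built against a different PyTorch than the one installed. A minimal sketch for listing the installed versions inside the container without importing either package (importing tensorrt_llm is what fails). The distribution names "torch" and "tensorrt_llm" are assumptions about how the packages were installed, and installed_version is a hypothetical helper.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names; adjust if the packages were installed differently.
for name in ("torch", "tensorrt_llm"):
    print(f"{name}: {installed_version(name)}")
```

Comparing the reported versions with the PyTorch version that the TensorRT-LLM build expects would be the next step.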

makaveli10 commented 7 months ago

@surajwiseyak Which GPU are you using? Do you use the docker-compose setup?

makaveli10 commented 5 months ago

With the latest changes, WhisperFusion works on the 3090 as expected on both Linux and WSL2.