collabora / WhisperFusion

WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.

AttributeError: '_Runtime' object has no attribute 'address' on Ubuntu with T4 GPU #32

Open LeoHsu0802 opened 5 months ago

LeoHsu0802 commented 5 months ago

I tried to run this project on an AWS EC2 g4dn.xlarge instance with a T4 GPU and got the AttributeError shown below:

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

done loading
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
[02/06/2024-02:23:28] [TRT] [E] 6: The engine plan file is generated on an incompatible device, expecting compute 7.5 got compute 8.9, please rebuild.
[02/06/2024-02:23:29] [TRT] [E] 2: [engine.cpp::deserializeEngine::1148] Error Code 2: Internal Error (Assertion engine->deserialize(start, size, allocator, runtime) failed. )
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/root/WhisperFusion/llm_service.py", line 195, in run
    self.initialize_model(
  File "/root/WhisperFusion/llm_service.py", line 109, in initialize_model
    self.runner = self.runner_cls.from_dir(**self.runner_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/model_runner.py", line 417, in from_dir
    session = session_cls(model_config,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 475, in __init__
    self.runtime = _Runtime(engine_buffer, mapping)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 153, in __init__
    self.__prepare(mapping, engine_buffer)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 174, in __prepare
    assert self.engine is not None
AssertionError
Exception ignored in: <function _Runtime.__del__ at 0x7fa97eebb5b0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/runtime/generation.py", line 279, in __del__
    cudart.cudaFree(self.address)  # FIXME: cudaFree is None??
AttributeError: '_Runtime' object has no attribute 'address'

zoq commented 5 months ago

[02/06/2024-02:23:28] [TRT] [E] 6: The engine plan file is generated on an incompatible device, expecting compute 7.5 got compute 8.9, please rebuild.

The provided image wasn't built for compute 7.5, so you would have to rebuild it. See https://github.com/collabora/WhisperFusion?tab=readme-ov-file#build-docker-image for further details.

Let us know if you run into any issues; we can also build the image for 7.5 and push it to the registry.
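
For anyone puzzled by the AttributeError in the issue title: it looks like a secondary symptom rather than the root cause. In the traceback, `__prepare` fails its assertion because the engine could not be deserialized, and `__del__` then runs on an object whose `address` attribute was never set. The sketch below reproduces that pattern in plain Python. It is a simplified illustration, not the tensorrt_llm code; `FragileRuntime`, `GuardedRuntime`, `fake_free` and `try_build` are hypothetical names.

```python
released = []  # addresses "freed" by the stand-in below


def fake_free(address):
    """Hypothetical stand-in for a real deallocation call."""
    released.append(address)


class FragileRuntime:
    def __init__(self, engine):
        self.engine = engine
        # Mirrors the failing assertion: nothing after this line runs
        # when the engine is None, so self.address is never assigned.
        assert self.engine is not None
        self.address = 0xBEEF

    def __del__(self):
        # Raises AttributeError if __init__ stopped before setting address;
        # Python reports it as "Exception ignored in ... __del__".
        fake_free(self.address)


class GuardedRuntime(FragileRuntime):
    def __del__(self):
        # Defensive variant: only free what was actually allocated.
        address = getattr(self, "address", None)
        if address is not None:
            fake_free(address)


def try_build(cls, engine):
    """Return the outcome of constructing cls(engine) as a short string."""
    try:
        cls(engine)
    except AssertionError:
        return "AssertionError"
    return "ok"


print(try_build(GuardedRuntime, None))
print(try_build(GuardedRuntime, object()))
```

Swapping in `FragileRuntime` for the first call should print an "Exception ignored in" message with an AttributeError, much like the log above. Either way, the error to act on is the compute capability mismatch reported by TensorRT.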

JanRiedel commented 4 months ago

I have the same issue (on an RTX 6000), and it is not clear to me how to build a new image with this instruction: "bash build.sh 86-real". Could you please explain?

JanRiedel commented 4 months ago

Sorry, I have now found the build.sh file in WhisperFusion/docker/.
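
A rough helper for working out which architecture string to pass, assuming build.sh accepts arguments of the same shape as the `86-real` quoted above (compute capability digits followed by `-real`). That shape is an assumption based only on this thread, so check docker/build.sh and the README before relying on it. In the log above, the "expecting" value (7.5) matches the T4 in the machine, so the sketch treats it as the local device. `parse_mismatch` and `arch_argument` are hypothetical names.

```python
import re

# Matches the TensorRT message seen in the log above.
MISMATCH = re.compile(
    r"expecting compute (\d+)\.(\d+) got compute (\d+)\.(\d+)"
)


def parse_mismatch(log_line):
    """Return (device_capability, engine_capability) or None if no match."""
    match = MISMATCH.search(log_line)
    if match is None:
        return None
    dev_major, dev_minor, eng_major, eng_minor = match.groups()
    return f"{dev_major}.{dev_minor}", f"{eng_major}.{eng_minor}"


def arch_argument(compute_capability):
    """Turn '7.5' into '75-real' (assumed argument format, see note above)."""
    major, minor = compute_capability.split(".")
    return f"{major}{minor}-real"


line = (
    "[02/06/2024-02:23:28] [TRT] [E] 6: The engine plan file is generated "
    "on an incompatible device, expecting compute 7.5 got compute 8.9, "
    "please rebuild."
)
device, engine = parse_mismatch(line)
print(arch_argument(device))  # prints "75-real"
```

If the assumption holds, a T4 would need something like `bash build.sh 75-real`, run from the WhisperFusion/docker/ directory mentioned above.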