FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model
https://funaudiollm.github.io/

What needs to be changed so that it runs correctly under torch 2.4.0? #94

Open LuWu9 opened 1 month ago

LuWu9 commented 1 month ago

❓ Questions and Help

What is your question?

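How can I use SenseVoice in a torch 2.4.0 environment?

I am building a fully local voice conversation project that combines SenseVoice + LLM + GPT-SoVITS for speech-to-speech interaction. Some of the modules require torch 2.4.0, so I would appreciate help working out how to get SenseVoice running correctly under torch 2.4.0. The error is shown in the screenshot below: image

(1) Running in a clean virtual environment with only torch upgraded to 2.4.0; Python version 3.11.2.
(2) The torchaudio version is kept in sync with torch.
(3) I tried setting weights_only=True; apart from a few fewer warning lines, nothing changed.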
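To make the torch/torchaudio pairing described in the question easy to verify, a minimal sketch like the following could be run inside the virtual environment. It is not part of SenseVoice or FunASR; the helper names `installed_version`, `release_of`, and `versions_in_sync` are hypothetical, and the check assumes the 2.x convention that paired torch and torchaudio releases share the same version number.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_of(version: str) -> str:
    """Strip a local build suffix such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def versions_in_sync(torch_version: Optional[str],
                     torchaudio_version: Optional[str]) -> bool:
    """True when both packages are installed and report the same release number.

    Assumption: matching release numbers indicate a matching pair (torch 2.x era).
    """
    if torch_version is None or torchaudio_version is None:
        return False
    return release_of(torch_version) == release_of(torchaudio_version)


if __name__ == "__main__":
    torch_v = installed_version("torch")
    audio_v = installed_version("torchaudio")
    print(f"torch={torch_v} torchaudio={audio_v} "
          f"in sync: {versions_in_sync(torch_v, audio_v)}")
```

Reading versions through `importlib.metadata` avoids importing torch itself, so the check still works if the import is what fails.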

Code

from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

import time

class ASR:
    def __init__(self):
        # Local model paths
        model_dir = "./models/SenseVoiceSmall"
        vad_model_dir = "./models/speech_fsmn_vad_zh-cn-16k-common-pytorch"

        self.model = AutoModel(
            model=model_dir,
            trust_remote_code=True,
            remote_code="./src/SenseVoice/model.py",  
            vad_model=vad_model_dir,
            vad_kwargs={"max_single_segment_time": 30000},
            device="cuda:0",
            )

    def audio2text(self, filepath):
        res = self.model.generate(
            input=filepath,
            cache={},
            language="zh",
            use_itn=True,
            batch_size_s=60,
            merge_vad=True,
            merge_length_s=15,
            )
        text = rich_transcription_postprocess(res[0]["text"])

        return text

if __name__ == "__main__":
    filepath = "./.cache/Keira.wav"
    a = ASR()
    start_time = time.time()
    text = a.audio2text(filepath)
    print("{0}({1:.2f}s)".format(text, time.time()-start_time))

What have you tried?

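In a virtual environment set up with pip install -r requirements.txt, the output is normal (apart from the torch and torchaudio versions, that environment is identical to the failing one above). image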
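One way to confirm that the working and failing environments really differ only in torch and torchaudio is to diff their `pip list` outputs. The sketch below is illustrative only: `parse_pip_list` and `diff_environments` are hypothetical helpers, and the sample listings are placeholders (the working environment's torch version is not stated in this issue, so `2.3.1` is an assumed stand-in).

```python
from typing import Dict, List, Optional, Tuple


def parse_pip_list(text: str) -> Dict[str, str]:
    """Parse the two-column table printed by `pip list` into {name: version}."""
    packages: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # blank lines, or rows with an extra column (editable installs)
        name, version = parts
        if name == "Package" or set(name) == {"-"}:
            continue  # header row and dashed separator row
        packages[name.lower()] = version
    return packages


def diff_environments(
    working: Dict[str, str], failing: Dict[str, str]
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """List (package, working_version, failing_version) for every mismatch."""
    rows = []
    for name in sorted(set(working) | set(failing)):
        a, b = working.get(name), failing.get(name)
        if a != b:
            rows.append((name, a, b))
    return rows


if __name__ == "__main__":
    # Placeholder listings; in practice, paste the real `pip list` output of each venv.
    working_txt = "Package Version\n------- -------\nnumpy 1.26.4\ntorch 2.3.1\n"
    failing_txt = "Package Version\n------- -------\nnumpy 1.26.4\ntorch 2.4.0\n"
    for name, a, b in diff_environments(parse_pip_list(working_txt),
                                        parse_pip_list(failing_txt)):
        print(f"{name}: working={a} failing={b}")
```

An empty diff apart from torch and torchaudio would support the claim that the version bump alone triggers the failure.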

What's your environment?

Package                Version
---------------------- ------------
aiofiles               23.2.1
aliyun-python-sdk-core 2.15.1
aliyun-python-sdk-kms  2.16.3
annotated-types        0.7.0
antlr4-python3-runtime 4.9.3
anyio                  4.4.0
audioread              3.0.1
certifi                2024.7.4
cffi                   1.16.0
charset-normalizer     3.3.2
click                  8.1.7
colorama               0.4.6
contourpy              1.2.1
crcmod                 1.7
cryptography           43.0.0
cycler                 0.12.1
decorator              5.1.1
editdistance           0.8.1
fastapi                0.112.0
ffmpy                  0.4.0
filelock               3.15.4
fonttools              4.53.1
fsspec                 2024.6.1
funasr                 1.1.4
gradio                 4.40.0
gradio_client          1.2.0
h11                    0.14.0
httpcore               1.0.5
httpx                  0.27.0
huggingface            0.0.1
huggingface-hub        0.24.5
hydra-core             1.3.2
idna                   3.7
importlib_resources    6.4.0
intel-openmp           2021.4.0
jaconv                 0.4.0
jamo                   0.4.1
jieba                  0.42.1
Jinja2                 3.1.4
jmespath               0.10.0
joblib                 1.4.2
kaldiio                2.18.0
kiwisolver             1.4.5
lazy_loader            0.4
librosa                0.10.2.post1
llvmlite               0.43.0
markdown-it-py         3.0.0
MarkupSafe             2.1.5
matplotlib             3.9.1
mdurl                  0.1.2
mkl                    2021.4.0
modelscope             1.17.0
mpmath                 1.3.0
msgpack                1.0.8
networkx               3.3
numba                  0.60.0
numpy                  1.26.4
omegaconf              2.3.0
orjson                 3.10.6
oss2                   2.18.6
packaging              24.1
pandas                 2.2.2
pillow                 10.4.0
pip                    22.3.1
platformdirs           4.2.2
pooch                  1.8.2
protobuf               5.27.3
pycparser              2.22
pycryptodome           3.20.0
pydantic               2.8.2
pydantic_core          2.20.1
pydub                  0.25.1
Pygments               2.18.0
pynndescent            0.5.13
pyparsing              3.1.2
python-dateutil        2.9.0.post0
python-multipart       0.0.9
pytorch-wpe            0.0.1
pytz                   2024.1
PyYAML                 6.0.1
requests               2.32.3
rich                   13.7.1
ruff                   0.5.6
scikit-learn           1.5.1
scipy                  1.14.0
semantic-version       2.10.0
sentencepiece          0.2.0
setuptools             65.5.0
shellingham            1.5.4
six                    1.16.0
sniffio                1.3.1
soundfile              0.12.1
soxr                   0.4.0
starlette              0.37.2
sympy                  1.13.1
tbb                    2021.13.0
tensorboardX           2.6.2.2
threadpoolctl          3.5.0
tomlkit                0.12.0
torch                  2.4.0
torch-complex          0.4.4
torchaudio             2.4.0
tqdm                   4.66.5
typer                  0.12.3
typing_extensions      4.12.2
tzdata                 2024.1
umap-learn             0.5.6
urllib3                2.2.2
uvicorn                0.30.5
websockets             12.0