modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

KeyError: 'vad-inference is not in the pipelines registry group auto-speech-recognition. Please make sure the correct version of ModelScope library is used.' #805

Closed Shivansh-yadav13 closed 1 year ago

Shivansh-yadav13 commented 1 year ago

I'm new to this project.

I installed it in a conda env following the docs: https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html

I was running this code:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    )
import soundfile
speech, sample_rate = soundfile.read("example/asr_example.wav")

param_dict = {"in_cache": dict(), "is_final": False}
chunk_stride = 1600  # 100 ms at 16 kHz
# first chunk, 100ms
speech_chunk = speech[0:chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
# next chunk, 100ms
speech_chunk = speech[chunk_stride:chunk_stride+chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)

but I got this error:

2023-08-04 17:20:37,339 - modelscope - INFO - PyTorch version 2.0.1 Found.
2023-08-04 17:20:37,341 - modelscope - INFO - Loading ast index from C:\Users\Admin\.cache\modelscope\ast_indexer
2023-08-04 17:20:37,459 - modelscope - INFO - Loading done! Current index file version is 1.8.1, with md5 2c63428b79490a55cf41e802cac8b49b and a total number of 893 components indexed
2023-08-04 17:20:40,571 - modelscope - INFO - Model revision not specified, use the latest revision: v1.2.0
2023-08-04 17:20:41,127 - modelscope - WARNING - ('PIPELINES', 'auto-speech-recognition', 'vad-inference') not found in ast index file
Traceback (most recent call last):
  File "N:\funny_asr\main.py", line 4, in <module>
    inference_pipeline = pipeline(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\pipelines\builder.py", line 147, in pipeline
    return build_pipeline(cfg, task_name=task)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\pipelines\builder.py", line 59, in build_pipeline
    return build_from_cfg(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\utils\registry.py", line 198, in build_from_cfg
    raise KeyError(
KeyError: 'vad-inference is not in the pipelines registry group auto-speech-recognition. Please make sure the correct version of ModelScope library is used.'

ENV:
OS: Win11
Python: 3.8
pip list:

absl-py 1.4.0
addict 2.4.0
aiohttp 3.8.5
aiosignal 1.3.1
aliyun-python-sdk-core 2.13.36
aliyun-python-sdk-kms 2.16.1
aniso8601 9.0.1
annotated-types 0.5.0
appdirs 1.4.4
async-timeout 4.0.2
attrs 23.1.0
audioread 3.0.0
cachetools 5.3.1
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 3.2.0
click 8.0.4
colorama 0.4.6
coloredlogs 14.0
crcmod 1.7
cryptography 41.0.3
datasets 2.13.0
decorator 5.1.1
dill 0.3.6
Distance 0.1.3
dnspython 2.4.1
edit-distance 1.0.6
editdistance 0.6.2
einops 0.6.1
espnet-tts-frontend 0.0.3
et-xmlfile 1.1.0
eventlet 0.33.3
filelock 3.12.2
Flask 2.1.3
Flask-Cors 4.0.0
Flask-RESTful 0.3.10
Flask-SocketIO 4.3.2
flask-talisman 1.1.0
frozenlist 1.4.0
fsspec 2023.6.0
funasr 0.7.1
g2p 1.1.20230511
g2p-en 2.1.0
gast 0.5.4
google-auth 2.22.0
google-auth-oauthlib 1.0.0
greenlet 2.0.2
grpcio 1.56.2
h5py 3.9.0
huggingface-hub 0.16.4
humanfriendly 10.0
idna 3.4
importlib-metadata 6.8.0
inflect 7.0.0
itsdangerous 2.1.2
jaconv 0.3.4
jamo 0.4.1
jieba 0.42.1
Jinja2 3.1.2
jmespath 0.10.0
joblib 1.3.1
kaldiio 2.18.0
lazy_loader 0.3
librosa 0.10.0.post2
llvmlite 0.40.1
Markdown 3.4.4
MarkupSafe 2.1.3
modelscope 1.8.1
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
multiprocess 0.70.14
munkres 1.1.4
networkx 2.8.4
nltk 3.8.1
numba 0.57.1
numpy 1.22.0
oauthlib 3.2.2
openpyxl 3.1.2
oss2 2.18.1
packaging 23.1
pandas 1.3.5
panphon 0.20.0
Pillow 10.0.0
pip 23.2.1
platformdirs 3.10.0
pooch 1.6.0
protobuf 4.23.4
pyarrow 12.0.1
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycparser 2.21
pycryptodome 3.18.0
pydantic 2.1.1
pydantic_core 2.4.0
pypinyin 0.49.0
pyreadline3 3.4.1
python-dateutil 2.8.2
python-engineio 3.14.2
python-socketio 4.6.1
pytorch-wpe 0.0.1
pytz 2023.3
PyYAML 6.0.1
regex 2023.6.3
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
scikit-learn 1.3.0
scipy 1.10.1
sentencepiece 0.1.99
setuptools 68.0.0
simplejson 3.19.1
six 1.16.0
sortedcontainers 2.4.0
soundfile 0.12.1
soxr 0.3.5
sympy 1.12
tensorboard 2.13.0
tensorboard-data-server 0.7.1
text-unidecode 1.3
TextGrid 1.5
threadpoolctl 3.2.0
tomli 2.0.1
torch 2.0.1
torch-complex 0.4.3
torchaudio 2.0.2
tqdm 4.65.0
typing_extensions 4.7.1
unicodecsv 0.14.1
Unidecode 1.3.6
urllib3 1.26.16
Werkzeug 2.0.3
wheel 0.38.4
xxhash 3.3.0
yapf 0.40.1
yarl 1.9.2
zipp 3.16.2

LauraGPT commented 1 year ago

Please provide details of your environment:

OS: [e.g. linux]
Python/C++ version:
Package versions: pytorch, torchaudio, modelscope, funasr (from pip list)
Model:
Command:
Details:
Error log:

LauraGPT commented 1 year ago

Sorry, the mistake in the docs has been fixed: https://github.com/alibaba-damo-academy/FunASR/commit/a2a874b403f9c050fb608cb07842de02ba852008
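
The KeyError in the first post names both halves of the failed lookup: the task group (auto-speech-recognition) and the pipeline type the model asks for (vad-inference). The toy lookup below is written for illustration only and is not taken from ModelScope's source; it sketches why a VAD model has to be paired with the voice-activity-detection task. The REGISTRY table and the lookup helper are hypothetical. The names inside them come from the log messages in this thread, and placing asr-inference under auto-speech-recognition is an assumption.

```python
# Toy stand-in for a pipeline registry; NOT ModelScope's actual implementation.
# Group and pipeline names are copied from the log messages in this thread;
# the asr-inference / auto-speech-recognition pairing is an assumption.
REGISTRY = {
    "auto-speech-recognition": {"asr-inference"},
    "voice-activity-detection": {"vad-inference"},
}


def lookup(task: str, pipeline_type: str) -> str:
    """Return a label for the pipeline, or raise KeyError when the pair is unknown."""
    if pipeline_type not in REGISTRY.get(task, set()):
        raise KeyError(
            f"{pipeline_type} is not in the pipelines registry group {task}."
        )
    return f"{task}/{pipeline_type}"


# The VAD model asks for 'vad-inference', so the task has to be the VAD task.
print(lookup("voice-activity-detection", "vad-inference"))

try:
    lookup("auto-speech-recognition", "vad-inference")  # the pairing from the first post
except KeyError as err:
    print("KeyError:", err)
```

In this sketch the model, not the caller, decides the pipeline type, so the only thing the caller can get wrong is the task.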

Shivansh-yadav13 commented 1 year ago

@langgz

running

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch',
)

segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
print(segments_result)

I got this error. Do you know what could be the issue?

requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

LauraGPT commented 1 year ago

Please post the complete log.

Shivansh-yadav13 commented 1 year ago

2023-08-04 18:29:54,626 - modelscope - INFO - PyTorch version 2.0.1 Found.
2023-08-04 18:29:54,629 - modelscope - INFO - Loading ast index from C:\Users\Admin\.cache\modelscope\ast_indexer
2023-08-04 18:29:54,807 - modelscope - INFO - Loading done! Current index file version is 1.8.1, with md5 2c63428b79490a55cf41e802cac8b49b and a total number of 893 components indexed
2023-08-04 18:29:58,918 - modelscope - INFO - Model revision not specified, use the latest revision: v1.2.0
2023-08-04 18:29:59,624 - modelscope - INFO - initiate model from C:\Users\Admin\.cache\modelscope\hub\damo\speech_fsmn_vad_zh-cn-16k-common-pytorch
2023-08-04 18:29:59,625 - modelscope - INFO - initiate model from location C:\Users\Admin\.cache\modelscope\hub\damo\speech_fsmn_vad_zh-cn-16k-common-pytorch.
2023-08-04 18:29:59,658 - modelscope - INFO - initialize model from C:\Users\Admin\.cache\modelscope\hub\damo\speech_fsmn_vad_zh-cn-16k-common-pytorch
2023-08-04 18:29:59,756 - modelscope - WARNING - No preprocessor field found in cfg.
2023-08-04 18:29:59,756 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-08-04 18:29:59,756 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\Admin\\.cache\\modelscope\\hub\\damo\\speech_fsmn_vad_zh-cn-16k-common-pytorch'}. trying to build by task and model information.
2023-08-04 18:29:59,756 - modelscope - WARNING - No preprocessor key ('generic-asr', 'voice-activity-detection') found in PREPROCESSOR_MAP, skip building preprocessor.
2023-08-04 18:29:59,786 - modelscope - INFO - cuda is not available, using cpu instead.
Traceback (most recent call last):
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 403, in _make_request
    self._validate_conn(conn)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 1053, in _validate_conn
    conn.connect()
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\ssl.py", line 1040, in _create
    self.do_handshake()
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 798, in urlopen
    retries = retries.increment(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\util\retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\packages\six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 403, in _make_request
    self._validate_conn(conn)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connectionpool.py", line 1053, in _validate_conn
    conn.connect()
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\ssl.py", line 1040, in _create
    self.do_handshake()
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "N:\funny_asr\main.py", line 10, in <module>
    segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\pipelines\audio\voice_activity_detection_pipeline.py", line 114, in __call__
    self.audio_in, self.raw_inputs = generate_scp_from_url(audio_in)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\utils\audio\audio_utils.py", line 225, in generate_scp_from_url
    data = storage.read(url)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\fileio\file.py", line 129, in read
    r = requests.get(url)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\requests\adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

LauraGPT commented 1 year ago

Maybe you could download the model to a local path via git lfs. Refer to the docs.
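
In the traceback above the model has already been loaded from the local cache; the failing call is requests.get(url) inside generate_scp_from_url, which is the download of the audio_in URL. One possible workaround, sketched here under that reading, is to fetch the wav file separately with a few retries and pass the local file to the pipeline. The helper names (fetch_bytes, download_with_retries) and the retry settings are hypothetical, and retries will not help if the host keeps resetting the connection from this network.

```python
import time
import urllib.request
from typing import Callable


def fetch_bytes(url: str) -> bytes:
    """Download a URL with the standard library (the default fetcher)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def download_with_retries(
    url: str,
    dest: str,
    attempts: int = 3,
    delay_s: float = 2.0,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> str:
    """Save `url` to `dest`, retrying on connection errors; returns `dest`."""
    for attempt in range(1, attempts + 1):
        try:
            data = fetch(url)
        except OSError:  # ConnectionResetError and URLError are OSError subclasses
            if attempt == attempts:
                raise
            time.sleep(delay_s)
        else:
            with open(dest, "wb") as fh:
                fh.write(data)
            return dest
    raise ValueError("attempts must be at least 1")


# Intended use (not executed here; inference_pipeline is the object built in the post above):
# wav_path = download_with_retries(
#     "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav",
#     "vad_example.wav",
# )
# segments_result = inference_pipeline(audio_in=wav_path)
```

The fetch parameter exists only so the retry logic can be exercised without network access.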

Shivansh-yadav13 commented 1 year ago

@langgz

Now I'm using the local path

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

local_dir_root = "./speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model=local_dir_root,
)

segments_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/vad_example.wav')
print(segments_result)

but now I'm getting this error:

2023-08-05 13:12:10,503 - modelscope - INFO - PyTorch version 2.0.1 Found.
2023-08-05 13:12:10,504 - modelscope - INFO - Loading ast index from C:\Users\Admin\.cache\modelscope\ast_indexer
2023-08-05 13:12:10,612 - modelscope - INFO - Loading done! Current index file version is 1.8.1, with md5 2c63428b79490a55cf41e802cac8b49b and a total number of 893 components indexed
2023-08-05 13:12:11,970 - modelscope - WARNING - ('PIPELINES', 'voice-activity-detection', 'asr-inference') not found in ast index file
Traceback (most recent call last):
  File "N:\funny_asr\main.py", line 5, in <module>
    inference_pipeline = pipeline(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\pipelines\builder.py", line 147, in pipeline
    return build_pipeline(cfg, task_name=task)
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\pipelines\builder.py", line 59, in build_pipeline
    return build_from_cfg(
  File "C:\Users\Admin\anaconda3\envs\funasr\lib\site-packages\modelscope\utils\registry.py", line 198, in build_from_cfg
    raise KeyError(
KeyError: 'asr-inference is not in the pipelines registry group voice-activity-detection. Please make sure the correct version of ModelScope library is used.'

Also, @langgz, can it do this? I want to implement https://github.com/alibaba-damo-academy/FunASR/discussions/804
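
The warning in this log names the pair that was looked up: task voice-activity-detection with pipeline type asr-inference, which suggests the local directory holds an ASR model while the task asks for VAD. A quick way to check a local model before building a pipeline might be to read its configuration.json (the file the earlier warnings mention). The sketch below assumes, without confirmation from this thread, that the file has a top-level "task" key and a "pipeline": {"type": ...} entry; describe_model_dir is a hypothetical helper.

```python
import json
from pathlib import Path


def describe_model_dir(model_dir: str) -> dict:
    """Read configuration.json from a local model directory.

    Assumes (not confirmed in this thread) a top-level "task" key and a
    "pipeline": {"type": ...} entry; missing keys are reported as None.
    """
    cfg_path = Path(model_dir, "configuration.json")
    cfg = json.loads(cfg_path.read_text(encoding="utf-8"))
    return {
        "task": cfg.get("task"),
        "pipeline_type": cfg.get("pipeline", {}).get("type"),
    }


# Intended use (path taken from the post above; not executed here):
# print(describe_model_dir("./speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"))
```

If the reported task differs from the one passed to pipeline(), that mismatch is the likely cause of the KeyError.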

LauraGPT commented 1 year ago

804

The example in the docs is a demo of how to run inference from a local model. You should change the task and model name to the ones you want to use. For example:

# Run this in a shell first to download the model (requires git lfs):
#   git clone https://www.modelscope.cn/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch.git

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import soundfile

local_dir_root = "./speech_fsmn_vad_zh-cn-16k-common-pytorch"
inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model=local_dir_root,
)

# read the example audio from the model directory
speech, sample_rate = soundfile.read("{}/example/asr_example.wav".format(local_dir_root))

# the same param_dict is passed to every call
param_dict = {"in_cache": dict(), "is_final": False}
chunk_stride = 1600  # 100 ms at 16 kHz
# first chunk, 100ms
speech_chunk = speech[0:chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
# next chunk, 100ms
speech_chunk = speech[chunk_stride:chunk_stride+chunk_stride]
rec_result = inference_pipeline(audio_in=speech_chunk, param_dict=param_dict)
print(rec_result)
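
The snippet above feeds two hand-sliced chunks. To stream a whole file, the same slicing can be put in a loop. The sketch below covers only the chunking arithmetic: 1600 samples is 100 ms at the 16 kHz rate implied by the "16k" in the model name. iter_chunks is a hypothetical helper, and setting is_final to True on the last chunk is an assumption about how the streaming flag is meant to be used, not something confirmed in this thread.

```python
from typing import Iterator, Sequence, Tuple

SAMPLE_RATE = 16000  # Hz, implied by the "16k" in the model name
CHUNK_MS = 100       # chunk duration used in the example above


def iter_chunks(samples: Sequence, chunk_stride: int) -> Iterator[Tuple[Sequence, bool]]:
    """Yield (chunk, is_final) pairs covering `samples` in order."""
    total = len(samples)
    for start in range(0, total, chunk_stride):
        end = min(start + chunk_stride, total)
        yield samples[start:end], end == total


chunk_stride = SAMPLE_RATE * CHUNK_MS // 1000  # 1600 samples per 100 ms

# Intended use with the pipeline above (not executed here):
# param_dict = {"in_cache": dict(), "is_final": False}
# for chunk, is_final in iter_chunks(speech, chunk_stride):
#     param_dict["is_final"] = is_final  # assumption: flag the last chunk
#     print(inference_pipeline(audio_in=chunk, param_dict=param_dict))
```

The helper relies only on len() and slicing, so it should work on the NumPy array returned by soundfile.read as well as on plain lists.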