Closed MonolithFoundation closed 2 weeks ago
Please update to funasr 1.1.12:
https://github.com/modelscope/FunASR/tree/main/examples/industrial_data_pretraining/whisper
thanks for the quick response!
s exceeded with url: /funasr/ (Caused by SSLError(SSLError(1, '[SSL] record layer failure (_ssl.c:1006)'))) - skipping
ERROR: Could not find a version that satisfies the requirement funasr==1.1.12 (from versions: 0.3.1, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.6, 0.4.7, 0.4.8, 0.5.0, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.5.5, 0.5.6, 0.5.8, 0.6.0, 0.6.1, 0.6.2, 0.6.3, 0.6.4, 0.6.5, 0.6.6, 0.6.7, 0.6.9, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.7.5, 0.7.6, 0.7.7, 0.7.8, 0.7.9, 0.8.0, 0.8.1, 0.8.2, 0.8.3, 0.8.4, 0.8.6, 0.8.7, 0.8.8, 1.0.0, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.0.11, 1.0.12, 1.0.14, 1.0.15, 1.0.16, 1.0.17, 1.0.18, 1.0.19, 1.0.20, 1.0.21, 1.0.22, 1.0.23, 1.0.24, 1.0.25, 1.0.26, 1.0.27, 1.0.28, 1.0.29, 1.0.30, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.8, 1.1.9, 1.1.11)
ERROR: No matching distribution found for funasr==1.1.12
After installing from git:
Please update funasr again and re-try it: https://github.com/modelscope/FunASR/commit/cd684580991661b9a088361bea2d7f00735178d3
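Assuming a standard pip setup, installing FunASR pinned to that commit could look like the following one-liner (the hash is taken from the link above; this is a command fragment, and network access to GitHub is required):

```shell
pip install "git+https://github.com/modelscope/FunASR.git@cd684580991661b9a088361bea2d7f00735178d3"
```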
After installing from git:
- Authentication token does not exist, failed to access model Whisper-large-v3-turbo which may not exist or may be private. Please login first.
modelscope login --token YOUR_MODELSCOPE_SDK_TOKEN
You can get the SDK token on Home page, https://modelscope.cn/my/myaccesstoken.
Hi, how do I deal with this error?
File "/tests/test_speakersep.py", line 97, in get_asr_spk
    res = self.model.generate(
File "/FunASR/funasr/auto/auto_model.py", line 303, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
File "/FunASR/funasr/auto/auto_model.py", line 553, in inference_with_vad
    sv_output = postprocess(all_segments, None, labels, spk_embedding.cpu())
File "FunASR/funasr/models/campplus/utils.py", line 117, in postprocess
    assert len(segments) == len(labels)
AssertionError
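The assertion in `funasr/models/campplus/utils.py` fires when the number of speech segments handed to `postprocess` differs from the number of speaker-cluster labels. A toy illustration of the failing invariant (my own sketch, not FunASR's actual code):

```python
def postprocess_check(segments, labels):
    """Toy version of the invariant asserted in campplus postprocess():
    each speech segment must come with exactly one speaker-cluster label."""
    if len(segments) != len(labels):
        raise AssertionError(f"{len(segments)} segments vs {len(labels)} labels")
    return list(zip(segments, labels))

# Matched counts pair up cleanly.
print(postprocess_check([(0.0, 1.2), (1.2, 3.4)], ["spk0", "spk1"]))

# One extra segment (e.g. a segment re-split after clustering) trips the check.
try:
    postprocess_check([(0.0, 1.2), (1.2, 3.4), (3.4, 5.0)], ["spk0", "spk1"])
except AssertionError as exc:
    print("mismatch:", exc)
```

So the traceback means the segmentation and the clustering stages disagreed about how many segments there are, not that any single segment is malformed.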
Hello, would anyone like to help with this? Currently WhisperTurbo is not stable at all.
Just use it by following the demos; other usages are not supported for now: https://github.com/modelscope/FunASR/tree/main/examples/industrial_data_pretraining/whisper
Could the labels/segments mismatch be caused by this? vad_kwargs={"max_single_segment_time": 30000}
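One way the counts could drift apart (an illustrative guess, not traced through the FunASR source): segments longer than `max_single_segment_time` get split into 30000 ms chunks, so the segment count after splitting no longer matches a label count computed before it. A toy sketch of such a cap:

```python
def cap_segments(segments, max_ms=30000):
    """Split any (start_ms, end_ms) segment longer than max_ms into
    max_ms-sized chunks, leaving shorter segments untouched."""
    out = []
    for start, end in segments:
        while end - start > max_ms:
            out.append((start, start + max_ms))
            start += max_ms
        out.append((start, end))
    return out

segs = [(0, 10000), (10000, 75000)]  # the second segment is 65 s long
print(cap_segments(segs))  # 2 input segments become 4 after capping
```

If two labels were produced for the two original segments, the four capped segments would then trip the `len(segments) == len(labels)` assertion above.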
Why isn't speaker recognition supported for Whisper?
same for me
Whisper models lack timestamps for speaker recognition.
The latest turbo version could be made to predict timestamps, according to its model card on huggingface.co.
The timestamps Whisper produces are sentence-level, but speaker recognition needs word-level timestamps. If you are interested in that, you could implement it yourself.
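To see why the granularity matters, here is a toy example with invented data: with only a sentence-level timestamp, every word in a sentence is forced onto whichever speaker holds the floor at the sentence start, so a mid-sentence speaker change is lost; word-level timestamps let each word be matched against the diarization turns individually.

```python
# Diarization turns: (start_s, end_s, speaker) -- invented data.
turns = [(0.0, 2.0, "A"), (2.0, 4.0, "B")]

def speaker_at(t):
    """Return the speaker whose turn covers time t."""
    for start, end, spk in turns:
        if start <= t < end:
            return spk
    return None

# Word-level timestamps: each word is mapped independently.
words = [("hello", 0.5), ("there", 1.5), ("yes", 2.5), ("indeed", 3.5)]
word_level = [(w, speaker_at(t)) for w, t in words]
print(word_level)      # the speaker change at t=2.0 is captured

# Sentence-level timestamp: the whole sentence takes the speaker
# at its start time, so B's words get misattributed to A.
sentence_start = 0.5
sentence_level = [(w, speaker_at(sentence_start)) for w, _ in words]
print(sentence_level)  # every word lands on speaker A
```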
Hi, what if we run a VAD model first?
thanks, very impressive
Hi, I still don't understand: why must speaker recognition be word-level?
Support Whisper-v3-large-turbo