wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Apache License 2.0

Got AttributeError: 'RecursiveScriptModule' object has no attribute 'get_speech_timestamps' when running wespeaker diarization #335

Status: Closed (ryanwong00, 3 months ago)

ryanwong00 commented 3 months ago

Hi all,

I am new to wespeaker. I followed the README doc to do the installation: pip install git+https://github.com/wenet-e2e/wespeaker.git

When I run the diarization: wespeaker --task diarization --audio_file ...

Some packages were still missing. I then installed them one by one: pip install pyyaml requests scipy scikit-learn pysoundfile

After I installed them all, I got the following error:

(venv) C:\temp\dev>wespeaker --task diarization --audio_file c:\temp\test.wav 
WARNING:root:unexpected tensor: projection.weight
Traceback (most recent call last):
  File "C:\temp\dev\venv\Scripts\wespeaker-script.py", line 33, in <module>
    sys.exit(load_entry_point('wespeaker==0.0.0', 'console_scripts', 'wespeaker')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\temp\dev\venv\Lib\site-packages\wespeaker\cli\speaker.py", line 346, in main
    diar_result = model.diarize(args.audio_file)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\temp\dev\venv\Lib\site-packages\wespeaker\cli\speaker.py", line 220, in diarize
    vad_segments = self.vad.get_speech_timestamps(audio_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\temp\dev\venv\Lib\site-packages\torch\jit\_script.py", line 823, in __getattr__
    return super().__getattr__(attr)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\temp\dev\venv\Lib\site-packages\torch\jit\_script.py", line 530, in __getattr__
    return super().__getattr__(attr)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\temp\dev\venv\Lib\site-packages\torch\nn\modules\module.py", line 1709, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'RecursiveScriptModule' object has no attribute 'get_speech_timestamps'

I am running on Windows 10 with Python 3.11.3. Can anyone help?

Thanks in advance, Ryan

qdhansonlin commented 3 months ago

Open speaker.py and find the diarize method:

    def diarize(self, audio_path: str, utt: str = "unk"):
        pcm, sample_rate = torchaudio.load(audio_path, normalize=False)

Change the VAD step that follows it to:

    wav = read_audio(audio_path)
    # 1. vad
    vad_segments = get_speech_timestamps(wav, self.vad, return_seconds=True)
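If you need the same code to run against both VAD versions, something like the following might work. It is only a sketch: `run_vad` is a hypothetical helper, not part of wespeaker, and `read_audio` and `get_speech_timestamps` are passed in as arguments because this thread does not say where they are imported from. The arguments of the old method call are cut off in the traceback, so passing `return_seconds=True` to it is an assumption.

```python
def run_vad(vad_model, audio_path, read_audio, get_speech_timestamps):
    """Return VAD segments using whichever calling style the model supports.

    Hypothetical helper for illustration; not part of wespeaker.
    """
    # Older style: the loaded model object exposes the method itself.
    # getattr with a default swallows the AttributeError seen in the traceback.
    method = getattr(vad_model, "get_speech_timestamps", None)
    if callable(method):
        # Assumption: the old call accepted return_seconds=True.
        return method(audio_path, return_seconds=True)

    # Newer style (the fix above): load the audio first, then call the
    # free function with the model as its second argument.
    wav = read_audio(audio_path)
    return get_speech_timestamps(wav, vad_model, return_seconds=True)
```

Inside diarize this would be called as `vad_segments = run_vad(self.vad, audio_path, read_audio, get_speech_timestamps)`.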

I think the VAD update to v5.0 is causing this bug.
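To check whether you are affected, you could look at which version of the VAD package is installed in your venv. This is a rough sketch using only the standard library. The distribution name `silero-vad` is my assumption (it is not stated anywhere in this thread), and both helper functions are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Extract the leading integer from a version string such as '5.0'."""
    return int(version.split(".")[0])


# Assumption: the VAD dependency is installed under the name "silero-vad".
version = installed_version("silero-vad")
if version is None:
    print("VAD package not found under the assumed name")
elif major_version(version) >= 5:
    print(f"VAD {version}: the old method-style call will likely fail")
else:
    print(f"VAD {version}: older than v5.0")
```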

ryanwong00 commented 3 months ago

It's working now. Thanks a lot!