microsoft / UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Error on pip install, and cannot import name 'Wav2VecModel' from 'fairseq.models.wav2vec' #26

Open leijue222 opened 2 years ago

leijue222 commented 2 years ago

The base conda env was created with: conda create -n unispeech python=3.8. When I run pip install --require-hashes -r requirements.txt under speaker_verification, the following error appears:

Collecting numpy<1.23.0,>=1.16.5
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    numpy<1.23.0,>=1.16.5 from https://files.pythonhosted.org/packages/2f/14/abc14a3f3663739e5d3c8fd980201d10788d75fea5b0685734227052c4f0/numpy-1.22.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=64f56fc53a2d18b1924abd15745e30d82a5782b2cab3429aceecc6875bd5add0 (from scipy==1.7.1->-r requirements.txt (line 1))

Then I installed the environment manually.
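
In hash-checking mode pip requires every requirement, including transitive dependencies such as the numpy pulled in by scipy here, to be pinned with ==. The following is a minimal sketch of that rule, not part of pip or UniSpeech: unpinned_requirements is a hypothetical helper and the sample text is illustrative, not the repository's actual requirements.txt.

```python
import re

# Hypothetical helper: a specifier counts as pinned only if it looks like
# "name==version" (extras allowed), which is what --require-hashes demands.
PINNED = re.compile(r"^[A-Za-z0-9._\-\[\],]+\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement specifiers in `text` that lack an exact '==' pin."""
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and line-continuation backslashes.
        line = raw.split(" #", 1)[0].strip().rstrip("\\").strip()
        # Skip blanks, comment lines, and option lines such as --hash=...
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Illustrative input only; the numpy range is the one from the error message.
sample = """\
scipy==1.7.1
numpy<1.23.0,>=1.16.5
# a comment
torchaudio==0.9.0
"""
print(unpinned_requirements(sample))
```

Any specifier this reports would need an exact pin (and a hash) in the requirements file before --require-hashes can succeed; dropping the flag is the other way out.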

leijue222 commented 2 years ago

When I run python verification.py --model_name unispeech_sat --wav1 vox1_data/David_Faustino/hn8GyCJIfLM_0000012.wav --wav2 vox1_data/Josh_Gad/HXUqYaOwrxA_0000015.wav --checkpoint $checkpoint_path, the latest torchaudio turns out to be incompatible:

  File "/root/miniconda3/envs/unispeech/lib/python3.8/site-packages/s3prl/upstream/baseline/preprocessor.py", line 18, in <module>
    from torchaudio.functional import magphase, compute_deltas
ImportError: cannot import name 'magphase' from 'torchaudio.functional' (/root/miniconda3/envs/unispeech/lib/python3.8/site-packages/torchaudio/functional/__init__.py)
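
A quick way to see whether an installed package still exposes the names a dependency imports is to probe for them before running the full script. This is only a sketch: has_symbol is a hypothetical helper, and the module and symbol names are the ones from the traceback above.

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True if `module_name` can be imported and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError):
        # Module missing, or its native libraries failed to load.
        return False
    return hasattr(module, symbol)


# Names taken from the s3prl import that fails in the traceback.
for name in ("magphase", "compute_deltas"):
    print(name, has_symbol("torchaudio.functional", name))
```

A False for either name means the s3prl import in the traceback will fail against that torchaudio build.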

After downgrading torchaudio to version 0.9.1 it gets past that point, but a new error appears:

  File "/root/miniconda3/envs/unispeech/lib/python3.8/site-packages/s3prl/upstream/wav2vec/expert.py", line 19, in <module>
    from fairseq.models.wav2vec import Wav2VecModel
ImportError: cannot import name 'Wav2VecModel' from 'fairseq.models.wav2vec' (/apdcephfs/private_yiweiding/project/UniSpeech/src/fairseq/models/wav2vec/__init__.py)

leijue222 commented 2 years ago

Running python verification.py --model_name ecapa_tdnn --wav1 vox1_data/David_Faustino/hn8GyCJIfLM_0000012.wav --wav2 vox1_data/Josh_Gad/HXUqYaOwrxAcle_0000015.wav --checkpoint checkpoint/ecapa-tdnn.pth produces the same error:

from fairseq.models.wav2vec import Wav2VecModel
ImportError: cannot import name 'Wav2VecModel' from 'fairseq.models.wav2vec'

Sanyuan-Chen commented 2 years ago

Hi @leijue222 ,

Could you replace this line https://github.com/microsoft/UniSpeech/blob/e3043e2021d49429a406be09b9b8432febcdec73/downstreams/speaker_verification/models/ecapa_tdnn.py#L196 with self.feature_extract = torch.hub.load('s3prl/s3prl:e52439edaeb1a443e82960e6401ae6ab4241def6', feat_type) and try again?

I found that the latest version of the s3prl code raises this ImportError, whereas the older version simply skips it.

bryant0918 commented 1 year ago

I am getting this same error but with the WavLM model. After inserting the above line:

self.feature_extract = torch.hub.load('s3prl/s3prl:e52439edaeb1a443e82960e6401ae6ab4241def6', feat_type)

I get the following error:

    raise ValueError(f'Cannot find {branch} in https://github.com/{repo_owner}/{repo_name}. '
ValueError: Cannot find e52439edaeb1a443e82960e6401ae6ab4241def6 in https://github.com/s3prl/s3prl. If it's a commit from a forked repo, please call hub.load() with forked repo directly.

Mine worked without that addition for the UniSpeech-SAT model, but does yours work with the WavLM-Large model, @leijue222 @Sanyuan-Chen?

Which earlier version was it that worked?

Thanks

Sanyuan-Chen commented 1 year ago

Hi @bryant0918 ,

I can call this function successfully; the logs are shown below:

Python 3.8.0 (default, Nov  6 2019, 21:49:08)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> feature_extract  = torch.hub.load('s3prl/s3prl:e52439edaeb1a443e82960e6401ae6ab4241def6', 'wavlm')
Downloading: "https://github.com/s3prl/s3prl/archive/e52439edaeb1a443e82960e6401ae6ab4241def6.zip" to /home/user/.cache/torch/hub/e52439edaeb1a443e82960e6401ae6ab4241def6.zip
################################################################################
### WARNING, path does not exist: KALDI_ROOT=/mnt/matylda5/iveselyk/Tools/kaldi-trunk
###          (please add 'export KALDI_ROOT=<your_path>' in your $HOME/.profile)
###          (or run as: KALDI_ROOT=<your_path> python <your_script>.py)
################################################################################

Importing the dtw module. When using in academic works please cite:
  T. Giorgino. Computing and Visualizing Dynamic Time Warping Alignments in R: The dtw Package.
  J. Stat. Soft., doi:10.18637/jss.v031.i07.

[s3prl.downstream.experts] Warning: can not import s3prl.downstream.a2a-vc-vctk.expert: No module named 'resemblyzer'. Pass.
--2022-07-15 07:12:52--  https://msranlcmtteamdrive.blob.core.windows.net/share/wavlm/WavLM-Base+.pt?sv=2020-04-08&st=2021-11-05T00%3A34%3A47Z&se=2022-10-06T00%3A34%3A00Z&sr=b&sp=r&sig=Gkf1IByHaIn1t%2FVEd9D6WHjZ3zu%2Fk5eSdoj21UytKro%3D
Resolving msranlcmtteamdrive.blob.core.windows.net (msranlcmtteamdrive.blob.core.windows.net)... 52.239.152.234
Connecting to msranlcmtteamdrive.blob.core.windows.net (msranlcmtteamdrive.blob.core.windows.net)|52.239.152.234|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 377604817 (360M) [application/octet-stream]
Saving to: ‘/home/user/.cache/torch/hub/s3prl_cache/9befb105e1905436d79a980ef7d0be3ace1676978cda0cb1261e9d7e474f11d1’

/home/user/.cache/torch/hub/s3prl_cache/9befb105e1905436d79a980ef7 100%[==================================================================================================================================================================>] 360.11M  31.2MB/s    in 10s

2022-07-15 07:13:02 (35.5 MB/s) - ‘/home/user/.cache/torch/hub/s3prl_cache/9befb105e1905436d79a980ef7d0be3ace1676978cda0cb1261e9d7e474f11d1’ saved [377604817/377604817]

>>>

The corresponding version is https://github.com/s3prl/s3prl/tree/e52439edaeb1a443e82960e6401ae6ab4241def6
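
To check by hand that a pinned ref resolves, the hub spec can be turned into the archive URL that appears in the download log above. This is a sketch, not torch.hub's own code: parse_hub_spec and archive_url are hypothetical helpers, and the URL pattern is simply the one printed in the log.

```python
def parse_hub_spec(spec):
    """Split 'owner/repo[:ref]' into (owner, repo, ref); ref is None if absent."""
    repo_part, _, ref = spec.partition(":")
    owner, sep, repo = repo_part.partition("/")
    if not sep or not owner or not repo:
        raise ValueError(f"expected 'owner/repo[:ref]', got {spec!r}")
    return owner, repo, ref or None


def archive_url(spec):
    """Build the GitHub zip-archive URL for a spec that names an explicit ref."""
    owner, repo, ref = parse_hub_spec(spec)
    if ref is None:
        raise ValueError("an explicit ref (branch, tag, or commit) is required")
    return f"https://github.com/{owner}/{repo}/archive/{ref}.zip"


print(archive_url("s3prl/s3prl:e52439edaeb1a443e82960e6401ae6ab4241def6"))
# prints https://github.com/s3prl/s3prl/archive/e52439edaeb1a443e82960e6401ae6ab4241def6.zip
```

If that URL downloads in a browser, the commit itself is reachable, so a "Cannot find" error more likely comes from how the installed torch version validates the ref.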

bryant0918 commented 1 year ago

Okay, thank you. It's still not working for me. I think it's an issue with PyTorch. I see you're using python=3.8, and the requirements.txt says to use torchaudio=0.9.0. But what version of PyTorch are you using?

bryant0918 commented 1 year ago

I am now able to find the branch with torch=1.9.0+cu111 and python=3.8.13. However, now I get the following error:

File "C:\Users\bryan\anaconda3\envs\ss\lib\site-packages\torch\serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input

According to this thread, this appears to happen because it is trying to load an empty file. Right before the error is thrown:

print(f.read())
b' '

I tried different versions in the following line with no success, even though the file is not empty when I download it manually from the site:

self.feature_extract = torch.hub.load('s3prl/s3prl:<version>', 'wavlm_large')
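
An EOFError of this kind often points at an empty or truncated checkpoint left in the download cache, so listing suspiciously small cached files is a cheap first check. The sketch below assumes the cache directory shown in the earlier log (~/.cache/torch/hub/s3prl_cache); on other systems the location may differ (torch.hub.get_dir() reports the hub directory), and find_small_files is a hypothetical helper.

```python
import os


def find_small_files(root, max_size=0):
    """Return sorted paths under `root` whose size is at most `max_size` bytes."""
    small = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Unreadable entry (for example a broken symlink); skip it.
                continue
            if size <= max_size:
                small.append(path)
    return sorted(small)


# Assumed cache location, copied from the log earlier in this thread.
cache_dir = os.path.expanduser("~/.cache/torch/hub/s3prl_cache")
for path in find_small_files(cache_dir, max_size=1):
    print("suspiciously small:", path)
```

Deleting any file it reports forces a fresh download on the next torch.hub.load call.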

Any ideas, @Sanyuan-Chen?