k2-fsa / k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.
https://k2-fsa.github.io/k2
Apache License 2.0

Colab problem installing k2 #1054

Open danpovey opened 2 years ago

danpovey commented 2 years ago

A problem installing k2 in colab: https://colab.research.google.com/drive/15FSAIx7dND2xcZW9ZOZmffhbgmIH2zNS?usp=sharing

! pip install torch==1.7.1+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.7.1+cu110
  Downloading https://download.pytorch.org/whl/cu110/torch-1.7.1%2Bcu110-cp37-cp37m-linux_x86_64.whl (1156.8 MB)
     |███████████████████████         | 834.1 MB 1.3 MB/s eta 0:04:10tcmalloc: large alloc 1147494400 bytes == 0x39c94000 @  0x7ff0d4023615 0x592b76 0x4df71e 0x59afff 0x515655 0x549576 0x593fce 0x548ae9 0x51566f 0x549576 0x593fce 0x548ae9 0x5127f1 0x598e3b 0x511f68 0x598e3b 0x511f68 0x598e3b 0x511f68 0x4bc98a 0x532e76 0x594b72 0x515600 0x549576 0x593fce 0x548ae9 0x5127f1 0x549576 0x593fce 0x5118f8 0x593dd7
     |█████████████████████████████▏  | 1055.7 MB 1.1 MB/s eta 0:01:29tcmalloc: large alloc 1434370048 bytes == 0x7e2ea000 @  0x7ff0d4023615 0x592b76 0x4df71e 0x59afff 0x515655 0x549576 0x593fce 0x548ae9 0x51566f 0x549576 0x593fce 0x548ae9 0x5127f1 0x598e3b 0x511f68 0x598e3b 0x511f68 0x598e3b 0x511f68 0x4bc98a 0x532e76 0x594b72 0x515600 0x549576 0x593fce 0x548ae9 0x5127f1 0x549576 0x593fce 0x5118f8 0x593dd7
     |████████████████████████████████| 1156.7 MB 1.1 MB/s eta 0:00:01tcmalloc: large alloc 1445945344 bytes == 0xd3ad6000 @  0x7ff0d4023615 0x592b76 0x4df71e 0x59afff 0x515655 0x549576 0x593fce 0x511e2c 0x549576 0x593fce 0x511e2c 0x549576 0x593fce 0x511e2c 0x549576 0x593fce 0x511e2c 0x549576 0x593fce 0x511e2c 0x593dd7 0x511e2c 0x549576 0x593fce 0x548ae9 0x5127f1 0x549576 0x593fce 0x548ae9 0x5127f1 0x549576
     |████████████████████████████████| 1156.8 MB 14 kB/s 
Collecting torchaudio==0.7.2
  Downloading torchaudio-0.7.2-cp37-cp37m-manylinux1_x86_64.whl (7.6 MB)
     |████████████████████████████████| 7.6 MB 17.6 MB/s 
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch==1.7.1+cu110) (4.1.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torch==1.7.1+cu110) (1.21.6)
Installing collected packages: torch, torchaudio
  Attempting uninstall: torch
    Found existing installation: torch 1.12.1+cu113
    Uninstalling torch-1.12.1+cu113:
      Successfully uninstalled torch-1.12.1+cu113
  Attempting uninstall: torchaudio
    Found existing installation: torchaudio 0.12.1+cu113
    Uninstalling torchaudio-0.12.1+cu113:
      Successfully uninstalled torchaudio-0.12.1+cu113
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.13.1+cu113 requires torch==1.12.1, but you have torch 1.7.1+cu110 which is incompatible.
torchtext 0.13.1 requires torch==1.12.1, but you have torch 1.7.1+cu110 which is incompatible.
Successfully installed torch-1.7.1+cu110 torchaudio-0.7.2
Install k2
! pip install k2
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting k2
  Downloading k2-1.19-py37-none-any.whl (72.8 MB)
     |████████████████████████████████| 72.8 MB 161 kB/s 
Requirement already satisfied: graphviz in /usr/local/lib/python3.7/dist-packages (from k2) (0.10.1)
Requirement already satisfied: torch==1.7.1 in /usr/local/lib/python3.7/dist-packages (from k2) (1.7.1+cu110)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torch==1.7.1->k2) (1.21.6)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch==1.7.1->k2) (4.1.1)
Installing collected packages: k2
Successfully installed k2-1.19
Check that k2 was installed successfully:

! python3 -m k2.version
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.7/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/usr/local/lib/python3.7/dist-packages/k2/__init__.py", line 19, in <module>
    f"k2 was built using CUDA {k2_torch_cuda_version}\n"
ImportError: k2 was built using CUDA 10.1
But you are using CUDA 11.0 to run it.

marcoyang1998 commented 2 years ago

This happens because k2 and PyTorch were built with different CUDA versions: k2 with 10.1 and PyTorch with 11.0. You can try either of the following:

  1. Replace pip install torch==1.7.1+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html with pip install torch==1.7.1+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html, so that PyTorch also uses CUDA 10.1, or
  2. Replace pip install k2 with pip install k2==1.19.dev20220831+cuda11.0.torch1.7.1 -f http://k2-fsa.org/nightly/ --trusted-host k2-fsa.org, so that k2 is also built for CUDA 11.0.
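
The two options above both come down to making the CUDA tags agree. A minimal sketch of that check follows; the helper names are hypothetical and not part of k2 or PyTorch, and it assumes the PyTorch version string carries a +cuXYZ wheel tag as in the install log above (in a notebook, torch.__version__ supplies that string).

```python
import re


def cuda_from_torch_version(torch_version):
    """Return the CUDA version encoded in a wheel tag such as '1.7.1+cu110'.

    Hypothetical helper. Returns None for untagged or CPU-only builds.
    """
    match = re.search(r"\+cu(\d+)$", torch_version)
    if match is None or len(match.group(1)) < 2:
        return None
    digits = match.group(1)
    # Wheel tags drop the dot: 'cu110' means 11.0 and 'cu101' means 10.1.
    return f"{digits[:-1]}.{digits[-1]}"


def cuda_versions_match(torch_version, k2_cuda):
    """True when the PyTorch wheel tag and the k2 build use the same CUDA."""
    return cuda_from_torch_version(torch_version) == k2_cuda


# The combination from this issue, then the combination from option 1.
print(cuda_versions_match("1.7.1+cu110", "10.1"))
print(cuda_versions_match("1.7.1+cu101", "10.1"))
```

Running a check like this before pip install k2 would flag the mismatch earlier than the ImportError does.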

danpovey commented 2 years ago

thanks!

danpovey commented 2 years ago

(was my wife's question. closing the issue).

npovey commented 2 years ago

The CTC decoding stage fails because lhotse is not installed: ModuleNotFoundError: No module named 'lhotse'

  1. Added ! pip install lhotse, and then got the following error:

command:

! cd icefall/egs/librispeech/ASR && \
    PYTHONPATH=/content/icefall python3 ./conformer_ctc/pretrained.py \
      --method ctc-decoding \
      --checkpoint ./tmp/icefall_asr_librispeech_conformer_ctc/exp/pretrained.pt \
      --lang-dir ./tmp/icefall_asr_librispeech_conformer_ctc/data/lang_bpe \
      ./tmp/icefall_asr_librispeech_conformer_ctc/test_wavs/1089-134686-0001.flac \
      ./tmp/icefall_asr_librispeech_conformer_ctc/test_wavs/1221-135766-0001.flac \
      ./tmp/icefall_asr_librispeech_conformer_ctc/test_wavs/1221-135766-0002.flac

output:

/usr/local/lib/python3.7/dist-packages/torchaudio/backend/utils.py:54: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
  '"sox" backend is being deprecated. '
usage: pretrained.py [-h] --checkpoint CHECKPOINT [--words-file WORDS_FILE]
                     [--HLG HLG] [--bpe-model BPE_MODEL] [--method METHOD]
                     [--G G] [--num-paths NUM_PATHS]
                     [--ngram-lm-scale NGRAM_LM_SCALE]
                     [--attention-decoder-scale ATTENTION_DECODER_SCALE]
                     [--nbest-scale NBEST_SCALE] [--sos-id SOS_ID]
                     [--num-classes NUM_CLASSES] [--eos-id EOS_ID]
                     sound_files [sound_files ...]
pretrained.py: error: unrecognized arguments: --lang-dir

  2. Changed --lang-dir to --bpe-model.
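
The "unrecognized arguments" error comes from the script's argument parser, which no longer defines --lang-dir. A minimal sketch of that behaviour, assuming the script uses Python's argparse (the usage message has that format); the parser below is a hypothetical reduction that reproduces only four of the arguments listed in the usage message, and the file names are placeholders:

```python
import argparse


def build_parser():
    # Hypothetical reduction of pretrained.py's parser: only four of the
    # arguments shown in the usage message are reproduced here.
    parser = argparse.ArgumentParser(prog="pretrained.py")
    parser.add_argument("--checkpoint", required=True)
    parser.add_argument("--bpe-model")
    parser.add_argument("--method")
    parser.add_argument("sound_files", nargs="+")
    return parser


def unknown_flags(argv):
    """Return the arguments that the parser does not recognize."""
    _, extras = build_parser().parse_known_args(argv)
    return extras


# Placeholder file names, for illustration only.
old_style = ["--checkpoint", "pretrained.pt", "--lang-dir", "data/lang_bpe", "a.flac"]
new_style = ["--checkpoint", "pretrained.pt", "--bpe-model", "bpe.model", "a.flac"]

print(unknown_flags(old_style))
print(unknown_flags(new_style))
```

Only the flag itself is reported as unknown, which matches the error above: the path that followed --lang-dir is not listed as unrecognized.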

After the two fixes above, I now get this error:

/usr/local/lib/python3.7/dist-packages/torchaudio/backend/utils.py:54: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
  '"sox" backend is being deprecated. '
2022-09-09 22:55:53,500 INFO [pretrained.py:259] {'sample_rate': 16000, 'subsampling_factor': 4, 'vgg_frontend': False, 'use_feat_batchnorm': True, 'feature_dim': 80, 'nhead': 8, 'attention_dim': 512, 'num_decoder_layers': 0, 'search_beam': 20, 'output_beam': 8, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './tmp/icefall_asr_librispeech_conformer_ctc/exp/pretrained.pt', 'words_file': None, 'HLG': None, 'bpe_model': './tmp/icefall_asr_librispeech_conformer_ctc/data/lang_bpe', 'method': 'ctc-decoding', 'G': None, 'num_paths': 100, 'ngram_lm_scale': 1.3, 'attention_decoder_scale': 1.2, 'nbest_scale': 0.5, 'sos_id': 1, 'num_classes': 500, 'eos_id': 1, 'sound_files': ['./tmp/icefall_asr_librispeech_conformer_ctc/test_wavs/1089-134686-0001.flac', './tmp/icefall_asr_librispeech_conformer_ctc/test_wavs/1221-135766-0001.flac', './tmp/icefall_asr_librispeech_conformer_ctc/test_wavs/1221-135766-0002.flac']}
2022-09-09 22:55:53,533 INFO [pretrained.py:265] device: cuda:0
2022-09-09 22:55:53,533 INFO [pretrained.py:267] Creating model
Traceback (most recent call last):
  File "./conformer_ctc/pretrained.py", line 435, in <module>
    main()
  File "./conformer_ctc/pretrained.py", line 280, in main
    model.load_state_dict(checkpoint["model"], strict=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Conformer:
    size mismatch for encoder_output_layer.1.weight: copying a param with shape torch.Size([5000, 512]) from checkpoint, the shape in current model is torch.Size([500, 512]).
    size mismatch for encoder_output_layer.1.bias: copying a param with shape torch.Size([5000]) from checkpoint, the shape in current model is torch.Size([500]).

All decoding scripts throw the same "size mismatch" error.

csukuangfj commented 2 years ago

For the size mismatch error, which pretrained model are you using?

csukuangfj commented 2 years ago

size mismatch for encoder_output_layer.1.bias: copying a param with shape torch.Size([5000]) from checkpoint, the shape in current model is torch.Size([500]).

It shows that the checkpoint has a vocab size of 5000, while the model being created has a vocab size of 500 (num_classes is 500 in the log above).


Changed --lang-dir to --bpe-model

Please clarify which bpe.model you are using. If you use a pre-trained model with a vocab size of 500, please use the bpe.model from data/lang_bpe_500, or the bpe.model that you downloaded from Hugging Face together with the pretrained model.
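
To see which parameters disagree before calling load_state_dict, the shapes on both sides can be compared directly. The sketch below is a minimal illustration with a hypothetical helper name; it works on plain name-to-shape dictionaries, filled in here with the shapes from the traceback above. With PyTorch, such a dictionary could be built as {k: tuple(v.shape) for k, v in state_dict.items()}.

```python
def find_size_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing shapes.

    Hypothetical helper. Parameters missing from the model are ignored.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the traceback in this thread.
checkpoint_shapes = {
    "encoder_output_layer.1.weight": (5000, 512),
    "encoder_output_layer.1.bias": (5000,),
}
model_shapes = {
    "encoder_output_layer.1.weight": (500, 512),
    "encoder_output_layer.1.bias": (500,),
}

for name, (ckpt, model) in find_size_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt}, model {model}")
```

The first dimension of the output layer is the vocab size, so it should agree with both --num-classes and the bpe.model in use.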

csukuangfj commented 2 years ago

I just updated the colab notebook at https://colab.research.google.com/drive/1huyupXAcHsUrKaWfI83iMEJ6J0Nh0213?usp=sharing

@npovey Could you try it again?

The colab notebook was quite outdated.

npovey commented 2 years ago

@csukuangfj It all works. Thanks!