facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Hubert Inference Error: TypeError: object of type 'NoneType' has no len() #4100

Closed EmreOzkose closed 2 years ago

EmreOzkose commented 2 years ago

What is your question?

How can I use a HuBERT checkpoint for inference?

I trained a HuBERT model on Arabic data. I followed the instructions in the simple_kmeans folder to extract features and .km label files (a rough sketch of those commands follows the training command below). Then I pretrained the HuBERT model with this command:

python fairseq_cli/hydra_train.py \
  --config-dir /path/to/fairseq_latest/examples/hubert/config/pretrain \
  --config-name hubert_base_librispeech \
  task.data=/path/to/experiments/exp3_arabic/tsv \
  task.label_dir=/path/to/experiments/exp3_arabic/km \
  task.labels='["km"]' model.label_rate=100
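
(For reference, the simple_kmeans step was roughly the following three scripts; the shard counts, cluster count, and paths are placeholders, so check examples/hubert/simple_kmeans for the exact arguments.)

# 1) dump MFCC features, shard by shard
python examples/hubert/simple_kmeans/dump_mfcc_feature.py ${tsv_dir} ${split} ${nshard} ${rank} ${feat_dir}

# 2) fit k-means on (a subset of) the dumped features
python examples/hubert/simple_kmeans/learn_kmeans.py ${feat_dir} ${split} ${nshard} ${km_path} ${n_clusters} --percent 0.1

# 3) write the .km label files used as pretraining targets
python examples/hubert/simple_kmeans/dump_km_label.py ${feat_dir} ${split} ${km_path} ${nshard} ${rank} ${lab_dir}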

Now I am trying to use the checkpoint for inference with this command:

python examples/speech_recognition/new/infer.py \
  --config-dir /path/to/fairseq_latest/examples/hubert/config/decode \
  --config-name infer_viterbi \
  task.data=/path/to/experiments/exp3_arabic/tsv/sub_42 \
  task.normalize=false \
  decoding.results_path=/path/to/experiments/exp3_arabic/decoding \
  common_eval.path=/path/to/fairseq_latest/None/checkpoints/checkpoint_266_400000.pt \
  dataset.gen_subset=valid

However, I am facing this error:

Traceback (most recent call last):                                                                                                                                                                                 
  File "examples/speech_recognition/new/infer.py", line 432, in hydra_main
    distributed_utils.call_main(cfg, main)
  File "/path/to/fairseq_latest/fairseq/distributed/utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "examples/speech_recognition/new/infer.py", line 383, in main
    processor.process_sample(sample)
  File "examples/speech_recognition/new/infer.py", line 305, in process_sample
    hypos = self.task.inference_step(
  File "/path/to/fairseq_latest/fairseq/tasks/fairseq_task.py", line 537, in inference_step
    return generator.generate(
  File "/path/to/fairseq_latest/examples/speech_recognition/new/decoders/base_decoder.py", line 37, in generate
    emissions = self.get_emissions(models, encoder_input)
  File "/path/to/fairseq_latest/examples/speech_recognition/new/decoders/base_decoder.py", line 47, in get_emissions
    encoder_out = model(**encoder_input)
  File "/path/to//miniconda3/envs/fairseq/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/path/to/fairseq_latest/fairseq/models/hubert/hubert.py", line 467, in forward
    proj_x_m_list = proj_x_m.chunk(len(target_list), dim=-1)
TypeError: object of type 'NoneType' has no len()

target_list comes in as None. How can I solve this?
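
From a quick look at fairseq/models/hubert/hubert.py, the pretraining forward() computes masked-prediction logits and therefore needs target_list; feature extraction does not go through that path. A minimal sketch of calling the pretraining checkpoint with features_only=True instead (checkpoint path hypothetical):

import torch
import fairseq

# hypothetical checkpoint path, for illustration only
models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task(
    ["checkpoints/checkpoint_266_400000.pt"]
)
model = models[0].eval()

wav = torch.randn(1, 16000)  # stand-in for one second of 16 kHz audio
with torch.no_grad():
    # features_only=True skips the masked-prediction heads, so no target_list is needed
    out = model(source=wav, padding_mask=None, mask=False, features_only=True)
print(out["x"].shape)  # (batch, frames, hidden_dim)

Note, though, that examples/speech_recognition/new/infer.py expects a fine-tuned ASR checkpoint rather than a pretraining one (see the last comment in this thread).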

My exp dir: [screenshot from 2021-12-28 14-42-48, experiment directory listing]

I don't know if it is exactly the correct way, but I copied the valid.km file to valid.ltr.

What's your environment?

fairseq : 1.0.0a0+0dfd6b6

[pip3] numpy==1.21.4
[pip3] torch==1.10.0
[pip3] torchaudio==0.10.0
[pip3] torchvision==0.9.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.1.1 h6406543_8 conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2020.2 256
[conda] mkl-service 2.3.0 py38h1e0a361_2 conda-forge
[conda] mkl_fft 1.2.0 py38hab2c0dc_1 conda-forge
[conda] mkl_random 1.2.0 py38hc5bc63f_1 conda-forge
[conda] numpy 1.21.4 pypi_0 pypi
[conda] torch 1.10.0 pypi_0 pypi
[conda] torchaudio 0.10.0 pypi_0 pypi
[conda] torchvision 0.9.0 py38_cu111 pytorch

EmreOzkose commented 2 years ago

Note that I also changed the arg decoding.exp_dir to decoding.results_path because of this error:

Traceback (most recent call last):
  File "examples/speech_recognition/new/infer.py", line 471, in <module>
    cli_main()
  File "examples/speech_recognition/new/infer.py", line 467, in cli_main
    hydra_main()  # pylint: disable=no-value-for-parameter
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 100, in run
    cfg = self.compose_config(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 507, in compose_config
    cfg = self.config_loader.load_configuration(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 151, in load_configuration
    return self._load_configuration(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 277, in _load_configuration
    ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 515, in _apply_overrides_to_config
    raise ConfigCompositionException(
hydra.errors.ConfigCompositionException: Could not override 'decoding.exp_dir'.
To append to your config use +decoding.exp_dir=/path/to/experiments/exp3_arabic/decoding
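
In other words, Hydra only overrides keys that already exist in the composed config; a leading + appends a new key instead. For example (paths hypothetical):

# fails if the decode config defines no decoding.exp_dir key:
python examples/speech_recognition/new/infer.py ... decoding.exp_dir=/path/to/decoding
# appends the key instead of overriding an existing one:
python examples/speech_recognition/new/infer.py ... +decoding.exp_dir=/path/to/decoding
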
EmreOzkose commented 2 years ago

Is target_list equal to target in the variable sample? If yes, when I pass target as target_list, I get this error:

Traceback (most recent call last):                                                                                                                                                                                 
  File "examples/speech_recognition/new/infer.py", line 474, in <module>
    cli_main()
  File "examples/speech_recognition/new/infer.py", line 470, in cli_main
    hydra_main()  # pylint: disable=no-value-for-parameter
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/hydra/core/utils.py", line 129, in run_job
    ret.return_value = task_function(task_cfg)
  File "examples/speech_recognition/new/infer.py", line 435, in hydra_main
    distributed_utils.call_main(cfg, main)
  File "/path/to/fairseq_latest/fairseq/distributed/utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "examples/speech_recognition/new/infer.py", line 386, in main
    processor.process_sample(sample)
  File "examples/speech_recognition/new/infer.py", line 305, in process_sample
    hypos = self.task.inference_step(
  File "/path/to/fairseq_latest/fairseq/tasks/fairseq_task.py", line 537, in inference_step
    return generator.generate(
  File "/path/to/fairseq_latest/examples/speech_recognition/new/decoders/base_decoder.py", line 39, in generate
    emissions = self.get_emissions(models, encoder_input, sample["target"])
  File "/path/to/fairseq_latest/examples/speech_recognition/new/decoders/base_decoder.py", line 53, in get_emissions
    encoder_out = model(**encoder_input, target_list=target_list)
  File "/path/to/miniconda3/envs/fairseq/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/path/to/fairseq_latest/fairseq/models/hubert/hubert.py", line 416, in forward
    features, target_list = self.forward_targets(features, target_list)
  File "/path/to/fairseq_latest/fairseq/models/hubert/hubert.py", line 379, in forward_targets
    targ_tsz = min([t.size(1) for t in target_list])
  File "/path/to/fairseq_latest/fairseq/models/hubert/hubert.py", line 379, in <listcomp>
    targ_tsz = min([t.size(1) for t in target_list])
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
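
If I read hubert.py right, forward_targets() expects target_list to be a list of (batch, time) tensors, one per label type, whereas sample["target"] is a single tensor; iterating over a 2-D tensor yields its 1-D rows, hence the IndexError. A tiny illustration:

import torch

target = torch.zeros(2, 50, dtype=torch.long)  # like sample["target"]: (batch, time)
# min(t.size(1) for t in target)               # IndexError: the rows are 1-D
target_list = [target]                         # one entry per label type
print(min(t.size(1) for t in target_list))     # 50
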
EmreOzkose commented 2 years ago

In addition to that, I tried to load the checkpoint and run it as follows:

import torch
import fairseq
import soundfile as sf

# Load the pretrained model, config, and task from the checkpoint
ckpt_path = "checkpoints/checkpoint_266_400000.pt"
models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([ckpt_path])
model = models[0]
model.eval()

# Read a 16 kHz waveform and add a batch dimension
wav_path = "01695.wav"
wav_input_16khz, _ = sf.read(wav_path)
wav_input_16khz = torch.Tensor(wav_input_16khz).unsqueeze(0)

# Forward pass through the pretraining model, without a target_list
output = model(wav_input_16khz)

The model is loaded and the forward pass runs without error. However, the output is:

{'logit_m_list': [None],
 'logit_u_list': [None],
 'padding_mask': None,
 'features_pen': tensor(7.0502e-11, grad_fn=<MeanBackward0>)}
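
If I read fairseq/models/hubert/hubert.py correctly, the logit lists are only populated when a target_list is passed during pretraining, so the [None] entries alone do not prove the model is broken. Continuing from the snippet above, a sketch of pulling hidden representations instead (extract_features wraps forward(..., features_only=True)):

with torch.no_grad():
    # mask=False disables the pretraining span masking
    feats, padding_mask = model.extract_features(
        wav_input_16khz, padding_mask=None, mask=False
    )
print(feats.shape)  # (batch, frames, hidden_dim)
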
EmreOzkose commented 2 years ago

Is it possible to conclude that the model was not trained well? I would expect the model at least to generate some (even random) outputs...

My loss graph: [screenshot from 2021-12-28 17-08-48, training loss curve]

stale[bot] commented 2 years ago

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!

stale[bot] commented 2 years ago

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!

mo979 commented 1 year ago

The model you used in the decode step comes from the pre-training step. After I fine-tuned the pre-trained .pt model and used the fine-tuned model in the decode step, it ran successfully.
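
For anyone else landing here: the fine-tuning step looks roughly like this (paths are placeholders, the exact recipe is in examples/hubert/README.md, and the .ltr files must contain real letter transcriptions, not copied .km labels):

python fairseq_cli/hydra_train.py \
  --config-dir /path/to/fairseq_latest/examples/hubert/config/finetune \
  --config-name base_10h \
  task.data=/path/to/tsv task.label_dir=/path/to/ltr \
  model.w2v_path=/path/to/checkpoints/checkpoint_266_400000.pt

The decode command then points common_eval.path at the resulting fine-tuned checkpoint.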

EmreOzkose commented 1 year ago

Actually, I don't even remember when I faced this problem :), but thank you for your answer. I hope it will help other people.