Tele-AI / TeleSpeech-ASR

388 stars 37 forks

decode.sh fails with an error #3

Open lq0104 opened 1 month ago

lq0104 commented 1 month ago

Thanks for open-sourcing this. Is decode.sh meant for ASR inference? Running it currently produces an error:

root@304:/home/code/TeleSpeech-ASR/data2vec_dialect# bash run_scripts/decode.sh
/home/code/fairseq/fairseq/tasks/multires_hubert_pretraining.py:154: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  dictionaries = [ (Dictionary.load(f"{label_dir}/dict.{label}.txt") if label is not "" else None ) for label in self.cfg.labels]
INFO:__main__:import user_dir: /home/code/TeleSpeech-ASR/data2vec_dialect
ERROR:__main__:Failed to import given user_module: /home/code/TeleSpeech-ASR/data2vec_dialect
WARNING:__main__:Failed to get config name from hydra args
[2024-05-27 03:17:08,000][__main__][INFO] - /home/model/audio/TeleSpeech-ASR/large.pt
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/opt/conda/lib/python3.10/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/opt/conda/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/opt/conda/lib/python3.10/site-packages/hydra/core/utils.py", line 129, in run_job
    ret.return_value = task_function(task_cfg)
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 464, in hydra_main
    distributed_utils.call_main(cfg, main)
  File "/home/code/fairseq/fairseq/distributed/utils.py", line 404, in call_main
    main(cfg, **kwargs)
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 412, in main
    with InferenceProcessor(cfg) as processor:
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 107, in __init__
    models, saved_cfg = self.load_model_ensemble()
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 234, in load_model_ensemble
    models, saved_cfg = checkpoint_utils.load_model_ensemble(
  File "/home/code/fairseq/fairseq/checkpoint_utils.py", line 392, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(
  File "/home/code/fairseq/fairseq/checkpoint_utils.py", line 502, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/tasks/audio_finetuning.py", line 260, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/tasks/audio_pretraining.py", line 227, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/home/code/fairseq/fairseq/tasks/fairseq_task.py", line 355, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "/home/code/fairseq/fairseq/models/__init__.py", line 102, in build_model
    "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
KeyError: "'_name'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 517, in <module>
    cli_main()
  File "/home/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 513, in cli_main
    hydra_main()  # pylint: disable=no-value-for-parameter
  File "/opt/conda/lib/python3.10/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/opt/conda/lib/python3.10/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/opt/conda/lib/python3.10/site-packages/hydra/_internal/utils.py", line 267, in run_and_report
    print_exception(etype=None, value=ex, tb=final_tb)  # type: ignore
TypeError: print_exception() got an unexpected keyword argument 'etype'

Could this be related to the fairseq version? Here is what I did:

$ git clone https://github.com/pytorch/fairseq
$ cd fairseq
$ pip install --editable ./

$ pip install kaldiio

$ bash run_scripts/decode.sh

The contents of decode.sh:

. ./path.sh || exit 1

data=/home/data/audio
model=/home/model/audio/TeleSpeech-ASR/large.pt
result_path=./decode_result
python infer.py \
    --config-dir config \
    --config-name infer \
    task=spec_finetuning \
    task.data=${data} \
    task.normalize=false \
    common.user_dir=/home/code/TeleSpeech-ASR/data2vec_dialect \
    common_eval.path=${model} \
    common_eval.results_path=${result_path} \
    common_eval.quiet=false \
    dataset.gen_subset=train

/home/data/audio contains a number of audio files; large.pt is from https://huggingface.co/Tele-AI/TeleSpeech-ASR1.0/blob/main/large.pt. Could you please take a look? Thanks a lot!

zorionginn commented 1 month ago

Thanks for open-sourcing this. I ran into the same problem. Has anyone solved it?

TTTdas commented 1 month ago

Hello. The currently released base.pt and large.pt are unsupervised pre-trained models; they do not support direct decoding and need to be fine-tuned on your own data or used as feature extractors. We have now uploaded a model fine-tuned on the open-source KeSpeech dataset, which can be decoded directly via decode.sh: https://huggingface.co/Tele-AI/TeleSpeech-ASR1.0/tree/main

zorionginn commented 1 month ago

👍👍👍

Christmas-Wong commented 1 month ago
  1. I switched to the latest fine-tuned model and still hit a similar problem.
  2. decode.sh is definitely broken: it only passes in the .pt model file, no dictionary file, and config/infer.yml has no dictionary setting either.
    
root@22fa38c5cde3:/data/code/TeleSpeech-ASR/data2vec_dialect# cat run_scripts/decode.sh
. ./path.sh || exit 1

data=/data/data/telespeech/tele
model=/data/models/TeleSpeech-ASR1.0/finetune_large_kespeech.pt
result_path=/data/output/tele
python3 infer.py \
    --config-dir config \
    --config-name infer \
    task=spec_finetuning \
    task.data=${data} \
    task.normalize=false \
    common.user_dir=/data/code/TeleSpeech-ASR/data2vec_dialect \
    common_eval.path=${model} \
    common_eval.results_path=${result_path} \
    common_eval.quiet=false \
    dataset.gen_subset=data
root@22fa38c5cde3:/data/code/TeleSpeech-ASR/data2vec_dialect# bash run_scripts/decode.sh
2024-05-29 15:33:22 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2024-05-29 15:33:22 | INFO | __main__ | import user_dir: /data/code/TeleSpeech-ASR/data2vec_dialect
2024-05-29 15:33:23 | ERROR | __main__ | Failed to import given user_module: /data/code/TeleSpeech-ASR/data2vec_dialect
2024-05-29 15:33:23 | WARNING | __main__ | Failed to get config name from hydra args
[2024-05-29 15:33:24,153][__main__][INFO] - /data/models/TeleSpeech-ASR1.0/finetune_large_kespeech.pt
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/usr/local/lib/python3.10/dist-packages/hydra/core/utils.py", line 129, in run_job
    ret.return_value = task_function(task_cfg)
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 464, in hydra_main
    distributed_utils.call_main(cfg, main)
  File "/data/code/fairseq/fairseq/distributed/utils.py", line 404, in call_main
    main(cfg, **kwargs)
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 412, in main
    with InferenceProcessor(cfg) as processor:
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 107, in __init__
    models, saved_cfg = self.load_model_ensemble()
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 234, in load_model_ensemble
    models, saved_cfg = checkpoint_utils.load_model_ensemble(
  File "/data/code/fairseq/fairseq/checkpoint_utils.py", line 392, in load_model_ensemble
    ensemble, args, _task = load_model_ensemble_and_task(
  File "/data/code/fairseq/fairseq/checkpoint_utils.py", line 502, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/tasks/audio_finetuning.py", line 260, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/tasks/audio_pretraining.py", line 227, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/data/code/fairseq/fairseq/tasks/fairseq_task.py", line 355, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "/data/code/fairseq/fairseq/models/__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "/data/code/fairseq/fairseq/models/wav2vec/wav2vec2_asr.py", line 224, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
  File "/data/code/fairseq/fairseq/models/wav2vec/wav2vec2_asr.py", line 466, in __init__
    model = task.build_model(w2v_args.model, from_checkpoint=True)
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/tasks/audio_pretraining.py", line 227, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "/data/code/fairseq/fairseq/tasks/fairseq_task.py", line 355, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "/data/code/fairseq/fairseq/models/__init__.py", line 102, in build_model
    "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
KeyError: "'_name'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 517, in <module>
    cli_main()
  File "/data/code/TeleSpeech-ASR/data2vec_dialect/infer.py", line 513, in cli_main
    hydra_main()  # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.10/dist-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/usr/local/lib/python3.10/dist-packages/hydra/_internal/utils.py", line 267, in run_and_report
    print_exception(etype=None, value=ex, tb=final_tb)  # type: ignore
TypeError: print_exception() got an unexpected keyword argument 'etype'
root@22fa38c5cde3:/data/code/TeleSpeech-ASR/data2vec_dialect#

lostsollar commented 1 month ago
  1. I switched to the latest fine-tuned model and still hit a similar problem.
  2. decode.sh is definitely broken: it only passes in the .pt model file, no dictionary file, and config/infer.yml has no dictionary setting either.

(decode.sh and full traceback quoted above)

+1, running into the same problem

lq0104 commented 1 month ago

+1

TTTdas commented 1 month ago
  1. I switched to the latest fine-tuned model and still hit a similar problem.
  2. decode.sh is definitely broken: it only passes in the .pt model file, no dictionary file, and config/infer.yml has no dictionary setting either.

(decode.sh and full traceback quoted above)

Hello. Judging from the error output, the line 2024-05-29 15:33:23 | ERROR | __main__ | Failed to import given user_module: /data/code/TeleSpeech-ASR/data2vec_dialect shows that the user_dir was not imported correctly, so fairseq's registration mechanism cannot find the custom model class used during fine-tuning, which triggers the error in build_model. Please make sure fairseq is installed correctly and that path.sh under data2vec_dialect is set up properly.
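For anyone chasing the Failed to import given user_module error: fairseq resolves common.user_dir by importing that directory as an ordinary Python package, so a missing dependency or a wrong PYTHONPATH inside that import silently hides every custom task and model registered there. A rough sketch of the mechanism (the directory and variable names here are illustrative, not fairseq's actual internals):

```python
import importlib
import os
import sys
import tempfile

def import_user_dir(user_dir: str) -> None:
    """Roughly how fairseq's import_user_module resolves common.user_dir:
    the parent directory goes on sys.path and the directory itself is
    imported as a package, which runs its __init__.py (where the
    register_model / register_task decorators live). Any exception during
    that import hides every custom model and task."""
    user_dir = os.path.abspath(user_dir)
    parent, module_name = os.path.split(user_dir)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    importlib.invalidate_caches()
    importlib.import_module(module_name)

# Demo with a throwaway package standing in for data2vec_dialect.
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "user_dir_demo")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("REGISTERED_MODELS = ['demo_ctc_model']\n")
    import_user_dir(pkg)

registered = sys.modules["user_dir_demo"].REGISTERED_MODELS
print(registered)
```

If the package's __init__.py raised instead (say, an ImportError for timm), the module would never land in sys.modules and its registrations would be invisible, which is exactly the situation the log line reports.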

Christmas-Wong commented 1 month ago
(quotes the decode.sh, traceback, and TTTdas's reply above)

  1. The user_dir import failure isn't just me (I can query the fairseq version just fine); the issue author hit it too. If I append /models to user_dir in decode.sh, that error goes away (but the code still doesn't run).
  2. Then it errors with Could not load task/spec_finetuning (no such task option).
  3. If I pass in a directory instead of a .pt file, it errors with IsADirectoryError: [Errno 21] Is a directory: '/data/models/TeleSpeech-ASR1.0/'
TTTdas commented 1 month ago
  1. The user_dir import failure isn't just me (I can query the fairseq version just fine); the issue author hit it too. If I append /models to user_dir in decode.sh, that error goes away (but the code still doesn't run).
  2. Then it errors with Could not load task/spec_finetuning (no such task option).
  3. If I pass in a directory instead of a .pt file, it errors with IsADirectoryError: [Errno 21] Is a directory: '/data/models/TeleSpeech-ASR1.0/'
  1. Could you share the contents of path.sh and the PYTHONPATH in your current environment?
  2. user_dir must point to data2vec_dialect. If it points to data2vec_dialect/model, the tasks under the directory are not imported, and model loading falls back to looking up the task in upstream fairseq, hence the could not load task/spec_finetuning error; likewise, pointing it at task makes the custom models under model unusable. This part works the same way as in upstream fairseq.
  3. The model you pass in must be a .pt file, not a directory. This model does not need a dict passed in explicitly; if you add print(self.generator.tgt_dict.symbols) after self.generator is constructed around line 141 of infer.py, you can see the saved dict. The error above happens while the code is being constructed and is unrelated to the model/dict you supply.

I have updated the infer.py code to re-raise the ImportError, so it now aborts right after reporting Failed to import given user_module instead of continuing, which makes the real problem easier to locate.

Also, data preparation requires extracting 40-dimensional MFCC features; this model does not accept raw audio input.
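The 40-dimensional MFCC features mentioned above are normally computed with kaldi-style tooling; purely to illustrate the shapes involved, here is a from-scratch numpy sketch. The 25 ms window / 10 ms hop at 16 kHz are typical defaults and an assumption on my part, not necessarily the repo's exact configuration:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc40(signal, sr=16000, win=400, hop=160, n_fft=512, n_mels=40):
    """Return (num_frames, 40) MFCC-like features from a 1-D waveform."""
    # Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + max(0, (len(signal) - win) // hop)
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(win)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Triangular mel filterbank with n_mels bands.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fbank[m - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[m - 1, k] = (r - k) / max(r - c, 1)
    logmel = np.log(power @ fbank.T + 1e-10)
    # DCT-II over the mel axis gives cepstral coefficients; keep all 40.
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n[:, None] + 0.5) * n[None, :])
    return logmel @ dct

feats = mfcc40(np.random.randn(16000))  # 1 s of noise at 16 kHz
print(feats.shape)
```

One second of 16 kHz audio yields 98 frames of 40 coefficients here; the repo's own feature pipeline (e.g. via kaldiio) should be preferred for real data so that train and inference features match.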

Christmas-Wong commented 1 month ago
(quotes TTTdas's reply above, including the request for path.sh and PYTHONPATH)

PYTHONPATH="/data/code/fairseq:$PWD:$PYTHONPATH"
export PYTHONPATH
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
Christmas-Wong commented 1 month ago
(quotes TTTdas's reply above)

Solved: the timm package was not installed. I suggest changing the infer code to print the actual error when importing user_dir fails (I used print; logger would be better):

import logging
import re
import types

logger = logging.getLogger(__name__)

def cli_main() -> None:
    try:
        from hydra._internal.utils import (
            get_args,
        )  # pylint: disable=import-outside-toplevel

        cfg_name = get_args().config_name or "infer"
        config = get_args()
        user_dir_override = next((item for item in config.overrides if item.startswith("common.user_dir")), None)
        if user_dir_override:
            user_dir = re.sub(r"common.user_dir=", "", user_dir_override)
            logger.info(f"import user_dir: {user_dir}")
            args = types.SimpleNamespace()
            args.user_dir = user_dir
            try:
                from fairseq.utils import import_user_module
                import_user_module(args)
            except Exception as e:
                logger.error(f"Failed to import given user_module: {user_dir}")
                print(e)
                raise e

Also, the finetune_large_kespeech model seems to perform quite poorly; it is basically unusable.

zorionginn commented 1 month ago
(quotes Christmas-Wong's timm fix and infer.py snippet above)

Can you decode any content? All my decoding results come out empty.

TTTdas commented 1 month ago

Thanks for the heads-up. Which dataset are you testing on? This model requires 40-dimensional features extracted from 16 kHz audio.

Christmas-Wong commented 1 month ago

Thanks for the heads-up. Which dataset are you testing on? This model requires 40-dimensional features extracted from 16 kHz audio.

Our own data. Is it only effective on 16 kHz audio? Mine is 44 kHz; it ran through, but the results are poor. I'll keep digging.

Christmas-Wong commented 1 month ago
(quotes the earlier exchange above)

Yes, I can. Do you get any output from the kaldi feature-extraction step?

TTTdas commented 1 month ago

Thanks for the heads-up. Which dataset are you testing on? This model requires 40-dimensional features extracted from 16 kHz audio.

Our own data. Is it only effective on 16 kHz audio? Mine is 44 kHz; it ran through, but the results are poor. I'll keep digging.

Yes, recognition on both 8 kHz and 44 kHz audio will be poor. You can downsample the audio to 16 kHz with sox, ffmpeg, librosa, or similar tools.
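In practice the downsampling would be done with one of those tools, e.g. sox in.wav -r 16000 out.wav or ffmpeg -i in.wav -ar 16000 out.wav. Just to show what resampling 44.1 kHz audio to 16 kHz means numerically, here is a crude linear-interpolation sketch (real tools apply a low-pass filter first, which this deliberately skips):

```python
import numpy as np

def resample_linear(x, sr_in, sr_out=16000):
    """Naive linear-interpolation resampler: fine for a demo, but prefer
    sox/ffmpeg/librosa, which low-pass filter before decimating to avoid
    aliasing."""
    n_out = int(round(len(x) * sr_out / sr_in))
    # Positions of the output samples on the input sample grid.
    t_out = np.arange(n_out) * (sr_in / sr_out)
    return np.interp(t_out, np.arange(len(x)), x)

# One second of a 440 Hz tone at 44.1 kHz becomes 16000 samples at 16 kHz.
one_sec_44k = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100.0)
y = resample_linear(one_sec_44k, 44100, 16000)
print(len(y))
```

The duration is unchanged; only the sample count drops, so the 40-dim feature extraction then sees the frame rate the model was trained on.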

zorionginn commented 1 month ago

Is the model based on wav2vec_ctc? I can't find the final linear layer (1024 -> 7535) in the model parameters. Is it decodable because it was fine-tuned?

Christmas-Wong commented 1 month ago

(quotes the downsampling suggestion above)

OK, I'll give it a try.

Christmas-Wong commented 1 month ago

(quotes the downsampling suggestion above)

Sichuan dialect works very well. Cantonese is a bit better than an ordinary ASR system but still unusable. Hunan dialect is also fairly poor. Mandarin was not trained sufficiently and has room for improvement.

TTTdas commented 1 month ago

(quotes the dialect test results above)

Thanks for the test results! Yes: the directly usable open-source model was fine-tuned on the public KeSpeech dataset, which has no labeled Cantonese data, so Cantonese performance is poor. That is why we mainly released the two unsupervised pre-trained models, so that users can fine-tune on whichever dialects they need.

TTTdas commented 1 month ago

Is the model based on wav2vec_ctc? I can't find the final linear layer (1024 -> 7535) in the model parameters. Is it decodable because it was fine-tuned?

The KeSpeech fine-tuned model is based on the wav2vec_ctc architecture, so it can be decoded directly. The two pre-trained models were obtained by unsupervised training, have no final linear layer themselves, and do not support direct decoding.
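To make that architecture point concrete: the unsupervised checkpoints stop at the encoder output, while fine-tuning (wav2vec_ctc style) adds a linear projection to the target dictionary and decodes its per-frame log-probabilities with CTC. A toy shape sketch with random weights, where 1024 and 7535 are just the dimensions mentioned in this thread:

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, V = 50, 1024, 7535  # frames, encoder dim, target dict size

encoder_out = rng.standard_normal((T, D))      # what a pre-trained encoder yields
ctc_head = rng.standard_normal((D, V)) * 0.01  # the linear layer fine-tuning adds

logits = encoder_out @ ctc_head                # (T, V) per-frame scores
# Log-softmax over the vocabulary axis (numerically stable).
m = logits.max(axis=-1, keepdims=True)
log_probs = logits - (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True)))

# Greedy CTC decoding: best token per frame, collapse repeats, drop blank (id 0).
best = log_probs.argmax(axis=-1)
hyp = [t for i, t in enumerate(best) if t != 0 and (i == 0 or t != best[i - 1])]
print(log_probs.shape, len(hyp))
```

Without the ctc_head weights (i.e. with only a pre-trained checkpoint) there is nothing to project the encoder output onto the dictionary, which is why direct decoding of base.pt/large.pt cannot work.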

qazwsx921028 commented 1 month ago

How do I test this model? Is there any example code?