teinhonglo commented 2 years ago

Hi,

I am trying to load the pretrained model (Hubert_large) in an espnet setup, but it fails.

The steps are listed here:

git clone https://huggingface.co/TencentGameMate/chinese-hubert-large hub/s3prl_cache/chinese-hubert-large

Then modify asr_train_config:


num_workers: 8
batch_type: numel
batch_bins: 4000000
accum_grad: 4
max_epoch: 50
patience: none
init: none
best_model_criterion:
-   - valid
    - acc
    - max
keep_nbest_models: 10
freeze_param: [
"frontend.upstream"
]

frontend: s3prl
frontend_conf:
    frontend_conf:
        upstream: hubert_local
        upstream_model_config: "hub/s3prl_cache/chinese-hubert-large/config.json"
        upstream_ckpt: "hub/s3prl_cache/chinese-hubert-large/chinese-hubert-large-fairseq-ckpt.pt"
    multilayer_feature: true

preencoder: linear
preencoder_conf:
    input_size: 1024  # Note: If the upstream is changed, please change this value accordingly.
    output_size: 80

encoder: conformer
encoder_conf:
    output_size: 256
    attention_heads: 4
    linear_units: 2048
    num_blocks: 12
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d2
    normalize_before: true
    macaron_style: true
    pos_enc_layer_type: "rel_pos"
    selfattention_layer_type: "rel_selfattn"
    activation_type: "swish"
    use_cnn_module: true

4. Error Message

Traceback (most recent call last):
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/teinhonglo/espnets/espnet/espnet2/bin/asr_train.py", line 23, in <module>
    main()
  File "/home/teinhonglo/espnets/espnet/espnet2/bin/asr_train.py", line 19, in main
    ASRTask.main(cmd=cmd)
  File "/home/teinhonglo/espnets/espnet/espnet2/tasks/abs_task.py", line 1013, in main
    cls.main_worker(args)
  File "/home/teinhonglo/espnets/espnet/espnet2/tasks/abs_task.py", line 1115, in main_worker
    model = cls.build_model(args=args)
  File "/home/teinhonglo/espnets/espnet/espnet2/tasks/asr.py", line 415, in build_model
    frontend = frontend_class(**args.frontend_conf)
  File "/home/teinhonglo/espnets/espnet/espnet2/asr/frontend/s3prl.py", line 47, in __init__
    self.upstream, self.featurizer = self._get_upstream(frontend_conf)
  File "/home/teinhonglo/espnets/espnet/espnet2/asr/frontend/s3prl.py", line 68, in _get_upstream
    s3prl_upstream = torch.hub.load(
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/torch/hub.py", line 404, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/torch/hub.py", line 433, in _load_local
    model = entry(*args, **kwargs)
  File "/home/teinhonglo/espnets/espnet/tools/s3prl/s3prl/upstream/hubert/hubconf.py", line 27, in hubert_local
    return _UpstreamExpert(ckpt, *args, **kwargs)
  File "/home/teinhonglo/espnets/espnet/tools/s3prl/s3prl/upstream/interfaces.py", line 30, in __call__
    instance = super().__call__(*args, **kwargs)
  File "/home/teinhonglo/espnets/espnet/tools/s3prl/s3prl/upstream/hubert/expert.py", line 42, in __init__
    model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task(
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/fairseq/checkpoint_utils.py", line 421, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/fairseq/checkpoint_utils.py", line 315, in load_checkpoint_to_cpu
    state = torch.load(f, map_location=torch.device("cpu"))
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

Accounting: time=17 threads=1



Any suggestion?
Thanks in advance.

LiuShixing commented 2 years ago

The error seems to come from "model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task"; you can try calling it on its own to check. I used fairseq==0.12.2 and it loaded successfully.

teinhonglo commented 2 years ago

The version of fairseq I am using is also 0.12.2. What do you mean by trying this alone to check?

LiuShixing commented 2 years ago

I mean you can load the checkpoint directly with "fairseq.checkpoint_utils.load_model_ensemble_and_task" to debug.

teinhonglo commented 2 years ago

Thanks for the suggestion.

Here is the result:

>>> import fairseq
>>> fairseq
<module 'fairseq' from '/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/fairseq/__init__.py'>
>>> fairseq.__version__
'0.12.2'
>>> model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task(["hub/s3prl_cache/chinese-hubert-large/chinese-hubert-large-fairseq-ckpt.pt"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/fairseq/checkpoint_utils.py", line 425, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/fairseq/checkpoint_utils.py", line 315, in load_checkpoint_to_cpu
    state = torch.load(f, map_location=torch.device("cpu"))
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/teinhonglo/espnets/espnet/tools/anaconda/envs/espnet/lib/python3.8/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

LiuShixing commented 2 years ago

You can check the "chinese-hubert-large-fairseq-ckpt.pt" file; its size should be 3.5 GB.

teinhonglo commented 2 years ago

Your suggestion is correct. It worked after I re-downloaded the model.

Thank you.
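
The `invalid load key, 'v'` error is consistent with the `.pt` file being a Git LFS pointer rather than the real checkpoint: a pointer is a small text file whose first line begins with `version https://git-lfs.github.com/spec/v1`, so the unpickler fails on the leading `v`. This is an inference from the error and from the re-download fix reported above, not something the thread confirms. A minimal Python sketch of a pre-flight check follows; the helper names `looks_like_lfs_pointer` and `describe_checkpoint` are hypothetical.

```python
from pathlib import Path

# Leading bytes of a Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_MAGIC)) == LFS_MAGIC


def describe_checkpoint(path):
    """Summarize whether a checkpoint file looks fully downloaded."""
    size = Path(path).stat().st_size
    if looks_like_lfs_pointer(path):
        return f"{path}: Git LFS pointer ({size} bytes), real weights not downloaded"
    return f"{path}: {size / 1e9:.2f} GB, not an LFS pointer"
```

Calling `describe_checkpoint("hub/s3prl_cache/chinese-hubert-large/chinese-hubert-large-fairseq-ckpt.pt")` before training would flag a pointer file. If it does, installing Git LFS and running `git lfs pull` inside the cloned repository, or downloading the file directly, should fetch the actual weights.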

abcdbosh commented 1 year ago

I would like to ask which version of s3prl you are using. I have an incompatibility problem with the omegaconf package when installing.

abcdbosh commented 1 year ago

Hello, loading the local .pt with fairseq.checkpoint_utils.load_model_ensemble_and_task succeeds, but stage 10 fails with the following error:

Traceback (most recent call last):
  File "/home/shiyanshi/anaconda3/envs/espnet1/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/shiyanshi/anaconda3/envs/espnet1/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/media/shiyanshi/F/2021_s/espnet/espnet2/bin/asr_train.py", line 23, in <module>
    main()
  File "/media/shiyanshi/F/2021_s/espnet/espnet2/bin/asr_train.py", line 19, in main
    ASRTask.main(cmd=cmd)
  File "/media/shiyanshi/F/2021_s/espnet/espnet2/tasks/abs_task.py", line 1019, in main
    cls.main_worker(args)
  File "/media/shiyanshi/F/2021_s/espnet/espnet2/tasks/abs_task.py", line 1121, in main_worker
    model = cls.build_model(args=args)
  File "/media/shiyanshi/F/2021_s/espnet/espnet2/tasks/asr.py", line 417, in build_model
    frontend = frontend_class(**args.frontend_conf)
  File "/media/shiyanshi/F/2021_s/espnet/espnet2/asr/frontend/s3prl.py", line 46, in __init__
    upstream = S3PRLUpstream(
  File "/home/shiyanshi/anaconda3/envs/espnet1/lib/python3.8/site-packages/s3prl/nn/upstream.py", line 76, in __init__
    self.upstream = getattr(hub, name)(ckpt=path_or_url, refresh=refresh)
  File "/home/shiyanshi/anaconda3/envs/espnet1/lib/python3.8/site-packages/s3prl/upstream/hubert/hubconf.py", line 33, in hubert_local
    return hubert_custom(*args, **kwargs)
  File "/home/shiyanshi/anaconda3/envs/espnet1/lib/python3.8/site-packages/s3prl/upstream/hubert/hubconf.py", line 22, in hubert_custom
    if ckpt.startswith("http"):
AttributeError: 'NoneType' object has no attribute 'startswith'

Accounting: time=6 threads=1

Ended (code 1) at Sun Oct 2 18:11:29 CST 2022, elapsed time 6 seconds

How can I solve this?

sendream commented 1 year ago

Hello, have you solved this problem? I ran into it too.

a136522541 commented 1 year ago

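After two days of trial and error, I found the cause is probably a version mismatch between s3prl and espnet. After rolling back to the versions from the end of June, it ran successfully.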

sendream commented 1 year ago

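Yes, I found this problem too, but I solved it by updating everything to the latest version. The model also has to be converted, using this code. This is my new configuration: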

# pretrained model related
freeze_param: [ "frontend.upstream" ]

frontend: s3prl
frontend_conf:
    frontend_conf:
        upstream: hubert_local
        path_or_url: /root/data/espnet/hubert-base/converted_ckpts/chinese-hubert-base.pt
    download_dir: ./hub
    multilayer_feature: true

preencoder: linear
preencoder_conf:
    input_size: 768  # Note: If the upstream is changed, please change this value accordingly.
    output_size: 80
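
The `AttributeError: 'NoneType' object has no attribute 'startswith'` in the traceback earlier in this thread is raised after `S3PRLUpstream` calls `getattr(hub, name)(ckpt=path_or_url, refresh=refresh)`, which suggests `path_or_url` was never set, as would happen if the config still used the older `upstream_ckpt` key. Below is a small Python sketch of a config check, assuming the training config has already been parsed into a dict with the nested `frontend_conf` layout used in this thread. The function name `check_s3prl_local_conf` is hypothetical and not part of espnet or s3prl.

```python
def check_s3prl_local_conf(config):
    """Return a list of likely problems in an s3prl frontend config dict."""
    problems = []
    outer = config.get("frontend_conf") or {}
    inner = outer.get("frontend_conf") or {}
    upstream = inner.get("upstream") or ""

    # A "*_local" upstream has no default checkpoint, so a path is required.
    if upstream.endswith("_local") and not inner.get("path_or_url"):
        problems.append(f"{upstream} needs frontend_conf.frontend_conf.path_or_url")

    # Keys from the older config in this thread; assumed (not verified here)
    # to be ignored by newer espnet/s3prl versions.
    for old_key in ("upstream_ckpt", "upstream_model_config"):
        if old_key in inner:
            problems.append(f"legacy key '{old_key}' found; newer versions read path_or_url")
    return problems


old_style = {
    "frontend": "s3prl",
    "frontend_conf": {
        "frontend_conf": {"upstream": "hubert_local", "upstream_ckpt": "ckpt.pt"},
    },
}
new_style = {
    "frontend": "s3prl",
    "frontend_conf": {
        "frontend_conf": {"upstream": "hubert_local", "path_or_url": "ckpt.pt"},
    },
}

for problem in check_s3prl_local_conf(old_style):
    print(problem)
print(check_s3prl_local_conf(new_style))  # prints []
```

The check only inspects key names; it does not verify that the checkpoint has been converted to the format s3prl expects.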

Halfpast7 commented 1 year ago

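"After two days of trial and error, I found the cause is probably a version mismatch between s3prl and espnet. After rolling back to the versions from the end of June, it ran successfully."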

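Sorry to bother you, but could you share the specific versions of each package? Do both espnet and s3prl need to be rolled back? What about fairseq?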

TencentGameMate / chinese_speech_pretrain

Failed to load pretrained model from huggingface #10
