I get the following error when trying to use the GBW trained LM.
$ CUDA_VISIBLE_DEVICES=0 python eval_lm.py data-bin/my_csk/my_csk_gbw --path 'models/gbw_fconv_lm/model.pt' --output-word-probs
Namespace(cpu=False, data='data-bin/my_csk/my_csk_gbw', fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, future_target=False, gen_subset='test', log_format=None, log_interval=1000, max_sentences=None, max_tokens=None, model_overrides='{}', no_progress_bar=False, num_shards=1, output_dictionary_size=-1, output_word_probs=True, output_word_stats=False, past_target=False, path='models/gbw_fconv_lm/model.pt', quiet=False, raw_text=False, remove_bpe=None, sample_break_mode=None, seed=1, self_target=False, shard_id=0, skip_invalid_size_inputs_valid_test=False, task='language_modeling', tokens_per_sample=1024)
| dictionary: 793304 types
| loading model(s) from models/gbw_fconv_lm/model.pt
Traceback (most recent call last):
  File "eval_lm.py", line 189, in <module>
    main(args)
  File "eval_lm.py", line 58, in main
    models, args = utils.load_ensemble_for_inference(parsed_args.path.split(':'), task, model_arg_overrides=eval(parsed_args.model_overrides))
  File "/raid/data/oanuru/my_fairseq/my_fairseq/fairseq/utils.py", line 165, in load_ensemble_for_inference
    model.load_state_dict(state['model'], strict=True)
  File "/raid/data/oanuru/my_fairseq/my_fairseq/fairseq/models/fairseq_model.py", line 66, in load_state_dict
    super().load_state_dict(state_dict, strict)
  File "/data/dgx1/oanuru/anaconda3/envs/fairseq/lib/python3.7/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FConvLanguageModel:
	size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([793302, 128]) from checkpoint, the shape in current model is torch.Size([793304, 128]).
	size mismatch for decoder.adaptive_softmax.tail.2.2.weight: copying a param with shape torch.Size([593302, 256]) from checkpoint, the shape in current model is torch.Size([593304, 256]).
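The size mismatch (793302 rows in the checkpoint vs. 793304 in the model built for my data) suggests the dictionary I binarized under data-bin/my_csk/my_csk_gbw has two more types than the one the pretrained checkpoint was trained with. If the dict.txt shipped in the model archive is the dictionary to use, would re-binarizing my test data against it with --srcdict be the right fix? Something like the following (my_csk/test here is a placeholder for my tokenized test file):

$ python preprocess.py --only-source \
    --srcdict models/gbw_fconv_lm/dict.txt \
    --testpref my_csk/test \
    --destdir data-bin/my_csk/my_csk_gbw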