Describe the bug
Running `run_simple()` with different models of the same type, such as `roberta-base` and `roberta-large`, crashes. The weights are saved under `hf_config.model_type` (instead of `args.hf_pretrained_model_name_or_path`), so the code assumes both are the same model, tries to load incompatible weights, and crashes.
To Reproduce
Install jiant
Run the simple example in README
Change the model in the sample from `roberta-base` to `roberta-large`
Expected behavior
One should be able to run run_simple() with different models of the same type.
Additional context
Solution: `hf_config.model_type` should be used only as the cache key for the tokenizer and tasks. The weights should be keyed by `args.hf_pretrained_model_name_or_path`.
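
To illustrate the proposed keying, here is a minimal standalone sketch. It is not jiant's actual code: the `cache_paths` and `sanitize` helpers and the directory layout are hypothetical, chosen only to show that two models sharing a `model_type` can share a tokenizer/task cache while keeping separate weight directories.

```python
import os
import re


def sanitize(name: str) -> str:
    """Make a model name or path safe to use as a single directory name (hypothetical helper)."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")


def cache_paths(exp_dir: str, model_type: str, model_name_or_path: str) -> dict:
    """Return cache locations under an assumed layout (not jiant's real one)."""
    return {
        # Tokenization depends on the model family, so key it by model_type.
        "tokenizer_and_tasks": os.path.join(exp_dir, "cache", model_type),
        # Weights differ per checkpoint, so key them by the full model name.
        "weights": os.path.join(exp_dir, "models", sanitize(model_name_or_path)),
    }


base = cache_paths("exp", "roberta", "roberta-base")
large = cache_paths("exp", "roberta", "roberta-large")

# Same tokenizer/task cache, different weight directories.
print(base["tokenizer_and_tasks"] == large["tokenizer_and_tasks"])
print(base["weights"] != large["weights"])
```

With this kind of keying, switching from `roberta-base` to `roberta-large` would reuse the cached tokenized tasks but never pick up the other model's weights.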