While evaluating UltraFastBERT (a downstream project using the repository at https://github.com/pbelcak/UltraFastBERT under the `training` folder, with most of the code identical), I encountered the following error when running `python eval.py eval=GLUE name=UltraFastBERT-1x11-long eval.checkpoint=hf://pbelcak/UltraFastBERT-1x11-long impl.microbatch_size=4`:
```
loaded with 164,460,531 parameters.
Some weights of ScriptableLMForSequenceClassification were not initialized from the model checkpoint at pbelcak/UltraFastBERT-1x11-long and are newly initialized: ['pooler.dense.weight', 'head.weight', 'head.bias', 'pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Error executing job with overrides: ['eval=GLUE', 'name=UltraFastBERT-1x11-long', 'eval.checkpoint=hf://pbelcak/UltraFastBERT-1x11-long', 'impl.microbatch_size=4']
Traceback (most recent call last):
  File "/root/autodl-tmp/UltraFastBERT/training/eval.py", line 147, in launch
    cramming.utils.main_launcher(cfg, main_downstream_process, job_name="downstream finetuning")
  File "/root/autodl-tmp/UltraFastBERT/training/cramming/utils.py", line 54, in main_launcher
    metrics = main_fn(cfg, setup)
  File "/root/autodl-tmp/UltraFastBERT/training/eval.py", line 37, in main_downstream_process
    model_engine.load_checkpoint(cfg_arch, model_file)
  File "/root/autodl-tmp/UltraFastBERT/training/cramming/backend/torch_default.py", line 237, in load_checkpoint
    self.optimizer, self.scheduler = _load_optimizer(self.model, self.cfg_train, self.cfg_impl)
TypeError: _load_optimizer() missing 1 required positional argument: 'initial_time'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```
And indeed, line 237 of the file calls `_load_optimizer` with only three arguments instead of four: https://github.com/JonasGeiping/cramming/blob/f6ba4cb76ff7847ecc64067b3e7eaa1eed9625a5/cramming/backend/torch_default.py#L237

Maybe add `self.initial_time` as the fourth argument?
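For illustration, here is a minimal sketch of the mismatch and the suggested fix. The simplified `_load_optimizer` signature and the `Engine` class below are hypothetical stand-ins, not the actual cramming code:

```python
def _load_optimizer(model, cfg_train, cfg_impl, initial_time):
    """Stand-in for cramming's _load_optimizer: four required positional args."""
    return f"optimizer(t={initial_time})", "scheduler"


class Engine:
    """Stand-in for the backend engine in torch_default.py."""

    def __init__(self):
        self.model, self.cfg_train, self.cfg_impl = "model", "cfg_train", "cfg_impl"
        self.initial_time = 0.0

    def load_checkpoint_broken(self):
        # Mirrors the current call at torch_default.py:237 -> raises
        # TypeError: missing 1 required positional argument: 'initial_time'
        return _load_optimizer(self.model, self.cfg_train, self.cfg_impl)

    def load_checkpoint_fixed(self):
        # Proposed fix: pass self.initial_time as the fourth argument.
        return _load_optimizer(self.model, self.cfg_train, self.cfg_impl, self.initial_time)
```

With this change, `load_checkpoint_fixed` succeeds where `load_checkpoint_broken` reproduces the TypeError above.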