I'm running into the following problem when running the setup with distillation, and I'd appreciate your help:
File "/mnt/cpfs/users/gpuwork/lilujun/2021/deit/main.py", line 420, in <module>
main(args)
File "/mnt/cpfs/users/gpuwork/lilujun/2021/deit/main.py", line 379, in main
set_training_mode=True # keep in eval mode during finetuning
File "/mnt/cpfs/users/gpuwork/lilujun/2021/deit/engine.py", line 39, in train_one_epoch
loss = criterion(samples, outputs, targets)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/cpfs/users/gpuwork/lilujun/2021/deit/losses.py", line 43, in forward
raise ValueError("When knowledge distillation is enabled, the model is "
ValueError: When knowledge distillation is enabled, the model is expected to return a Tuple[Tensor, Tensor] with the output of the class_token and the dist_token
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
main()
File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
cmd=cmd)
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
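For context, the check that raises this error is in `losses.py` of this repo; a rough sketch of the relevant logic (paraphrased, not the exact source):

```python
import torch
import torch.nn as nn

class DistillationLoss(nn.Module):
    """Paraphrased sketch of the check in deit/losses.py (not the exact source)."""

    def __init__(self, base_criterion, distillation_type='hard'):
        super().__init__()
        self.base_criterion = base_criterion
        self.distillation_type = distillation_type

    def forward(self, inputs, outputs, labels):
        outputs_kd = None
        if not isinstance(outputs, torch.Tensor):
            # distilled models return (class_token logits, dist_token logits)
            outputs, outputs_kd = outputs
        base_loss = self.base_criterion(outputs, labels)
        if self.distillation_type == 'none':
            return base_loss
        if outputs_kd is None:
            # this is the branch hit in the traceback above
            raise ValueError(
                "When knowledge distillation is enabled, the model is "
                "expected to return a Tuple[Tensor, Tensor] with the output "
                "of the class_token and the dist_token")
        # ...the distillation term against the teacher would follow here
        return base_loss
```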
Hi @lilujunai,
It looks like you are using distillation but your model does not return a tuple as output.
In what context are you using the code: finetuning or training from scratch?
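For reference, the `*_distilled_*` DeiT variants return both heads as a tuple in training mode. A quick sanity check, as a sketch, assuming you run it from the DeiT repo root so that importing `models` registers the variants with timm:

```python
import torch
from timm import create_model
import models  # registers the deit_* variants defined in this repo

# A *_distilled_* variant returns the expected tuple only in training mode
model = create_model('deit_base_distilled_patch16_224', pretrained=False)
model.train()
out = model(torch.randn(1, 3, 224, 224))
print(type(out), len(out))  # tuple of (class_token logits, dist_token logits)

model.eval()
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # single tensor: the two heads are averaged at inference
```

If you are training with `--distillation-type soft` or `hard`, make sure `--model` points at one of the distilled variants, e.g. `deit_base_distilled_patch16_224`, rather than a plain one.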
Best,
Hugo