ManuBN786 closed this issue 1 year ago
Hi @ManuBN786 ,
Which PyTorch version are you using?
I'm using torch 1.11.0 on CUDA 11.3
torch 1.11.0 works correctly on my server.
It seems that you want to train the resnet34 model for 50 epochs, right?
Your training command is incorrect. The correct command is the following (just replace --model-config with -c):
python tools/train.py --dataset imagenet --data-path data/imagenet/ --model resnet34 -c configs/strategies/resnet/resnet.yaml -b 16 --experiment teacher_model_train --epochs 50
You can try this script and see whether your error is solved :)
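One possible reason the flag matters, sketched under assumptions: if -c loads the training-strategy YAML (which would carry scheduler settings such as the decay factor) and --model-config is a separate option, then passing the strategy file to --model-config would leave those settings unset. The parser below is hypothetical and is not the real tools/train.py. Only the two flag names come from this thread; their meanings, the `--decay-rate` option, the `apply_strategy` helper, and the `STRATEGY` values are invented for illustration.

```python
import argparse

# Hypothetical parser: NOT the real tools/train.py. Only the flag names
# -c and --model-config come from the thread; everything else is assumed.
def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", "--config", default=None,
                        help="training strategy file (assumed meaning)")
    parser.add_argument("--model-config", default=None,
                        help="model definition file (assumed meaning)")
    parser.add_argument("--decay-rate", type=float, default=None)
    return parser

def apply_strategy(args, strategy):
    """Copy strategy values onto args, but only when -c was given."""
    if args.config is not None:
        for key, value in strategy.items():
            setattr(args, key, value)
    return args

# Stand-in for the parsed contents of a strategy file (value is made up).
STRATEGY = {"decay_rate": 0.1}

path = "configs/strategies/resnet/resnet.yaml"
wrong = apply_strategy(build_parser().parse_args(["--model-config", path]), STRATEGY)
right = apply_strategy(build_parser().parse_args(["-c", path]), STRATEGY)

print(wrong.decay_rate)  # None: the strategy values were never applied
print(right.decay_rate)  # 0.1
```

If the real script behaves along these lines, a scheduler built from the first set of arguments would receive a decay factor of None.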
I used "python tools/train.py --dataset imagenet --data-path data/imagenet/ --model resnet34 -c configs/strategies/resnet/resnet.yaml --teacher-no-pretrained -b 16 --experiment teacher_model_train --epochs 50"
And it trains perfectly fine.
Thank you so much
I was training a resnet34 and this is the error:
11:20:09 INFO Train: 3 [ 0/246] Loss: 1.712 (1.712) LR: 1.000e-02 Time: 0.84s (0.84s) Data: 0.55s
/home/manu/.virtualenvs/dl4cv/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:371: UserWarning: To get the last learning rate computed by the scheduler, please use `get_last_lr()`.
  warnings.warn("To get the last learning rate computed by the scheduler, "
Traceback (most recent call last):
  File "tools/train.py", line 363, in <module>
    main()
  File "tools/train.py", line 200, in main
    metrics = train_epoch(args, epoch, model, model_ema, train_loader,
  File "tools/train.py", line 317, in train_epoch
    scheduler.step(epoch * len(loader) + batch_idx + 1)
  File "/home/manu/PycharmProjects/DIST_KD/classification/tools/utils/scheduler.py", line 92, in step
    self.after_scheduler.step(epoch - self.total_epoch - 1)
  File "/home/manu/.virtualenvs/dl4cv/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 159, in step
    values = self._get_closed_form_lr()
  File "/home/manu/.virtualenvs/dl4cv/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 380, in _get_closed_form_lr
    return [base_lr * self.gamma ** (self.last_epoch // self.step_size)
  File "/home/manu/.virtualenvs/dl4cv/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 380, in <listcomp>
    return [base_lr * self.gamma ** (self.last_epoch // self.step_size)
TypeError: unsupported operand type(s) for ** or pow(): 'NoneType' and 'int'
The command used is:
python tools/train.py --dataset imagenet --data-path data/imagenet/ --model resnet34 --model-config configs/strategies/resnet/resnet.yaml --teacher-no-pretrained -b 16 --experiment teacher_model_train --epochs 50
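For anyone reading the traceback: the failing expression raises a decay factor to an integer power, so the 'NoneType' operand in the TypeError would be the scheduler's gamma. The snippet below is a minimal pure-Python sketch of that arithmetic. The `closed_form_lr` function is a stand-in written for illustration, not the PyTorch implementation, and the numeric values are made up.

```python
def closed_form_lr(base_lr, gamma, last_epoch, step_size):
    # Same arithmetic as the expression in the traceback:
    # decay base_lr by gamma once per completed step_size epochs.
    return base_lr * gamma ** (last_epoch // step_size)

# With a numeric gamma the learning rate decays as expected.
print(closed_form_lr(0.01, 0.1, last_epoch=60, step_size=30))

# With gamma=None the power operation fails the same way as in the report.
try:
    closed_form_lr(0.01, None, last_epoch=60, step_size=30)
except TypeError as exc:
    message = str(exc)
    print(message)
```

This suggests checking that the scheduler's decay factor is actually set (for example, that the strategy config was loaded) before training starts.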