drprojects / DeepViewAgg

[CVPR'22 Best Paper Finalist] Official PyTorch implementation of the method presented in "Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation"

Errors in optimize_parameters() when training from your pretrained models #17

Closed ruomingzhai closed 1 year ago

ruomingzhai commented 1 year ago

Hi, I downloaded your pretrained model to reproduce the whole training process, but it fails with the following error:

    File "/root/DeepViewAgg/torch_points3d/models/base_model.py", line 259, in optimize_parameters
        self._grad_scale.step(self._optimizer)  # update parameters
    AttributeError: 'NoneType' object has no attribute 'step'

It seems that in the optimize_parameters() function of torch_points3d/models/base_model.py, self._grad_scale is None. I thought it might come from the checkpoint.pt file, but apparently it doesn't: it is only initialized in instantiate_optimizers(), as a torch.cuda.amp.GradScaler instance. I am not sure where things went wrong, so I hope I can get some clues from you.
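To make the code path I mean concrete, here is a minimal sketch (my own simplification, not the actual torch_points3d source) of how the two methods interact:

```python
import torch


class BaseModelSketch:
    """Hypothetical simplification of torch_points3d's BaseModel."""

    def __init__(self):
        self._optimizer = None
        self._grad_scale = None  # stays None until instantiate_optimizers()

    def instantiate_optimizers(self, params, mixed_precision=False):
        self._optimizer = torch.optim.SGD(params, lr=0.01)
        # The only place the scaler is ever created.
        self._grad_scale = torch.cuda.amp.GradScaler(enabled=mixed_precision)

    def optimize_parameters(self, loss):
        # If instantiate_optimizers() was never called, e.g. when the model
        # is restored from a checkpoint, the calls below raise
        # AttributeError: 'NoneType' object has no attribute ...
        self._grad_scale.scale(loss).backward()
        self._grad_scale.step(self._optimizer)  # update parameters
        self._grad_scale.update()
```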

drprojects commented 1 year ago

Hi @ruomingzhai,

I think this might be due to the fact that torch-points3d does not handle loading optimizers with differential learning rates well. Indeed, to train the 3D+2D models, some blocks of the model use different learning rates (e.g. the 2D blocks vs the 3D blocks), which can cause issues when the optimizer state is loaded back.
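Here is a minimal sketch of what I mean (the modules and checkpoint key are placeholders, not the actual DeepViewAgg code):

```python
import torch

# Placeholder modules standing in for the 2D and 3D blocks of the model.
blocks_2d = torch.nn.Linear(8, 8)
blocks_3d = torch.nn.Linear(8, 8)

# One parameter group per block, each with its own learning rate.
optimizer = torch.optim.Adam([
    {"params": blocks_2d.parameters(), "lr": 1e-4},  # 2D blocks
    {"params": blocks_3d.parameters(), "lr": 1e-3},  # 3D blocks
])

# Restoring the optimizer state assumes the saved parameter groups match
# the freshly built ones one-to-one; any mismatch makes this call fail:
# optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
```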

As of now, the project does not support loading a pretrained optimizer and scheduler for fine-tuning. If you want to reproduce the training experiments, use scripts/train_kitti360.sh. If you want to run inference with the pretrained weights, use notebooks/kitti360_inference.ipynb.

If you want to fine-tune a pretrained model, you will need to create a new optimizer and scheduler anyway, which should bypass the problem you encountered. For that, you can simply follow the procedure in scripts/train_kitti360.sh, with a few changes, as sketched below.
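Roughly, the idea is to load only the model weights and rebuild the training state from scratch. A minimal sketch (the model, file name, and checkpoint key are placeholders, not the project's actual API; check them against your checkpoint.pt):

```python
import torch

# Placeholder standing in for the actual DeepViewAgg model.
model = torch.nn.Linear(8, 8)

# Keep only the model weights from the checkpoint; deliberately ignore
# any optimizer or scheduler state stored in the file.
ckpt = torch.load("checkpoint.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])  # key name is an assumption

# Build a fresh optimizer, scheduler, and GradScaler, so that
# _grad_scale is never None when optimize_parameters() runs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
scaler = torch.cuda.amp.GradScaler()
```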

Hope that helps!

drprojects commented 1 year ago

Hi, assuming the last reply addressed the question, I am closing the issue.