[Closed] know-nothing8 closed this issue 3 years ago
Hi @know-nothing8, try the following solutions, and let me know if you have any problem:

- Set `use_reduction_sum` to `False` when calling the `fit` method.
- Use a smaller learning rate via the `set_optimizer` method.

The training loss turning into `nan` typically means that the current optimizer configuration does not suit the problem well. Therefore, my first suggestion is to use a smaller learning rate ;-)
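To illustrate why the learning rate is the first thing to check: here is a toy sketch in plain Python (a one-dimensional quadratic loss, not torchensemble code or the actual ResNet training loop) showing how an oversized step drives plain gradient descent to non-finite values, which is exactly the `nan`-loss symptom:

```python
import math

# Toy illustration: minimize f(w) = w**2 with plain gradient descent.
# The update is w <- w - lr * f'(w) = (1 - 2*lr) * w, so when
# |1 - 2*lr| > 1 the iterates alternate sign and grow without bound,
# eventually overflowing to inf/nan.

def gradient_descent(lr, steps=500, w=1.0):
    for _ in range(steps):
        grad = 2.0 * w        # f'(w) = 2w
        w = w - lr * grad
    return w

w_diverged = gradient_descent(lr=10.0)   # |1 - 2*lr| = 19 > 1: blows up
w_converged = gradient_descent(lr=0.1)   # |1 - 2*lr| = 0.8 < 1: shrinks to ~0

print(math.isfinite(w_diverged), abs(w_converged))
```

The same mechanism applies per-parameter in a deep network; shrinking the learning rate is the cheapest way to move back inside the stable region.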
Thank you very much for your help. After I changed the learning rate from 1e-1 to 1e-3, it still didn't work. But after I switched the optimizer from "SGD" to "Adam", it trains normally! 👍
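My understanding of why Adam helped: its update is normalized by a running estimate of the squared gradient, so the step magnitude stays on the order of the learning rate even when the raw gradients are huge. A toy sketch comparing the two (textbook Adam in plain Python on a quadratic, my own illustration, not torchensemble's `set_optimizer` internals):

```python
import math

def sgd(lr, steps=500, w=1.0):
    # Plain gradient descent on f(w) = w**2.
    for _ in range(steps):
        w = w - lr * 2.0 * w
    return w

def adam(lr, steps=500, w=1.0, beta1=0.9, beta2=0.999, eps=1e-8):
    # Textbook Adam on the same objective. The update is
    # lr * m_hat / (sqrt(v_hat) + eps): since m and v track the gradient
    # and its square, the ratio is roughly scale-free, so the step size
    # is bounded by about lr no matter how large the gradient gets.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = 2.0 * w
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# With the same aggressive learning rate, SGD overflows while Adam stays bounded.
print(math.isfinite(sgd(lr=10.0)), math.isfinite(adam(lr=10.0)))
```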
If I remember correctly, the performance of ResNet should be slightly better when using the SGD optimizer. You could also try to use a smaller momentum factor.
I am going to close this issue since it is more of a problem on how to optimize a specific deep learning model. Thanks for reporting. 😄
Hello, I am learning deep ensemble methods. Your work is very good and helps me a lot, but I ran into a problem when applying gradient boosting to ResNet18 on CIFAR-10.
Ensemble algorithms such as Bagging and Fast Geometric work normally with ResNet18 on the CIFAR-10 dataset. However, `GradientBoostingClassifier` and `SoftGradientBoostingClassifier` fail to improve performance during training.
After checking, I found that the latter two use a "pseudo-residual" function instead of the cross-entropy function for classification. Is this the reason they fail to train?
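To make sure I understand the pseudo-residual correctly: for classification, it is typically the negative gradient of the cross-entropy loss with respect to the current ensemble's logits, i.e. one-hot(target) minus softmax(logits), and each new base estimator is fit to regress it. A minimal sketch of that quantity (plain Python, my own illustration with made-up helper names, not the library's implementation):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_residual(logits, target, n_classes):
    # Negative gradient of cross-entropy w.r.t. the logits:
    # one_hot(target) - softmax(logits). In gradient boosting, the next
    # base estimator is trained to predict this residual.
    probs = softmax(logits)
    one_hot = [1.0 if k == target else 0.0 for k in range(n_classes)]
    return [y - p for y, p in zip(one_hot, probs)]

# Example: the true class (index 0) gets a positive residual, the other
# classes get negative residuals, and the residuals sum to zero.
res = pseudo_residual([2.0, 0.5, -1.0], target=0, n_classes=3)
print(res)
```

So, if I read it right, the pseudo-residual is not a different objective from cross-entropy, just its gradient signal.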
I suspect I have overlooked some detail, since the file ./docs/plotting/resnet_cifar10.py does contain predictions from Gradient Boosting.
Below is a code snippet in which I applied Soft Gradient Boosting to ResNet18 on CIFAR-10:
This is the result printed to the console:
./SoftGradientBoostingClassifier_ResNet_3_ckpt.pth
How should I modify this code? Thank you so much for your patience ^_^