Open gld17 opened 2 years ago
Thank you for this excellent work!
I have some questions about the GFK distillation loss in your code, which came up while reproducing your results on CIFAR100. It seems that the loss function in https://github.com/chrysts/geodesic_continual_learning/blob/8a5ba6310541f2643660ebc8b66c304ffd392fd9/algorithms/GFK_distill_cosine.py matches the form of Eq. (9) in the paper. However, you chose GFK_distill_normalize.py instead, and I don't understand how that version is derived. I'm wondering if there's something I'm missing.
Thank you so much for your time and consideration, and best wishes!