GeorgeCazenavette / mtt-distillation

Official code for our CVPR '22 paper "Dataset Distillation by Matching Training Trajectories"
https://georgecazenavette.github.io/mtt-distillation/

Grand loss curve #27


XuyangZhong-29 commented 1 year ago

Hi,

I tried to reproduce your method on ResNet18, and I set proper learning rates (lr_img=100, lr_lr=1e-5, lr_teacher=0.01) to avoid exploding/vanishing gradients. However, I observed that the grand loss fluctuates around 0.9. Is this normal? Could you please share your grand loss curve for reference?
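For context, by grand loss I mean the normalized trajectory-matching objective from the paper: the squared distance between the student's final parameters and the expert's target parameters, divided by the squared distance the expert itself travels over that segment. A minimal sketch of how I understand it (the function and variable names here are my own for illustration, not necessarily those in distill.py):

```python
import torch
import torch.nn.functional as F

def grand_loss(student_params, target_params, starting_params):
    """Normalized trajectory-matching loss from the MTT paper:

        ||theta_student - theta_target||^2 / ||theta_start - theta_target||^2

    All arguments are flattened parameter vectors. A value near 1.0
    means the student trained on the synthetic data barely moves
    toward the expert's target parameters.
    """
    param_loss = F.mse_loss(student_params, target_params, reduction="sum")
    param_dist = F.mse_loss(starting_params, target_params, reduction="sum")
    return param_loss / param_dist
```

Under that reading, a loss hovering around 0.9 would mean the synthetic images only recover about 10% of the expert's progress per matched segment, which is why I'm unsure whether this is expected for ResNet18.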

BR, Xuyang

GeorgeCazenavette commented 1 year ago

Hi, sorry for the late response.

We never actually ran distillation on any network other than ConvNet, so unfortunately I don't have a ResNet18 loss curve to share with you.