irfanICMLL / structure_knowledge_distillation

The official code for the paper 'Structured Knowledge Distillation for Semantic Segmentation' (CVPR 2019 oral), with extensions to other tasks.

nan loss value #25

Closed · sungsooo closed 4 years ago

sungsooo commented 4 years ago

Hi, I have a problem while training with your project. I just cloned your repo and only changed the data path.

But when I train with the PI + PA + HO losses, the loss becomes nan after several steps. I used the teacher weights you provide and trained the student net from scratch. Can you advise me on how to reproduce your paper? If you don't mind, could you share your log file?

Thank you.

irfanICMLL commented 4 years ago

Please check out this commit: `git checkout d1ec858edc25e2671e9a15d5fda4628b9fdbf48b`. It will fix the nan problem.

sungsooo commented 4 years ago

Thank you for your reply! I want to know the difference between the two versions. Could you explain why the master branch has the nan problem? Thank you!

irfanICMLL commented 4 years ago

We employed a numerically unstable form of the KL divergence. We will fix this bug soon. But the old branch can still achieve promising results for distillation.
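
For anyone hitting this, the typical instability is taking `log` of softmax probabilities directly: in float32 the softmax can underflow to exactly 0, `log(0) = -inf`, and the loss turns into nan. Below is a minimal sketch of the stable vs. unstable pattern in PyTorch; the function names are illustrative and this is not necessarily the exact change in that commit:

```python
import torch
import torch.nn.functional as F

def pixel_kl_unstable(student_logits, teacher_logits):
    # Unstable: softmax probabilities can underflow to exactly 0;
    # log(0) = -inf, and 0 * -inf = nan, so the loss blows up.
    p_t = F.softmax(teacher_logits, dim=1)  # dim=1 is the class dim for (N, C, H, W)
    p_s = F.softmax(student_logits, dim=1)
    return (p_t * torch.log(p_t / p_s)).sum(dim=1).mean()

def pixel_kl_stable(student_logits, teacher_logits):
    # Stable: log_softmax stays in log space, so no log of a raw
    # probability is ever taken; F.kl_div also zeroes out terms
    # where the target probability is 0.
    log_p_s = F.log_softmax(student_logits, dim=1)
    p_t = F.softmax(teacher_logits, dim=1)
    return F.kl_div(log_p_s, p_t, reduction='batchmean')
```

Note that `reduction='batchmean'` divides by the batch size only; dividing further by the number of pixels would match the per-pixel mean computed by the unstable version.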

wuzhiyang2016 commented 4 years ago

hello, has this fix been merged into the master branch?