irfanICMLL / structure_knowledge_distillation

The official code for the paper 'Structured Knowledge Distillation for Semantic Segmentation' (CVPR 2019 oral) and its extension to other tasks.
BSD 2-Clause "Simplified" License

Question about the pair-wise loss #31

Closed: colorjam closed this issue 4 years ago

colorjam commented 4 years ago

Hi, thank you for your interesting work. I'm curious about the \alpha in Eq. 2. The definition of \alpha is the connection range; however, in CriterionPairWiseforWholeFeatAfterPool there is no parameter related to \alpha. Would you mind explaining the implementation of \alpha in Eq. 2?

irfanICMLL commented 4 years ago

This comes down to a design choice in the pair-wise loss. Our results show that keeping a fully connected range is the best choice in terms of performance, and that using a large \beta helps improve the results. We therefore realize \beta with a pooling operator, which gives the best setting reported in our paper, so there is no separate \alpha parameter in the code.

On page 6 of our paper, the table supports this conclusion: "One can choose to use the local patch to decrease the number of nodes, instead of decreasing the connection range, for a better trade-off between efficiency and accuracy."
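For anyone arriving here with the same question, here is a minimal, hypothetical sketch of the idea described above: pooling controls the granularity \beta (it merges spatial locations into fewer nodes), while the affinity graph over the pooled nodes stays fully connected, so no explicit \alpha parameter is needed. The class name `PairwiseDistillAfterPool`, the `scale` parameter, and the use of average pooling are illustrative assumptions, not the repository's exact code; see the actual `CriterionPairWiseforWholeFeatAfterPool` in this repo for the authoritative implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pairwise_similarity(feat):
    # feat: (B, C, H, W). Treat each spatial location as a graph node and
    # return the (B, H*W, H*W) matrix of cosine similarities between nodes,
    # i.e. a fully connected affinity graph (alpha spans every node pair).
    b, c, h, w = feat.shape
    feat = feat.reshape(b, c, h * w)
    feat = F.normalize(feat, p=2, dim=1)  # unit-length channel vectors
    return torch.einsum('bcm,bcn->bmn', feat, feat)

class PairwiseDistillAfterPool(nn.Module):
    # Hypothetical sketch: `scale` sets the granularity beta. Average pooling
    # with a kernel of (H*scale, W*scale) merges each such patch into one
    # node, leaving roughly (1/scale)**2 nodes that remain fully connected,
    # rather than shrinking the connection range alpha.
    def __init__(self, scale=0.5):
        super().__init__()
        self.scale = scale

    def forward(self, feat_s, feat_t):
        h, w = feat_t.shape[2:]
        k = (max(int(h * self.scale), 1), max(int(w * self.scale), 1))
        pool = nn.AvgPool2d(kernel_size=k, stride=k, ceil_mode=True)
        sim_s = pairwise_similarity(pool(feat_s))
        sim_t = pairwise_similarity(pool(feat_t)).detach()  # freeze teacher
        # Squared difference averaged over all node pairs, in the spirit
        # of the pair-wise loss in Eq. 2.
        return ((sim_t - sim_s) ** 2).mean()

# Example usage with dummy student/teacher feature maps of matching shape:
# criterion = PairwiseDistillAfterPool(scale=0.5)
# loss = criterion(student_feat, teacher_feat)
```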