choasma / HSIC-bottleneck

The HSIC Bottleneck: Deep Learning without Back-Propagation
https://arxiv.org/abs/1908.01580
MIT License

Clarification about Format Training #8

Closed Yocodeyo closed 4 years ago

Yocodeyo commented 4 years ago

Hi Ma, may I clarify how you did the formatted training? My understanding was that you take the one-hot output from HSIC-bottleneck training and pass it into a simple layer to do the classification. However, based on your code, it seems that you first train the HSIC model, load its weights as the initial weights, and then combine the HSIC model and the vanilla model and train them together as a whole? Thank you very much!

choasma commented 4 years ago

That's correct. We intentionally separate the training into two stages: unformatted training (i.e. HSIC training) and formatted training on the vanilla model. The idea is that unformatted training should retain sufficient information about the training target while forgetting about the input, so that the optimized representation lets the formatted training (a simple classifier) do its best job. The HSIC-solve setup you asked about before is a very special case that can solve classification directly: the HSIC model's output dimension is 10, and the class-wise activation peaks form a permutation you can see visually.
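For reference, the per-layer objective behind the unformatted (HSIC) training stage can be sketched in NumPy as below. This is an illustrative sketch, not the repo's actual code: the function names, the Gaussian kernel width `sigma`, and the balance weight `beta` are assumptions chosen for clarity.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    # Gaussian (RBF) kernel Gram matrix from pairwise squared distances.
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    # Biased empirical HSIC estimate: tr(Kx H Ky H) / (n - 1)^2,
    # where H centers the Gram matrices. Larger values = more dependence.
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    kx, ky = gaussian_gram(x, sigma), gaussian_gram(y, sigma)
    return np.trace(kx @ h @ ky @ h) / (n - 1) ** 2

def hsic_bottleneck_loss(z, x, y, beta=100.0, sigma=1.0):
    # Per-layer bottleneck objective: make the hidden activation z
    # forget the input x while keeping information about the target y.
    # beta here is an illustrative weight, not the paper's exact value.
    return hsic(z, x, sigma) - beta * hsic(z, y, sigma)
```

Minimizing this loss independently at each layer is what lets the unformatted stage avoid propagating gradients across layers.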

Back to the subject: we first do unformatted training on the HSIC model, then load and fix those weights and use the HSIC model's output for the vanilla (formatted) training. So neither stage requires back-propagation through the whole network during training.

Yocodeyo commented 4 years ago

Okay, thanks for the clear explanation! :)

choasma commented 4 years ago

No worries. Feel free to let me know if you have any questions about our project. Good luck!