Closed by ThomasNorr 1 year ago
Hi Gnabe, sorry for the delay in answering and thanks for your interest in the project!
Since there are no normalisation layers in the model, it is indeed somewhat more sensitive to the choice of hyperparameters; in fact, we are currently investigating how to make the B-cos networks easier to train. Regarding your questions:
I hope this helps!
Best, Moritz
Hi Moritz,
thanks for the answer.
Thanks a lot :)
Best
Hello,
thanks for your fascinating work. I am trying to use the B-cos network (the DenseNet-121 named “densenet_121_cossched”) in my research, but I am struggling to get it to transfer effectively to smaller datasets, e.g. CUB2011. It overfits much more (much worse final test accuracy) and improves much more slowly than the conventional DenseNet; retraining only the final layer leads to no learning whatsoever across a range of hyperparameters that all work for the conventional one. Since you have experience with training this network, I figured I would just ask you:
Any answers would be greatly appreciated :)
Greetings