Closed MARD1NO closed 4 years ago
The provided code uses cdist to compute the L1 distance: https://github.com/huawei-noah/AdderNet/blob/c49d06dbf5da44b9b754e30da14138097837707a/adder.py#L28
To implement the new backpropagation, you can follow https://pytorch.org/docs/master/notes/extending.html and write a custom autograd function to replace cdist, where the forward pass computes the L1 distance and the backward pass applies the new backpropagation rule and the adaptive learning rate.
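As a rough illustration of that suggestion, here is a minimal sketch of such a function. It is not the authors' released code. The class name L1DistanceSketch, the constant ETA and its value, and the tensor layout (weights as (num_filters, k), unfolded input as (k, num_patches)) are all assumptions made for this example. The backward rules follow my reading of the AdderNet paper: a full-precision gradient (x - w) for the weights, a HardTanh-clipped gradient for the input, and a per-layer rescaling of the weight gradient by ETA * sqrt(number of weights) / ||grad||_2.

```python
import math

import torch
from torch.autograd import Function

# Hypothetical global factor for the adaptive learning rate (an assumption).
ETA = 0.2


class L1DistanceSketch(Function):
    """Hypothetical cdist replacement: negative L1 distance, custom backward."""

    @staticmethod
    def forward(ctx, w_col, x_col):
        # w_col: (num_filters, k), x_col: (k, num_patches)
        ctx.save_for_backward(w_col, x_col)
        # diff[f, i, p] = x_col[i, p] - w_col[f, i]
        diff = x_col.unsqueeze(0) - w_col.unsqueeze(2)
        # Output is the negative L1 distance, shape (num_filters, num_patches).
        return -diff.abs().sum(dim=1)

    @staticmethod
    def backward(ctx, grad_output):
        w_col, x_col = ctx.saved_tensors
        diff = x_col.unsqueeze(0) - w_col.unsqueeze(2)  # (F, k, P)
        g = grad_output.unsqueeze(1)                    # (F, 1, P)

        # Weights: use the full-precision difference (x - w), not its sign.
        grad_w = (diff * g).sum(dim=2)                  # (F, k)

        # Adaptive learning rate: rescale this layer's weight gradient to
        # norm ETA * sqrt(number of weights).
        scale = ETA * math.sqrt(grad_w.numel()) / grad_w.norm(p=2).clamp(min=1e-12)
        grad_w = grad_w * scale

        # Input: HardTanh(w - x), i.e. the difference clipped to [-1, 1].
        grad_x = (-diff.clamp(-1, 1) * g).sum(dim=0)    # (k, P)
        return grad_w, grad_x
```

In the layer, the call to cdist would then be replaced by something like L1DistanceSketch.apply(W_col, X_col), with the operands arranged in the layout assumed above. Folding the rescaling into backward keeps a standard optimizer usable without per-layer learning rate bookkeeping.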
I would like to know how you implemented the new backpropagation and the adaptive learning rate in each layer. Will you open-source the relevant code in the future?