Open brisker opened 3 years ago
Hi, thanks for asking!
We have a check.py file that verifies that both the forward and backward outputs of our CUDA code match those of the PyTorch implementation, while the CUDA version greatly reduces GPU memory usage.
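To illustrate the kind of comparison meant here, below is a minimal pure-Python sketch and not the contents of check.py. It runs a reference adder layer and a second, differently structured implementation (standing in for the CUDA kernel) on the same random inputs and compares the forward output and both gradients. The function names, the tolerance, and the gradient rule (full-precision difference for the weights, clipped difference for the input) are assumptions made for illustration.

```python
import random


def adder_reference(x, w, grad_out):
    """Reference version. Returns (y, grad_x, grad_w) for one input vector.

    Forward: y[o] = -sum_i |x[i] - w[o][i]|.
    Backward (assumed rule, for illustration only):
      grad_w[o][i] = grad_out[o] * (x[i] - w[o][i])
      grad_x[i]    = sum_o grad_out[o] * clip(w[o][i] - x[i], -1, 1)
    """
    y = [-sum(abs(xi - wi) for xi, wi in zip(x, row)) for row in w]
    grad_w = [[g * (xi - wi) for xi, wi in zip(x, row)]
              for row, g in zip(w, grad_out)]
    grad_x = [sum(g * max(-1.0, min(1.0, row[i] - x[i]))
                  for row, g in zip(w, grad_out))
              for i in range(len(x))]
    return y, grad_x, grad_w


def adder_candidate(x, w, grad_out):
    """Hypothetical stand-in for an optimized kernel: same math, fused loops."""
    n_out, n_in = len(w), len(x)
    y = [0.0] * n_out
    grad_x = [0.0] * n_in
    grad_w = [[0.0] * n_in for _ in range(n_out)]
    for o in range(n_out):
        for i in range(n_in):
            d = x[i] - w[o][i]
            y[o] -= abs(d)
            grad_w[o][i] = grad_out[o] * d
            grad_x[i] += grad_out[o] * max(-1.0, min(1.0, -d))
    return y, grad_x, grad_w


def flatten(values):
    """Yield the scalars of an arbitrarily nested list."""
    for v in values:
        if isinstance(v, list):
            yield from flatten(v)
        else:
            yield v


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two nested lists."""
    return max(abs(p - q) for p, q in zip(flatten(a), flatten(b)))


def check(seed=0, n_in=8, n_out=4, tol=1e-9):
    """Compare both implementations on random data; tol is an assumed value."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n_in)]
    w = [[rng.uniform(-2.0, 2.0) for _ in range(n_in)] for _ in range(n_out)]
    g = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    ref = adder_reference(x, w, g)
    cand = adder_candidate(x, w, g)
    names = ("forward", "grad_x", "grad_w")
    diffs = {n: max_abs_diff(r, c) for n, r, c in zip(names, ref, cand)}
    return diffs, all(d <= tol for d in diffs.values())


if __name__ == "__main__":
    diffs, ok = check()
    print(diffs, "match" if ok else "MISMATCH")
```

In a real setup the candidate would be the CUDA op and the reference the PyTorch implementation, with the comparison done on tensors and a tolerance chosen for the floating-point precision in use.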
So if you adopt the same training recipe as AdderNet, it should perform similarly.
Have you reproduced the ResNet-20 CIFAR-10 result (91.8) from the original AdderNet paper using your CUDA version of the code?