Open oljike opened 4 years ago
Are you accessing a parameter of a batch norm layer? I haven't added per-example functionality for those layers, but I will take a PR.
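For anyone considering such a PR, here is a minimal sketch of one possible way to get per-example gradients for the affine parameters of a `BatchNorm2d` layer in plain PyTorch. It is not this package's implementation; the names `x_hat`, `grad1_weight` and `grad1_bias` are made up for illustration. It assumes the layer computes `y = weight * x_hat + bias`, so each example's contribution to the weight gradient is the output gradient times the normalized input, summed over the spatial dimensions. Note that in training mode the examples interact through the batch statistics, so these are per-example contributions to the total gradient rather than fully independent gradients.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

bn = nn.BatchNorm2d(3)
nn.init.uniform_(bn.weight, 0.5, 1.5)   # non-trivial scale so the check is meaningful
x = torch.randn(4, 3, 5, 5)             # (batch N, channels C, height, width)

y = bn(x)                               # training mode: normalizes with batch statistics
y.retain_grad()                         # keep dLoss/dy for the non-leaf output
loss = (y ** 2).sum()                   # toy loss, only here to produce gradients
loss.backward()

with torch.no_grad():
    # Normalized input without the affine transform, using the same batch statistics.
    x_hat = F.batch_norm(x, None, None, training=True, eps=bn.eps)
    g = y.grad                                    # shape (N, C, H, W)
    grad1_weight = (g * x_hat).sum(dim=(2, 3))    # shape (N, C)
    grad1_bias = g.sum(dim=(2, 3))                # shape (N, C)

# Sanity check: summing over the batch should recover the ordinary gradients.
weight_ok = torch.allclose(grad1_weight.sum(0), bn.weight.grad, rtol=1e-4, atol=1e-4)
bias_ok = torch.allclose(grad1_bias.sum(0), bn.bias.grad, rtol=1e-4, atol=1e-4)
print(weight_ok, bias_ok)
```

In a hook-based design, `x_hat` would be saved in a forward hook and `g` captured in a backward hook; the arithmetic would stay the same.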
Could you please give more information on how to properly implement grad1 computation for batchnorm?
I found the following links useful but I'm still missing the big picture:
Hi @yaroslavvb, thank you very much for this package. Does it support GroupNorm, BatchNorm, etc., which are commonly used in ResNet architectures?
Thanks again for your help!
Hi! My question is about computing gradients for each sample in the batch. I reproduced your example with a simple neural net, and it works fine. However, when using ResNet50 (which has batchnorm layers, for example), the code produces the following error:
AttributeError: 'Parameter' object has no attribute 'grad1'
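
As a stopgap, it may help to list which parameters never received a `grad1` attribute, so they can be skipped instead of triggering the error above. The helper below is a hypothetical sketch, not part of the package. It only assumes that supported parameters carry a `grad1` attribute after the per-example gradient pass, and it accepts any iterable of `(name, parameter)` pairs, such as `model.named_parameters()`. The demo uses plain stand-in objects so that it runs without PyTorch.

```python
from types import SimpleNamespace


def params_missing_grad1(named_params):
    """Return the names of parameters that have no `grad1` attribute."""
    return [name for name, p in named_params if not hasattr(p, "grad1")]


# Stand-ins for real parameters: the first has grad1, the second does not.
demo_params = [
    ("conv1.weight", SimpleNamespace(grad1=[0.1, 0.2])),
    ("bn1.weight", SimpleNamespace()),
]

missing = params_missing_grad1(demo_params)
print(missing)  # ['bn1.weight']
```

With a real model the call would be `params_missing_grad1(model.named_parameters())`, run after the per-example gradients have been computed.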