KyleM-Irreversible opened this issue 5 months ago
Thank you for reporting this issue. We will try to reproduce this and fix it.
Can you try TorchInferenceRPUConfig instead of InferenceRPUConfig?
Hi @KyleM-Irreversible,
thanks for reporting this. Indeed, re-using the same layer for two differently sized inputs is currently not supported. You can try the TorchInferenceRPUConfig (as @jubueche suggested), which implements a subset of the features of InferenceRPUConfig purely in torch, instead of relying on the RPUCuda library. It might work in the case of re-using a layer with different sizes, since it computes the backward pass differently.
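If it helps, a minimal sketch of that swap could look like the following (the import paths match recent aihwkit releases, and `model` stands in for the digital network from the issue; both are assumptions, not code from the report):

```python
# Sketch of the suggested config swap; adjust the import paths to your
# aihwkit version if needed.
from aihwkit.nn.conversion import convert_to_analog
from aihwkit.simulator.configs import TorchInferenceRPUConfig

# "model" is assumed to be the digital torch.nn.Module from the issue.
analog_model = convert_to_analog(model, TorchInferenceRPUConfig())
```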
@KyleM-Irreversible, have you tried the approaches that @maljoras and @jubueche recommended? Please let us know soon. Thanks!
Description
I have a fairly simple convolutional neural network with two distinct convolutional layers. I would like to create a three-layer convolutional network by applying the second convolutional layer twice, downsampling between the layers (using AvgPool2d). Here is a diagram of my network architecture:
(Note: the two yellow layers are the same "layer", just applied twice. This is done to reduce the number of parameters.)
When I convert my model to analog using convert_to_analog(), it works fine in the forward pass but gives me the following error upon calling .backward():

```
RuntimeError: Function AnalogFunctionBackward returned an invalid gradient at index 1 - got [256, 16, 16, 16] but expected shape compatible with [256, 16, 32, 32]
```
This error occurs only on GPU, not on CPU. Also, the original "digital" model works fine on both GPU and CPU. If I remove the downsampling layer (i.e. remove the AvgPool2d between the two convolutional layers), it works in all cases.
How to reproduce
Here is a minimum working example:
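A minimal sketch along these lines, reconstructed from the description above (the class name and layer sizes are illustrative assumptions, chosen so the tensor shapes match the reported error; they are not the reporter's original code):

```python
# Minimal sketch, assuming the architecture described above; layer sizes
# are illustrative and chosen to match the shapes in the reported error.
import torch
import torch.nn as nn

from aihwkit.nn.conversion import convert_to_analog
from aihwkit.simulator.configs import InferenceRPUConfig


class SharedConvNet(nn.Module):
    """Two distinct conv layers; the second one is applied twice."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 16, kernel_size=3, padding=1)  # shared layer
        self.pool = nn.AvgPool2d(2)

    def forward(self, x):
        x = torch.relu(self.conv1(x))  # -> [N, 16, 32, 32]
        x = torch.relu(self.conv2(x))  # first application of the shared layer
        x = self.pool(x)               # downsample -> [N, 16, 16, 16]
        x = torch.relu(self.conv2(x))  # second application, smaller input
        return x


model = convert_to_analog(SharedConvNet(), InferenceRPUConfig()).cuda()
out = model(torch.randn(256, 3, 32, 32, device="cuda"))
out.sum().backward()  # RuntimeError on GPU; runs fine on CPU
```

With this sketch, removing the self.pool call makes both applications of conv2 see the same input shape, which matches the observation above that dropping the AvgPool2d avoids the error.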
Expected behavior
The above example should run and train both the analog and digital versions of the model.
Other information