imanehmz opened this issue 8 months ago
From looking at your code, it seems that you have done it correctly; just note that, in general, the drift time should be given as a float, not as an int. I would think that the issue in your case is that the out_bound setting is much too restrictive (around 5). That would mean that only about 5 inputs can be active with weights at gmax before the output gets clipped, which is very restrictive. Note that the weights are rescaled to gmax when mapped, since a digital output scale is used in your setting.
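For concreteness, a sketch of both suggested changes (`InferenceRPUConfig` and `forward.out_bound` are aihwkit fields; the value 12.0 below is just an illustrative, less restrictive bound, not a tuned recommendation):

```python
from aihwkit.simulator.configs import InferenceRPUConfig

rpu_config = InferenceRPUConfig()
# Relax the output clipping: with out_bound around 5, only ~5 inputs with
# weights at gmax can be active before the output saturates and is clipped.
rpu_config.forward.out_bound = 12.0  # illustrative, less restrictive value

# Drift times should be passed as floats (seconds since programming):
t_inference = 3600.0  # one hour, as a float rather than the int 3600
```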
Thank you for the remark. I converted the times into floats and increased the out_bound value, but it didn't change the result. I also tried other rpu_configs from the tutorials in the aihwkit documentation, and the drift is still not being applied to the weights, so the issue must be coming from somewhere else.
Can you try the rpu config that we use in the tutorial to check whether the issue persists? link
@imanehmz, did you make the suggested changes to your hw-aware training configuration?
@kaoutar55 Hello, I've used many rpu_configs from the tutorials, including that one, but the issue persists.
@imanehmz can you please tell us how to reproduce this error? Please share your code or a notebook you tried.
It seems you are using an older version of the aihwkit release. Can you use the latest? Look at this example to get the right command: https://github.com/IBM/aihwkit/blob/master/notebooks/tutorial/hw_aware_training.ipynb
Description and motivation
I'm trying to run inference at different timesteps on a neural network trained with Feedback Alignment from biotorch; however, it shows the same accuracy for all timesteps. Here's the pseudo-code I'm using:
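Roughly, the flow is the following (a minimal, self-contained sketch: the two-layer `trained_fa_model` and the random batch are stand-ins for the actual biotorch-trained network and test set, and `PCMLikeNoiseModel` with `g_max=25.0` is an illustrative choice):

```python
import torch
from torch import nn
from aihwkit.nn.conversion import convert_to_analog
from aihwkit.simulator.configs import InferenceRPUConfig
from aihwkit.inference import PCMLikeNoiseModel

# Stand-in for the network trained with biotorch's feedback alignment.
trained_fa_model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

rpu_config = InferenceRPUConfig()
rpu_config.noise_model = PCMLikeNoiseModel(g_max=25.0)

analog_model = convert_to_analog(trained_fa_model, rpu_config)
analog_model.eval()

analog_model.program_analog_weights()  # apply programming noise once
x = torch.rand(8, 784)                 # stand-in test batch
for t_inference in [0.0, 1.0, 20.0, 1000.0, 1e5]:   # seconds, as floats
    analog_model.drift_analog_weights(t_inference)  # drift since programming
    with torch.no_grad():
        out = analog_model(x)
    # In the real script, accuracy on the test set is computed here.
    print(f"t={t_inference:>9.1f}s  logits mean={out.mean().item():.4f}")
```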
I'm getting the same accuracy after the drift for all timesteps. The feedback alignment training method does not use gradients in the backward pass; it uses a random feedback matrix instead. However, when evaluating the drift we are in eval mode, so that shouldn't be a problem, and to my understanding the noise is applied to the weights of the model, not to the gradients. Why is there no loss in accuracy with the drift?
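One way to narrow this down could be to check whether drift changes the analog weights at all, independently of the accuracy numbers. A sketch using aihwkit's `analog_tiles()` and `get_weights()` (assuming `analog_model` is the converted model from the sketch above):

```python
def max_weight_shift(model, t_inference: float) -> float:
    """Largest absolute analog-weight change caused by drifting to t_inference."""
    before = [tile.get_weights()[0].clone() for tile in model.analog_tiles()]
    model.drift_analog_weights(t_inference)
    after = [tile.get_weights()[0] for tile in model.analog_tiles()]
    return max((b - a).abs().max().item() for b, a in zip(before, after))

# A value near 0.0 for a large t_inference would confirm that drift is
# effectively not being applied, regardless of the training method.
print(max_weight_shift(analog_model, 1e5))
```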
Proposed solution
It would be great to add support for drift on analog hardware for models trained with methods other than backpropagation.
Here's the link to a Colab notebook that runs an example of the code.