Closed: LaurentMazare closed this issue 1 year ago.
Here is some example code for a shape mismatch I'm running into with the current backprop:
```rust
use candle_core::{DType, Device, Tensor};
use candle_nn::{conv2d, Conv2dConfig, Module, VarBuilder, VarMap};

fn main() {
    let var_map = VarMap::new();
    let vb = VarBuilder::from_varmap(&var_map, DType::F32, &Device::Cpu);
    // Stride 16, matching the 16x16 kernel below.
    let c = Conv2dConfig {
        stride: 16,
        ..Default::default()
    };
    // 3 input channels, 192 output channels, 16x16 kernel.
    let projection = conv2d(3, 192, 16, c, vb.pp("projection")).unwrap();
    let xs = Tensor::ones(&[1, 3, 224, 224], DType::F32, &Device::Cpu).unwrap();
    let out = projection.forward(&xs).unwrap();
    // Panics here, in the backward pass.
    out.backward().unwrap();
}
```
This panics on the last unwrap, with a backtrace leading to here.
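For context, a quick back-of-the-envelope check (mine, not from the backtrace): the forward pass itself is consistent. With the usual output-size formula,

$$ \text{out} = \left\lfloor \frac{\text{in} + 2\,\text{pad} - \text{dilation}\,(k - 1) - 1}{\text{stride}} \right\rfloor + 1 = \left\lfloor \frac{224 - 16}{16} \right\rfloor + 1 = 14, $$

so `out` has shape `[1, 192, 14, 14]`, and it's only the backward pass that produces mismatched shapes.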
Thanks for reporting this. Just to mention, I think what makes it trip is the stride being different from 1. I'm looking at how to fix it, though I'm afraid this will require some dilated convolutions that we don't support at the moment.
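For anyone following along, here is a rough 1-D sketch of why a stride pulls in dilation (my understanding, not a formal derivation). With stride $s$ the forward pass is

$$ y_i = \sum_k w_k \, x_{s i + k}, $$

so the gradients are

$$ \frac{\partial L}{\partial w_k} = \sum_i \frac{\partial L}{\partial y_i} \, x_{s i + k}, \qquad \frac{\partial L}{\partial x_j} = \sum_{s i + k = j} \frac{\partial L}{\partial y_i} \, w_k. $$

For a fixed $k$, the weight gradient reads the input at positions spaced $s$ apart, which is exactly a convolution whose taps are dilated by $s$; the input gradient is the matching transposed (fractionally strided) convolution. With $s = 1$ both collapse to plain convolutions, which is why the stride-1 case already worked.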
Just merged some changes that add dilated convolutions; this should get your example to work, though I haven't checked that the actual gradients are correct (I did add a specific test that verifies that a smaller conv2d variant generates the same gradients as PyTorch). It's all brand new, so certainly take all the outputs with a pinch of salt and let us know if you see anything unexpected (it's pretty likely that you will run into other shape issues, but hopefully it will only take a couple of iterations).
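In case it's useful to anyone writing similar checks, here is a minimal shape-level sketch of such a test (hypothetical code of mine, not the test from the repo, and it only checks gradient shapes rather than comparing values against PyTorch; it assumes `Var` and `Tensor::conv2d(kernel, padding, stride, dilation, groups)` from candle_core):

```rust
use candle_core::{DType, Device, Result, Var};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    // Small strided conv: 2 input channels, 4 output channels, 3x3 kernel.
    let x = Var::ones((1, 2, 8, 8), DType::F32, &dev)?;
    let w = Var::ones((4, 2, 3, 3), DType::F32, &dev)?;
    // Tensor::conv2d(kernel, padding, stride, dilation, groups); stride 2 here.
    let y = x.conv2d(&w, 0, 2, 1, 1)?;
    // Reduce to a scalar loss and backprop.
    let loss = y.sum_all()?;
    let grads = loss.backward()?;
    // Each gradient must have the same shape as the parameter it refers to.
    let gx = grads.get(&x).expect("no grad for x");
    let gw = grads.get(&w).expect("no grad for w");
    assert_eq!(gx.dims(), x.dims());
    assert_eq!(gw.dims(), w.dims());
    println!("grad shapes ok: {:?} {:?}", gx.dims(), gw.dims());
    Ok(())
}
```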
No issues thus far and the loss is going down. Thank you for your hard work 🙏
Yay, thanks for testing! Closing this for now but feel free to re-open or open another issue if you spot anything.
Is there a plan to add backprop support for conv1d? It panics here. Thank you.
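For reference, a minimal repro along the same lines as the conv2d example above (my own sketch, assuming `candle_nn::conv1d` and `Conv1dConfig` mirror the 2-D API; the stride of 4 is arbitrary):

```rust
use candle_core::{DType, Device, Tensor};
use candle_nn::{conv1d, Conv1dConfig, Module, VarBuilder, VarMap};

fn main() {
    let var_map = VarMap::new();
    let vb = VarBuilder::from_varmap(&var_map, DType::F32, &Device::Cpu);
    let c = Conv1dConfig {
        stride: 4,
        ..Default::default()
    };
    // 3 input channels, 8 output channels, kernel size 4.
    let proj = conv1d(3, 8, 4, c, vb.pp("proj")).unwrap();
    // conv1d input is (batch, channels, length).
    let xs = Tensor::ones(&[1, 3, 64], DType::F32, &Device::Cpu).unwrap();
    let out = proj.forward(&xs).unwrap();
    // Backward pass for conv1d.
    out.backward().unwrap();
}
```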
It would be nice to have backprop support for conv2d (and conv1d) so as to be able to train convnet architectures.