Open DeliMm opened 5 months ago
Maybe there is a mistake in the conv's gradient function with the following place
I have a test case with :
- A's shape: (1, 16, 16, 1)
- B's shape: (7, 7, 1, 1)
- stride=3 & padding=2
When I test the above case, there are some errors with
tests/hw4/test_conv.py::test_op_conv[backward-needle.backend_ndarray.ndarray_backend_cuda-Z_shape16-W_shape16-1-0] - ValueError: operands could not be broadcast together with shapes (1,16,16,1) (1,17,17,1)
This function may not have considered the situation of a large convolution kernel during implementation, especially when (H−2p+k)(mods)≠0.
May I ask if it‘s possible to check such case and carry out improvement/perfection? Thanks for your reply.
@PKUFlyingPig @DeliMm Hello! I think the caculation of OFM's size is wrong: https://github.com/PKUFlyingPig/CMU10-714/blob/99180c4ad3986f64e09c4dbe15ff1fb2c113ce63/homework/hw4/python/needle/ops.py#L595 As the pytorch doc says: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d The right size is: (H + 2 P - K) // self.stride + 1* So the caculation of conv gradient has also some problems, like X.grad conv padding
Maybe there is a mistake in the conv's gradient function with the following place https://github.com/PKUFlyingPig/CMU10-714/blob/99180c4ad3986f64e09c4dbe15ff1fb2c113ce63/homework/hw4/python/needle/ops.py#L613
I have a test case with :
When I test the above case, there are some errors with
This function may not have considered the situation of a large convolution kernel during implementation, especially when $(H - 2p + k) \pmod s \ne 0$.
May I ask if it‘s possible to check such case and carry out improvement/perfection? Thanks for your reply.