ggml_conv_2d input shape

ggerganov / ggml

Tensor library for machine learning

MIT License

10.93k stars 1k forks source link

ggml_conv_2d input shape #954

Open StudyingLover opened 3 weeks ago

StudyingLover commented 3 weeks ago

I am a freshman about GGML, when I read the source code, I notice that the function struct ggml_tensor * ggml_conv_2d

https://github.com/ggerganov/ggml/blob/10e83a412717c20d57ba19f025248e18e43addf3/src/ggml.c#L6889

from the comment, wo can know the input shape of a is [OC，IC, KH, KW] and shape of b is [N, IC, IH, IW] ,but in the function ggml_im2col(ctx, a, b, s0, s1, p0, p1, d0, d1, true, a->type) https://github.com/ggerganov/ggml/blob/10e83a412717c20d57ba19f025248e18e43addf3/src/ggml.c#L6819 we assert GGML_ASSERT(a->ne[2] == b->ne[2]);, which means KH== IH, this not right I think.

I’m not sure if I understood correctly. If I misunderstood, please correct me. If there is an error in the comments, I am willing to submit a PR to fix it.

bssrdf commented 3 weeks ago

The comments (for some reason) use Pytorch dimension convention which is opposite to what GGML has internally. a->ne[2] and b->ne[2] are both IC, the 3rd dimension ( 2nd in Pytorch).

StudyingLover commented 3 weeks ago

Okay, do we need a PR to note this detail?

---Original--- From: @.> Date: Mon, Sep 9, 2024 01:32 AM To: @.>; Cc: @.**@.>; Subject: Re: [ggerganov/ggml] ggml_conv_2d input shape (Issue #954)

The comments (for some reason) use Pytorch dimension convention which is opposite to what GGML has internally. a->ne[2] and b->ne[2] are both IC, the 3rd dimension ( 2nd in Pytorch).

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.Message ID: @.***>

JohannesGaessler commented 2 weeks ago

When I recently worked on IM2COL I didn't look at or try to understand the comments but the dimensions of a and b for a two-dimensional convolution should be in order: width, height, number of input channels, and batch size (I'm not 100% sure about the order of the first two). In examples/mnist-common.cpp there are asserts that let you infer at least the externally visible tensor shapes.

StudyingLover commented 2 weeks ago

yes,I used examples/mnist-common.cpp to help with understanding, but I believe that having correct comments is also very important.