ggerganov / ggml

Tensor library for machine learning
MIT License

CUDA: fix 1D im2col, add tests #993

Closed by JohannesGaessler 1 month ago

JohannesGaessler commented 1 month ago

Fixes https://github.com/ggerganov/ggml/issues/991.

The problem is that the batch size is stored in ne[3] for 2D im2col but in ne[2] for 1D im2col. The CUDA code on master always reads the batch size from ne[3], so the result is incorrect for 1D im2col whenever the batch size is greater than 1.
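
To illustrate the mismatch, here is a minimal sketch in plain C. It is not the actual ggml or CUDA code: the helper names `im2col_batch_size` and `im2col_batch_size_buggy` are hypothetical, and the shape layouts in the comments are assumptions (a 4-element shape array whose unused trailing dimensions are 1).

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper (not part of ggml): pick the batch size of the
// im2col input from its 4-element shape array.
// Assumed layouts:
//   2D input: ne = {IW, IH, IC, N}  -> batch size in ne[3]
//   1D input: ne = {IW, IC, N, 1}   -> batch size in ne[2]
static int64_t im2col_batch_size(const int64_t ne[4], bool is_2D) {
    return is_2D ? ne[3] : ne[2];
}

// Behavior described for master: the batch size is always read from ne[3].
// For a 1D input this returns the unused trailing dimension instead.
static int64_t im2col_batch_size_buggy(const int64_t ne[4]) {
    return ne[3];
}
```

Under these assumptions, a 1D input with shape {16, 3, 4, 1} has batch size 4, but reading ne[3] yields 1, so only the first batch element would be processed. For a 2D input both helpers agree, which would explain why only the 1D case was affected.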