Open balisujohn opened 4 months ago
It occurs to me `ggml_add` isn't a unary op, so I'd lean towards the `ggml_reduce` idea.
There is already `ggml_sum_rows()`. A `ggml_sum_dim()` should be possible to implement via `ggml_permute()` + `ggml_sum_rows()` + `ggml_reshape()`, I think, without having to write new kernels.
So I need to reduce a 4d tensor along a dimension with the operation addition. I can either add `ggml_add_ext`, which lets you specify a dimension for reduction, or I can add a new op, `ggml_reduce`, which lets you specify a dimension and an op as arguments (maybe +, /, -, * to start) and reduces along that dimension with that op. Which of these would be preferable?

In the meantime, I can implement this in tortoise.cpp with view slices and a for loop, but I think a built-in reduction op will probably be faster.
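
To pin down the semantics being asked for, here is a minimal plain-C reference sketch of reducing a contiguous 4d float tensor along one dimension with a chosen op. It does not use ggml at all: `reduce_4d` and `reduce_op` are hypothetical names for illustration, the layout assumption (`ne[0]` varies fastest) is mine, and only + and * are covered because their result does not depend on how the elements are grouped.

```c
#include <stddef.h>

/* Hypothetical op selector for illustration; not part of ggml. */
typedef enum { REDUCE_ADD, REDUCE_MUL } reduce_op;

/*
 * Reduce a contiguous 4-D float tensor along dimension `dim` (0..3).
 * Assumes ne[0] is the fastest-varying dimension and every ne[i] >= 1.
 * `dst` must have room for the input shape with ne[dim] replaced by 1.
 */
static void reduce_4d(const float *src, const size_t ne[4], int dim,
                      reduce_op op, float *dst) {
    /* Element strides of the contiguous input. */
    size_t stride[4];
    stride[0] = 1;
    for (int i = 1; i < 4; i++) {
        stride[i] = stride[i - 1] * ne[i - 1];
    }

    /* Output shape: same as the input, except the reduced dimension is 1. */
    size_t out_ne[4];
    for (int i = 0; i < 4; i++) {
        out_ne[i] = (i == dim) ? 1 : ne[i];
    }

    size_t o = 0; /* linear index into the contiguous output */
    for (size_t i3 = 0; i3 < out_ne[3]; i3++)
    for (size_t i2 = 0; i2 < out_ne[2]; i2++)
    for (size_t i1 = 0; i1 < out_ne[1]; i1++)
    for (size_t i0 = 0; i0 < out_ne[0]; i0++) {
        /* Offset of the first element of this slice (index 0 along `dim`). */
        const size_t base = i0 * stride[0] + i1 * stride[1]
                          + i2 * stride[2] + i3 * stride[3];
        float acc = src[base];
        /* Walk along `dim`, folding each element into the accumulator. */
        for (size_t k = 1; k < ne[dim]; k++) {
            const float v = src[base + k * stride[dim]];
            acc = (op == REDUCE_ADD) ? acc + v : acc * v;
        }
        dst[o++] = acc;
    }
}
```

For a tensor with shape `{2, 3, 1, 1}`, calling `reduce_4d(src, ne, 1, REDUCE_ADD, dst)` writes 2 floats to `dst`, one per position along dimension 0. A reference like this is mainly useful for checking the output of a faster built-in op.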