Open chenqianfzh opened 5 months ago
Hi @chenqianfzh,
Thanks for reporting this issue. I believe that I see the same behavior with the 8bit version, matmul, as well.
For the 4bit version, this may work as expected when taking the gemv path, i.e. inputs where a is a 1D tensor with size divisible by the selected blocksize.
Thanks for checking it out.
I am using this function in model inference, so a needs to be a high-dimensional tensor. :-(
Is there a fix planned for this anytime soon @matthewdouglas?
System Info
Linux 20.04
Reproduction
output is defined as a tensor. The following works as expected:
output = matmul_4bit(a, b)
but the following does not; the elements of output are left unchanged:
matmul_4bit(a, b, out=output)
Expected behavior
matmul_4bit(a, b, out=output)
is expected to write the result into output. This is the preferred form, as it avoids an extra memory copy.
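Until out= is honored, one possible stopgap is to take the return-value path, which is reported above to work, and copy the result into the preallocated tensor by hand. The sketch below is only an illustration: matmul_into is a hypothetical helper name, not part of the library, and it assumes that output exposes an in-place copy_ method the way torch.Tensor does. Any extra keyword arguments are forwarded unchanged to the matmul function.

```python
def matmul_into(matmul_fn, a, b, out, **kwargs):
    """Hypothetical workaround helper (not a library function).

    Calls matmul_fn(a, b, **kwargs) WITHOUT passing `out`, then copies the
    returned result into the caller's preallocated buffer.

    Assumption: `out` is tensor-like and has an in-place `copy_` method,
    as torch.Tensor does.
    """
    # Return-value path: the function allocates and returns a fresh result.
    result = matmul_fn(a, b, **kwargs)
    # Explicit copy into the caller's buffer. This is exactly the extra
    # memory copy the issue wants to avoid, so it is a stopgap only.
    out.copy_(result)
    return out
```

With this in place, a call such as matmul_into(matmul_4bit, a, b, output) would leave output populated, at the cost of one temporary allocation and one copy per call. It does not replace a proper fix for out=.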