ROCm / bitsandbytes

8-bit CUDA functions for PyTorch
MIT License

Remove blocksize 64 for quant/dequant functions #10

Closed. pnunna93 closed this 6 months ago

pnunna93 commented 6 months ago

This PR removes blocksize 64 from the quantize and dequantize functions, since the ROCm warp size of 64 does not support that case.
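A minimal sketch of what this change amounts to on the backend side. The function and constant names here are illustrative assumptions, not the actual bitsandbytes API: the idea is simply that on ROCm, where the wavefront (warp) is 64 threads rather than CUDA's 32, the smallest blocksize is dropped from the supported list.

```python
# Hypothetical guard illustrating the blocksize restriction; names are
# assumptions for this sketch, not real bitsandbytes identifiers.
ROCM_WARP_SIZE = 64  # AMD wavefronts are 64 threads (CUDA warps are 32)

# Blocksizes accepted on CUDA builds, including 64.
CUDA_BLOCKSIZES = [4096, 2048, 1024, 512, 256, 128, 64]

def supported_blocksizes(is_rocm: bool) -> list[int]:
    """Return the quantization blocksizes valid for the current backend."""
    if is_rocm:
        # Drop blocksize 64: it equals the ROCm warp size, which the
        # blockwise quant/dequant kernels cannot handle.
        return [b for b in CUDA_BLOCKSIZES if b > ROCM_WARP_SIZE]
    return list(CUDA_BLOCKSIZES)
```

On a CUDA build the full list is returned unchanged; on a ROCm build the same call yields the list without 64.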

It also skips that case in the tests that use the quantize/dequantize functions. These are the tests enabled with this PR:

- test_autograd.py::test_matmul_fp8
- test_functional.py::test_dynamic_blockwise_quantization
- test_functional.py::test_4bit_compressed_stats
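The test-side skip could look like the sketch below. This is a hedged illustration, not the PR's actual diff: the `HIP_ENVIRONMENT` flag, the `should_skip` helper, and the test body are assumptions standing in for the real test code.

```python
# Hypothetical sketch of skipping blocksize 64 in a parametrized test
# on ROCm; identifiers here are illustrative, not the real test code.
import pytest

HIP_ENVIRONMENT = True  # assumption: pretend this is a ROCm build

def should_skip(blocksize: int) -> bool:
    # Blocksize 64 equals the ROCm wavefront size and is unsupported there.
    return HIP_ENVIRONMENT and blocksize == 64

@pytest.mark.parametrize("blocksize", [4096, 2048, 1024, 512, 256, 128, 64])
def test_dynamic_blockwise_quantization(blocksize):
    if should_skip(blocksize):
        pytest.skip("blocksize 64 is unsupported on ROCm")
    # ...the actual quantize/dequantize round-trip checks would run here
```

With this guard, the parametrized case `blocksize=64` reports as skipped on ROCm instead of failing, while all larger blocksizes still run.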