Closed pnunna93 closed 6 months ago
This PR removes the blocksize of 64 from the quantize and dequantize functions, as the ROCm warp size doesn't support that case.
It also skips that case in the tests that use the quantize/dequantize functions. These are the tests enabled with this PR:
test_autograd.py::test_matmul_fp8
test_functional.py::test_dynamic_blockwise_quantization
test_functional.py::test_4bit_compressed_stats
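
For illustration, the skip described above could look roughly like the sketch below. This is not the code from this PR: the helper name `supported_blocksizes`, the `is_rocm` flag, and the list of candidate blocksizes are all assumptions made for the example. In a real test module the flag would probably be derived from the build (for instance by checking whether `torch.version.hip` is set), and the result would feed a `pytest.mark.parametrize` decorator.

```python
# Hypothetical sketch: filter out blocksize 64 when running on ROCm.
# The candidate values below are assumed for illustration only.
CANDIDATE_BLOCKSIZES = [4096, 2048, 1024, 512, 256, 128, 64]

# Blocksize that this PR describes as unsupported by the ROCm warp size.
ROCM_UNSUPPORTED_BLOCKSIZE = 64


def supported_blocksizes(is_rocm, blocksizes=CANDIDATE_BLOCKSIZES):
    """Return the blocksizes a quantize/dequantize test should cover.

    is_rocm is a plain boolean supplied by the caller, so the helper
    stays independent of any particular GPU library.
    """
    if is_rocm:
        return [b for b in blocksizes if b != ROCM_UNSUPPORTED_BLOCKSIZE]
    return list(blocksizes)
```

With a helper like this, a test parametrized over `supported_blocksizes(is_rocm)` would never be generated for blocksize 64 on ROCm, while other backends would keep the full list.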