This pull request includes changes to the bitblas library and its associated tests. The most significant changes include enabling debug output in QuickStart.md, modifying the forward method in python/bitblas/module/__init__.py and python/bitblas/ops/general_matmul_splitk.py, and adjusting the testing scripts testing/python/operators/test_general_matmul_fp8.py and testing/python/operators/test_general_matmul_splitk_ops.py.
Debug output:
docs/QuickStart.md: Enabled debug output in three examples using bitblas.set_debug_level("Debug"). [1][2][3]
Codebase modifications:
python/bitblas/module/__init__.py: Modified the forward method to include a stream variable and a stream_handle variable, which is passed to the lib.call method.
python/bitblas/ops/general_matmul_splitk.py: Adjusted the forward method to change the shape of the output tensor, create a new sk_output tensor, and use the torch.sum method to populate the output tensor. [1][2][3]
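The split-K flow described for general_matmul_splitk.py can be sketched as follows. This is a minimal, dependency-free illustration rather than the BitBLAS implementation: plain nested lists stand in for tensors, the helper names `matmul` and `split_k_matmul` are hypothetical, and the sketch assumes that K divides evenly by the split factor. The `sk_output` list plays the role of the intermediate tensor with a leading split axis, and the final reduction corresponds to the `torch.sum` step mentioned above.

```python
def matmul(a, b):
    """Reference (M x K) @ (K x N) product on nested lists."""
    k_total, n = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(k_total)) for j in range(n)]
            for row in a]


def split_k_matmul(a, b, split_k):
    """Compute a @ b by splitting the K dimension into `split_k` slices."""
    k_total = len(b)
    if k_total % split_k != 0:
        raise ValueError("this sketch assumes K is divisible by split_k")
    chunk = k_total // split_k

    # sk_output has logical shape (split_k, M, N): one partial product per K slice.
    sk_output = []
    for s in range(split_k):
        lo, hi = s * chunk, (s + 1) * chunk
        a_slice = [row[lo:hi] for row in a]  # (M, chunk)
        b_slice = b[lo:hi]                   # (chunk, N)
        sk_output.append(matmul(a_slice, b_slice))

    # Reduce over the leading split axis to obtain the final (M, N) output.
    m, n = len(a), len(b[0])
    return [[sum(sk_output[s][i][j] for s in range(split_k)) for j in range(n)]
            for i in range(m)]


a = [[1, 2, 3, 4], [5, 6, 7, 8]]
b = [[1, 0], [0, 1], [2, 3], [4, 5]]
result = split_k_matmul(a, b, split_k=2)
```

Because each K slice is independent, the partial products can be computed in parallel and only the final sum needs to see all of them, which is why the intermediate buffer carries the extra leading dimension.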
Testing script adjustments:
testing/python/operators/test_general_matmul_fp8.py: Commented out the call to bitblas.testing.main() and added a call to test_matmul_torch_forward_weight_dequantize.
testing/python/operators/test_general_matmul_splitk_ops.py: Made several changes to the test methods, including adding a SplitK parameter, replacing the get_codegen_result method with a comparison of output_bitblas and output_torch, and adding a map_torch_type method to map input types to torch types. [1][2][3]
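A type-mapping helper of the kind described for the split-K tests might look like the sketch below. The alias table is an assumption (TVM-style FP8 names on the left, PyTorch attribute names on the right), and the split into `torch_dtype_name` plus `map_torch_type` is a hypothetical arrangement, not necessarily how the PR structures it. The torch import is deferred so that the name resolution can be exercised without PyTorch installed.

```python
# Assumed aliases: dtype strings that do not match a torch attribute name directly.
FP8_ALIASES = {
    "e4m3_float8": "float8_e4m3fn",
    "e5m2_float8": "float8_e5m2",
}


def torch_dtype_name(intype: str) -> str:
    """Return the torch attribute name for a dtype string, e.g. 'float16'."""
    return FP8_ALIASES.get(intype, intype)


def map_torch_type(intype: str):
    """Resolve a dtype string to the corresponding torch dtype object."""
    import torch  # deferred: only needed when a real dtype object is requested

    return getattr(torch, torch_dtype_name(intype))
```

A test could then create its reference tensors with `map_torch_type("float16")` and compare `output_bitblas` against `output_torch` using a tolerance-based check such as `torch.testing.assert_close`, with tolerances chosen for the dtype under test.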