This pull request bumps the package version and extends several functions in the bitblas Python package to handle additional cases. The version number has been updated in the VERSION file and in the __version__ variable in python/bitblas/__init__.py. The apply_config function in python/bitblas/gpu/gemv.py and the get_vectorize_factor function in python/bitblas/gpu/gemv_dequantize.py now handle the cases where block_info.iters and sch.get_loops(block_b), respectively, have length 4. In python/bitblas/wrapper/general.py, the legalize_c function now handles the case where dynamic_symbolic_set is not empty.
Here are the most important changes:
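Versioning:
VERSION: Updated version from 0.0.1.dev9 to 0.0.1.dev10.
python/bitblas/__init__.py: Updated __version__ from 0.0.1.dev9 to 0.0.1.dev10.
Handling of different cases:
python/bitblas/gpu/gemv.py: Updated the apply_config function to handle cases where the length of block_info.iters is 4.
python/bitblas/gpu/gemv_dequantize.py: Updated the get_vectorize_factor function to handle cases where the length of sch.get_loops(block_b) is 4, and added an assertion that SplitK supports only a 2D thread config. [1][2]
python/bitblas/wrapper/general.py: Updated the legalize_c function to handle cases where dynamic_symbolic_set is not empty.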
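To make the length-4 case concrete, here is a minimal sketch of how a function might accept a loop list of length 3 or 4 and how a 2D thread config assertion for SplitK could look. It is not the code from this pull request: the helper names unpack_gemv_loops and check_splitk_thread_config are hypothetical, loops are modelled as plain strings instead of TVM loop handles, and reading the extra fourth entry as a leading batch loop is an assumption.

```python
from typing import Optional, Sequence, Tuple


def unpack_gemv_loops(loops: Sequence[str]) -> Tuple[Optional[str], str, str, str]:
    """Split a loop list into (batch, spatial, reduction, inner).

    Hypothetical helper: the real code operates on TVM loop handles, and
    treating the extra loop as a leading batch axis is an assumption.
    """
    if len(loops) == 4:
        batch, s, r, c = loops
        return batch, s, r, c
    if len(loops) == 3:
        s, r, c = loops
        return None, s, r, c
    raise ValueError(f"expected 3 or 4 loops, got {len(loops)}")


def check_splitk_thread_config(thread_config: Sequence[int]) -> None:
    """Hypothetical guard mirroring the assertion described above."""
    assert len(thread_config) == 2, "SplitK only supports a 2D thread config"


# Both loop shapes go through the same downstream code path.
print(unpack_gemv_loops(["s", "r", "c"]))
print(unpack_gemv_loops(["b", "s", "r", "c"]))
check_splitk_thread_config([8, 16])
```

Returning None for the missing batch loop lets callers keep a single code path and branch only where the batch loop is actually used.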
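Because the version string lives in two places (VERSION and __version__), a small consistency check can catch a bump that updates only one of them. The sketch below is illustrative and not part of this pull request: parse_dev_version and versions_in_sync are hypothetical helpers, and the parser assumes the X.Y.Z.devN shape used by the versions named above. It also shows why dev versions should be compared numerically: as plain strings, 0.0.1.dev10 sorts before 0.0.1.dev9.

```python
import re
from typing import Tuple

# Matches versions of the form X.Y.Z.devN, e.g. 0.0.1.dev10 (assumed shape).
_DEV_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.dev(\d+)$")


def parse_dev_version(version: str) -> Tuple[int, int, int, int]:
    """Parse 'X.Y.Z.devN' into a tuple of integers that compares numerically."""
    match = _DEV_VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, dev = (int(part) for part in match.groups())
    return major, minor, patch, dev


def versions_in_sync(version_file_text: str, module_version: str) -> bool:
    """Return True when the VERSION file content equals __version__."""
    return version_file_text.strip() == module_version.strip()


old, new = "0.0.1.dev9", "0.0.1.dev10"
print(parse_dev_version(new) > parse_dev_version(old))  # numeric comparison
print(new > old)  # lexicographic string comparison
print(versions_in_sync("0.0.1.dev10\n", "0.0.1.dev10"))
```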