jax-ml / jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
http://jax.readthedocs.io/
Apache License 2.0
30.59k stars 2.82k forks

[Mosaic TPU] Support packed type matmul with arbitrary shapes. #25068

Open copybara-service[bot] opened 3 days ago

copybara-service[bot] commented 3 days ago

[Mosaic TPU] Support packed type matmul with arbitrary shapes.

Only sub-elements along the contracting dimension need to be masked out. Instead of unpacking the data and applying masks, we create a VREG-sized i32 "mask" that encodes the sub-element mask and combine it with the target vreg using a logical AND. This way, masking sub-elements costs each target vreg 1 op (logical_and) instead of 3 ops (unpacking + select + packing).
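The idea can be sketched in plain Python as a toy model. This is not Mosaic's implementation: the "vreg" here is just a list of 32-bit words, each assumed to pack two 16-bit sub-elements with sub-element 0 in the low half, and every helper name is hypothetical. The example values are bf16 bit patterns, with a NaN (0x7FC0) standing in for garbage in the padding. Padding along the contracting dimension has to be zeroed because it would otherwise be summed into valid outputs.

```python
# Toy model only. The packing layout (two 16-bit sub-elements per 32-bit word,
# sub-element 0 in the low half) and all helper names are assumptions made for
# illustration; they are not taken from the Mosaic TPU source.

WORD_BITS = 32
PACKING = 2                      # sub-elements per 32-bit word
SUB_BITS = WORD_BITS // PACKING  # 16 bits per sub-element
SUB_MASK = (1 << SUB_BITS) - 1   # 0xFFFF


def pack(subelements):
    """Pack PACKING 16-bit values into one 32-bit word."""
    word = 0
    for i, value in enumerate(subelements):
        word |= (value & SUB_MASK) << (i * SUB_BITS)
    return word


def unpack(word):
    """Split one 32-bit word back into its 16-bit sub-elements."""
    return [(word >> (i * SUB_BITS)) & SUB_MASK for i in range(PACKING)]


def build_mask_word(valid_subelements):
    """Mask word: all ones for valid sub-elements, zeros for padding."""
    return pack([SUB_MASK if i < valid_subelements else 0
                 for i in range(PACKING)])


def mask_with_and(vreg, mask_vreg):
    """Single-op path: one bitwise AND per word."""
    return [word & mask for word, mask in zip(vreg, mask_vreg)]


def mask_with_unpack(vreg, valid_counts):
    """Three-step path: unpack, select, pack."""
    out = []
    for word, valid in zip(vreg, valid_counts):
        subs = unpack(word)                                         # unpack
        subs = [s if i < valid else 0 for i, s in enumerate(subs)]  # select
        out.append(pack(subs))                                      # pack
    return out


# bf16 bit patterns: 1.0, 2.0, 3.0, and a NaN sitting in the padding slot.
vreg = [pack([0x3F80, 0x4000]), pack([0x4040, 0x7FC0])]
valid_counts = [2, 1]  # the second word has only one valid sub-element
mask_vreg = [build_mask_word(n) for n in valid_counts]

# Both paths produce the same words; the AND path touches each word once.
assert mask_with_and(vreg, mask_vreg) == mask_with_unpack(vreg, valid_counts)
```

In this model the mask depends only on the shape, so it can be built once and reused for every vreg that shares the same padding pattern.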