msaroufim opened this issue 3 months ago
The `tensor_core_tiled` layout means a layout optimized for the tensor core int4 tinygemm kernels used by `AffineQuantizedTensor`. This is how it's used: https://github.com/pytorch/ao/blob/aeee551b15eebeaabf98ffab9a00addc675a12a9/torchao/quantization/quant_api.py#L375. Note that `TensorCoreTiledAQTLayout` is not a top level API.
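For intuition, here is an illustrative sketch only (an assumption, not the actual memory layout produced by the tinygemm kernels): "tensor core tiled" refers to rearranging the weight matrix into small contiguous tiles sized for tensor-core matrix fragments, so each fragment can be loaded with one contiguous read. The tile sizes below are made up for illustration:

```python
import torch

def tile_weight(w: torch.Tensor, tile_n: int = 8, tile_k: int = 16) -> torch.Tensor:
    """Toy model of a tensor-core-friendly tiling (illustrative shapes only).

    Rearranges a [n, k] weight into [n // tile_n, k // tile_k, tile_n, tile_k]
    so that each (tile_n, tile_k) tile is contiguous in memory.
    """
    n, k = w.shape
    assert n % tile_n == 0 and k % tile_k == 0
    # [n, k] -> [n // tile_n, tile_n, k // tile_k, tile_k] -> group tiles first
    return (w.reshape(n // tile_n, tile_n, k // tile_k, tile_k)
             .permute(0, 2, 1, 3)
             .contiguous())

w = torch.arange(32 * 64).reshape(32, 64)
tiled = tile_weight(w)
# Each [8, 16] tile is now one contiguous block
assert tiled.shape == (4, 4, 8, 16)
assert torch.equal(tiled[0, 0], w[:8, :16])
```

The real packed format (produced by `torch.ops.aten._convert_weight_to_int4pack`) additionally packs int4 values into int32 words, which this sketch does not attempt to model.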
Right now what we have is docstrings, but they could use work. This came up as @vayuda was looking at extending his bitpacking work to include a notion of scales.
```python
torch.ops.aten._weight_int4pack_mm(input_tensor.contiguous(), packed_weight, groupsize, scale_and_zero)
```
Two things are unclear from the current docstrings:
- why `scale_and_zero` is a single tensor rather than separate scale and zero-point tensors
- `innerKTiles` is never defined
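As a guess at the motivation (an assumption on my part, not documented behavior): fusing the per-group scales and zero points into one tensor lets the kernel fetch both values for a quantization group with a single contiguous load. A toy sketch of such a packing, with hypothetical helper names and illustrative shapes:

```python
import torch

def pack_scales_and_zeros(scales: torch.Tensor, zeros: torch.Tensor) -> torch.Tensor:
    """Hypothetical packing: interleave (scale, zero) pairs in the last dim.

    scales, zeros: [n_groups, out_features]
    returns:       [n_groups, out_features, 2], so scale and zero for a group
                   sit next to each other in memory.
    """
    return torch.stack([scales, zeros], dim=-1)

def unpack_scales_and_zeros(scale_and_zero: torch.Tensor):
    """Inverse of the packing above."""
    scales, zeros = scale_and_zero.unbind(dim=-1)
    return scales, zeros

scales = torch.rand(4, 8)
zeros = torch.rand(4, 8)
packed = pack_scales_and_zeros(scales, zeros)
assert packed.shape == (4, 8, 2)
s, z = unpack_scales_and_zeros(packed)
assert torch.equal(s, scales) and torch.equal(z, zeros)
```

Whatever the real layout is, the docstring should spell it out (shapes, dtype, and element order), since callers have to construct `scale_and_zero` themselves.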