Closed phil329 closed 1 month ago
I simply added more cases to the code, and it works.
@torch.library.impl("quanto::qbytes_mm", "CUDA")
def qbytes_mm_impl_cuda(activations: torch.Tensor, weights: torch.Tensor, output_scales: torch.Tensor) -> torch.Tensor:
    assert activations.ndim in (2, 3, 4)
    in_features = activations.shape[-1]
    if activations.ndim == 2:
        tokens = activations.shape[0]
    elif activations.ndim == 3:
        tokens = activations.shape[0] * activations.shape[1]
    elif activations.ndim == 4:
        tokens = activations.shape[0] * activations.shape[1] * activations.shape[2]
    # original tokens
    # tokens = activations.shape[0] if activations.ndim == 2 else activations.shape[0] * activations.shape[1]
    ......
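Each branch above multiplies every dimension except the last, so the same count could be computed for any number of leading dimensions. A minimal sketch of that idea follows; `count_tokens` is a hypothetical helper for illustration, not part of quanto, and it works on a plain shape tuple so it runs without torch:

```python
import math


def count_tokens(shape):
    """Return the product of all dimensions except the last.

    Hypothetical helper (not a quanto function): the last dimension is
    treated as in_features, and every leading dimension is flattened
    into a single token axis.
    """
    if len(shape) < 2:
        raise ValueError("expected at least 2 dimensions")
    return math.prod(shape[:-1])
```

Inside the kernel wrapper this would presumably be called as `count_tokens(tuple(activations.shape))`, which gives the same result as the explicit branches for 2, 3, and 4 dimensions.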
I get an AssertionError when I run the following code. qbytes_mm requires activations with ndim of 2 or 3, but the activations have ndim=4 during inference with LuminaText2ImgPipeline.
The primary logs are as follows:
The AssertionError happens at this line of code: