idiap / fast-transformers

Pytorch library for fast transformer implementations

CUDA error: CUBLAS_STATUS_INVALID_VALUE #104

Closed huu4ontocord closed 3 years ago

huu4ontocord commented 3 years ago

This code, adapted from the example notebook by changing the heads, dimensions, etc., produces a CUDA error. Running in Colab:


from fast_transformers.builders import TransformerEncoderBuilder
from fast_transformers.masking import LengthMask, TriangularCausalMask
import torch

model = TransformerEncoderBuilder.from_kwargs(
  n_layers=12,
  n_heads=12,
  query_dimensions=64,
  value_dimensions=64,
  feed_forward_dimensions=3072,
  attention_type="improved-clustered",  # improved clustered attention
  clusters=20,
  activation="gelu"
).get()

x = torch.rand(
    10,  # batch size 
    100, # sequence length
    128  # feature dimensions
)
model = model.cuda()
print (model)
x = x.cuda()
y = model(x) # calling without masks which means attend to everything
print (y)

RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm(handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

angeloskath commented 3 years ago

Hi,

Indeed the error is cryptic, but the issue is that using 12 heads with 64 query/value dimensions per head means the feature dimension of x should be 12*64 = 768, not 128. If you change the feature dimension it should run fine.
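
The constraint can be checked before ever calling the model. A minimal sketch of the relationship (the helper name is hypothetical, not part of the library):

```python
def expected_feature_dim(n_heads, dims_per_head):
    # The builder splits the model dimension evenly across heads,
    # so the input's last dimension must be n_heads * dims_per_head.
    return n_heads * dims_per_head

# With the builder settings from the snippet above:
assert expected_feature_dim(12, 64) == 768

# The original input used 128 features, which does not match and
# surfaces as the cuBLAS error inside the attention matmul:
assert expected_feature_dim(12, 64) != 128
```

So either build the input as `torch.rand(10, 100, 768)`, or reduce the builder's heads/dimensions so their product equals 128 (e.g. `n_heads=2, query_dimensions=64, value_dimensions=64`).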

If you continue to encounter problems feel free to reopen the issue or open a new one.

Cheers, Angelos