**Open** · sabetAI opened this issue 1 year ago
I can reproduce this error with non-contiguous input:

```python
import torch
import transformer_engine.pytorch as te

model = te.LayerNorm(4)
x = torch.randn([4, 4, 4, 4], device="cuda")
x = x.contiguous(memory_format=torch.channels_last)
x.requires_grad_(True)
y = model(x)
```
I think it would be reasonable for the TE modules to coerce their inputs to be contiguous, to be on the GPU, and to have the expected dtypes.
Even with this support, be advised that non-contiguous data should still be avoided in TE. TE kernels are mostly written for contiguous data and I've found that PyTorch's reordering kernel is slow.
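The coercion idea could be sketched as a thin wrapper module. This is not TE's actual API, just an illustration of the suggestion above; `torch.nn.LayerNorm` stands in for `te.LayerNorm` so the sketch runs without a GPU, and `CoercingWrapper` is a hypothetical name:

```python
import torch

class CoercingWrapper(torch.nn.Module):
    """Hypothetical wrapper: coerces inputs before forwarding to a module."""

    def __init__(self, module, dtype=torch.float32):
        super().__init__()
        self.module = module
        self.dtype = dtype

    def forward(self, x):
        # Coerce layout, device, and dtype to what the wrapped kernels expect.
        device = next(self.module.parameters()).device
        x = x.contiguous().to(device=device, dtype=self.dtype)
        return self.module(x)

# A channels_last input is handled transparently: the wrapper makes it
# contiguous before the wrapped module ever sees it.
model = CoercingWrapper(torch.nn.LayerNorm(4))
x = torch.randn(4, 4, 4, 4).contiguous(memory_format=torch.channels_last)
y = model(x)
```

The cost is an extra copy whenever the input layout does not match, which is why the caveat above about avoiding non-contiguous data in the first place still applies.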
`transformer_engine.pytorch.module.layernorm.LayerNorm` calls `inputmat = inp.view((-1, in_features))`, which throws the error below. Using `reshape` fixes this error.
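The `view`-vs-`reshape` distinction can be reproduced on CPU without TE: `view` requires strides compatible with the new shape, while `reshape` falls back to a copy when they are not. A minimal sketch:

```python
import torch

# channels_last gives strides (64, 1, 16, 4) instead of the default
# (64, 16, 4, 1), so the tensor is not contiguous in the NCHW sense.
x = torch.randn(4, 4, 4, 4).contiguous(memory_format=torch.channels_last)
print(x.is_contiguous())  # False

try:
    x.view(-1, 4)  # fails: new shape incompatible with channels_last strides
except RuntimeError as e:
    print("view failed:", e)

y = x.reshape(-1, 4)  # succeeds by materializing a contiguous copy
print(y.shape)        # torch.Size([64, 4])
```

This is why swapping `view` for `reshape` in the LayerNorm module resolves the crash, at the price of a possible copy for non-contiguous inputs.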