lazear opened this issue 1 month ago
I wasn't able to repro this on my local machine, but does using layout=torch.jagged work for you? I mention this because our support for jagged-layout nested tensors is much better than our support for strided nested tensors; the former work well with torch.compile.
Padded conversions recently landed for jagged-layout nested tensors in #125947, so if you're using a new enough PyTorch, you can still use to_padded_tensor(). That said, if you can avoid materializing padded tensors and stay in nested tensor land, memory usage and speed will generally be better.
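For reference, a minimal sketch of the jagged-layout path (the shapes here are illustrative, not taken from the original report):

```python
import torch

# Variable-length sequences, e.g. embeddings of shape (N_i, 1280)
tensor_list = [torch.randn(n, 1280) for n in (3, 7, 5)]

# Jagged-layout nested tensor (plays well with torch.compile)
nt = torch.nested.nested_tensor(tensor_list, layout=torch.jagged)

# Since #125947, jagged-layout nested tensors support padded conversion
padded = nt.to_padded_tensor(padding=0.0)
print(padded.shape)  # torch.Size([3, 7, 1280])
```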
🐛 Describe the bug
Hi,
Not sure if I'm using nested tensors incorrectly, but I would like to pad some variable-length sequences and feed the resulting padded tensor into a DataLoader.
This approach worked great while developing a model, but after scaling up the input data I was hit with the following issue:
All of the input data is well-formed and properly typed (tensors of size (N, 1280)). The negative dimension in the error corresponds to the total number of elements across the tensor_list (N items × dim 0 × dim 1) overflowing in a u32 -> i32 cast.
Code to reproduce
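A hedged sketch of what a repro presumably looks like, assuming the overflow is triggered once the summed element count exceeds 2**31 - 1 (the concrete counts below are illustrative, not from the report, and the allocation is large):

```python
import torch

# Hypothetical repro: enough (N, 1280) tensors that the summed element
# count exceeds 2**31 - 1, so a u32 -> i32 cast would go negative.
# WARNING: allocates roughly 8.7 GB of float32 data.
n_rows = 1700   # rows per sequence
n_seqs = 1000   # 1000 * 1700 * 1280 = 2,176,000,000 > 2**31 - 1
tensor_list = [torch.randn(n_rows, 1280) for _ in range(n_seqs)]

nt = torch.nested.nested_tensor(tensor_list)  # default strided layout
padded = nt.to_padded_tensor(0.0)             # negative-dimension error expected here
```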
Versions
torch==2.4.1+cu124
cc @cpuhrsch @jbschlosser @bhosmer @drisspg @soulitzer @davidberard98 @YuqingJ