The indices produced by argwhere and squeeze cause the wrong output shape to be inferred for _op.adv_index.
For example, given a tensor data with shape (1, 5033, 1, 3798) and a boolean mask valid_mask with shape (1, 5033):
Expected behavior
The shape of data[valid_mask] should be inferred as (?, 1, 3798).
Same as the behavior of data[valid_mask] in PyTorch
Same as the behavior of data[torch.squeeze(torch.argwhere(valid_mask), [1]).tolist()] in PyTorch
Actual behavior
The actual shape is inferred to be (?, 2, 5033, 1, 3798)
Same as the behavior of data[torch.squeeze(torch.argwhere(valid_mask), [1])]
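The shape difference can be reproduced on a small scale. The following is a minimal sketch, assuming NumPy as a stand-in for the indexing semantics discussed here (it is not the TVM frontend code). The small shapes and the use of `np.take(..., mode="clip")` are assumptions, chosen only so that the out-of-range indices do not raise an error.

```python
import numpy as np

# Small stand-in shapes: data is (1, 5, 1, 4), valid_mask is (1, 5).
data = np.arange(1 * 5 * 1 * 4, dtype=np.float32).reshape(1, 5, 1, 4)
valid_mask = np.array([[True, False, True, True, False]])

# Boolean-mask indexing consumes the first two axes of data.
expected = data[valid_mask]

# argwhere returns one (row, col) pair per True entry: shape (N, 2).
idx = np.argwhere(valid_mask)

# Treating idx as a single index tensor only indexes axis 0, so the result
# shape is idx.shape + data.shape[1:]. mode="clip" is used here only to
# avoid the out-of-bounds error NumPy would otherwise raise.
wrong = np.take(data, idx, axis=0, mode="clip")

# Passing one index vector per masked axis reproduces data[valid_mask].
fixed = data[idx[:, 0], idx[:, 1]]

print(expected.shape, idx.shape, wrong.shape, fixed.shape)
```

With N selected entries, the single index tensor yields a shape of the form (N, 2, ...), which matches the pattern of the shape reported above, while splitting the index columns yields (N, 1, 4), the analogue of the expected (?, 1, 3798).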
Environment
PyTorch 1.13
tvm >= 1.11.0
Steps to reproduce
A PyTorch example, only to illustrate the equivalent behavior:
import torch

# Run in PyTorch 2.0, where `argwhere` is available
data = torch.rand(1, 5033, 1, 3798).cuda()
valid_mask = torch.rand(1, 5033).cuda() > 0.3
masked_data0 = data[valid_mask]
masked_data1 = data[torch.squeeze(torch.argwhere(valid_mask), [1]).tolist()]
masked_data2 = data[torch.squeeze(torch.argwhere(valid_mask), [1])]
print("shape of data[valid_mask]", masked_data0.shape)
print("shape of data[torch.squeeze(torch.argwhere(valid_mask), [1]).tolist()]", masked_data1.shape)
print("shape of data[torch.squeeze(torch.argwhere(valid_mask), [1])]", masked_data2.shape)
The example for the TVM frontend:
import torch
import torch.nn as nn
import tvm.relay as relay


class Demo(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x, y):
        x = torch.clamp(x, 0.2, 0.8)
        mask = y[:, :, 0, 0]  # boolean mask of shape (1, 5033)
        input = x[mask]  # boolean-mask indexing (aten::index)
        ret = torch.squeeze(input, 1)
        return ret


if __name__ == "__main__":
    shape_list = [("x", (1, 5033, 1, 3798)), ("y", (1, 5033, 1, 1))]
    model = Demo().cuda().eval()
    inputs = []
    inputs.append(torch.rand(shape_list[0][1]).cuda())
    inputs.append(torch.rand(shape_list[1][1]).cuda() > 0.3)
Add some logging to the function convert_operators in python/tvm/relay/frontend/pytorch.py. The shape of relay_out ((?, 2, 5033, 1, 3798)) will be shown in the log.
Hi,
have you solved this problem?
When doing data[valid_mask] = value_tensor (index_put) in PyTorch, TVM uses scatter_nd.
I have run into a similar problem.
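For the masked-assignment case, the same idea of explicit coordinates applies. Below is a minimal sketch, assuming NumPy as a stand-in; `masked_assign` is a hypothetical helper written for illustration and is not TVM's scatter_nd or the frontend's index_put conversion.

```python
import numpy as np


def masked_assign(data, mask, values):
    """Emulate data[mask] = values using explicit coordinates (scatter-style)."""
    coords = np.argwhere(mask)  # shape (N, mask.ndim)
    out = data.copy()
    # Transpose to (mask.ndim, N) and unpack: one index vector per masked axis.
    out[tuple(coords.T)] = values
    return out


data = np.zeros((1, 5, 1, 4), dtype=np.float32)
mask = np.array([[True, False, True, True, False]])
values = np.ones((3, 1, 4), dtype=np.float32)

# Reference result from direct boolean-mask assignment.
reference = data.copy()
reference[mask] = values

result = masked_assign(data, mask, values)
print(np.array_equal(result, reference))
```

The point of the sketch is that the coordinates from argwhere have to be supplied per axis; passing the (N, 2) array as a single index would address only the first axis.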
aten::index doesn't work properly with a boolean mask when the boolean mask is of type tvm.relay.Call:
indices_list.append(_op.squeeze(_op.transform.argwhere(inp), axis=[1]))
Triage
cc @shingjan @yelite