nod-ai / SHARK-ModelDev

Unified compiler/runtime for interfacing with PyTorch Dynamo.
Apache License 2.0

Gather #242

Closed vivekkhandelwal1 closed 4 months ago

vivekkhandelwal1 commented 11 months ago

https://github.com/llvm/torch-mlir/pull/2726

Shukla-Gaurav commented 11 months ago

Hi @stellaraccident, this is regarding onnx Gather op support. It is more general than the torch gather op. Onnx Gather op:


If axis = 0, let k = indices[i_{0}, ..., i_{q-1}], then output[i_{0}, ..., i_{q-1}, j_{0}, ..., j_{r-2}] = input[k, j_{0}, ..., j_{r-2}]

while the torch gather op requires the ranks of all tensors to be the same.
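
To make the difference concrete, a minimal illustration (the shapes and tensor names here are assumed for the example, not taken from the spec):

import torch

# ONNX Gather allows `indices` of any rank: with axis = 0 it behaves like
# advanced indexing along dim 0, so the indexed rows are gathered whole.
data = torch.arange(12.0).reshape(3, 4)       # shape (3, 4)
indices = torch.tensor([[0, 2], [1, 0]])      # shape (2, 2)
onnx_style = data[indices]                    # shape (2, 2, 4): output[i0, i1, j] = data[indices[i0, i1], j]

# torch.gather requires index.dim() == data.dim() and gathers element-wise:
# out[i, j] = data[index[i, j], j], so the result keeps the index shape (2, 2).
torch_style = torch.gather(data, 0, indices)  # shape (2, 2)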

I am still figuring out how to do this conversion; one possible algorithm I came up with looks like:

for iv_tensor in indices:  # iterating over indices
    x = torch.index_select(data, axis, indices[iv_tensor])
    torch.index_put(output, iv_tensor, x)  # output[iv_tensor] = x

But I am not sure how to implement it, or whether there is a better solution. Could you please guide me on that?
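
For reference, a rough loop-based sketch of the idea above (the shapes, names, and axis are assumed; written for clarity, not for how the lowering would actually be emitted):

import itertools
import torch

data = torch.randn(10, 11, 12)
indices = torch.randint(0, 11, (4, 5, 6))
axis = 1

# The output shape inserts the indices shape in place of the gathered dimension.
out_shape = data.shape[:axis] + indices.shape + data.shape[axis + 1:]
output = torch.empty(out_shape, dtype=data.dtype)

# Every position in `indices` selects one slice of `data` along `axis`,
# which becomes the corresponding slice of `output`.
for iv in itertools.product(*(range(s) for s in indices.shape)):
    x = torch.index_select(data, axis, indices[iv].reshape(1))   # shape (10, 1, 12)
    output[(slice(None),) * axis + iv] = x.squeeze(axis)         # output[:, iv, :] = x

assert torch.equal(output, data[:, indices, :])                  # sanity check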

stellaraccident commented 11 months ago

@rsuderman you have a lot of experience with these ops. Can you have a look?

rsuderman commented 11 months ago

So the restriction is a weird one. I cannot see a logical reason why the input and index rank need to be the same. First thought is to squeeze or expand index so that it has the same rank. This will not affect the computational complexity but will make it obey the restriction.
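
A minimal sketch of that rank-matching idea, assuming a rank-1 index gathered along axis 1 of a rank-3 data tensor (names and shapes are made up for illustration):

import torch

data = torch.randn(10, 11, 12)
index = torch.randint(0, 11, (4,))   # rank 1
axis = 1

# Give `index` singleton dims for every non-gather dimension, then broadcast
# so that index.dim() == data.dim(); the element count (and work) is unchanged.
index3d = index.reshape(1, -1, 1).expand(10, -1, 12)
out = torch.gather(data, axis, index3d)   # shape (10, 4, 12)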

rsuderman commented 11 months ago

Copying from direct chat

Assuming index is of shape (4, 5, 6) and data is (10, 11, 12), we cannot directly perform the gather as we need to grab the full contents of the 10 and 12 dimensions. But we can do a combination of both our ideas. We collapse index into a (120,) 1-D tensor, materialize our non-axis dimensions to get (1, 120, 1), then broadcast to our non-gather dimensions (in this case axis=1). This would make index be (10, 120, 12). After the torch.gather we would be left with a (10, 120, 12) result that could be expanded to (10, 4, 5, 6, 12) and transposed to the expected form.
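
A minimal PyTorch sketch of that sequence, using the shapes from the example above (variable names are assumed; this is an illustration of the idea, not the actual lowering):

import torch

data = torch.randn(10, 11, 12)
index = torch.randint(0, 11, (4, 5, 6))
axis = 1

# Collapse index to (120,), materialize the non-axis dims as (1, 120, 1),
# then broadcast along the non-gather dimensions to (10, 120, 12).
flat = index.reshape(1, -1, 1).expand(data.shape[0], -1, data.shape[2])

# torch.gather along axis=1 now yields a (10, 120, 12) result.
gathered = torch.gather(data, axis, flat)

# Expand back to (10, 4, 5, 6, 12); in this example that is already the
# expected layout: output[i, j0, j1, j2, k] = data[i, index[j0, j1, j2], k].
result = gathered.reshape(10, 4, 5, 6, 12)

assert torch.equal(result, data[:, index, :])   # sanity check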

Shukla-Gaurav commented 10 months ago

@rsuderman Can you please review https://github.com/llvm/torch-mlir/pull/2726 ?

AmosLewis commented 5 months ago

Gather failed again in the SHARK-TestSuite onnx model opt-125M-awq with:

iree candidate-20240610.920

torch-mlir 
commit 7e0e23c66820d1db548103acbdf1337f701dc5a3 (upstream/main)
Author: Sambhav Jain <sambhav.jain@getcruise.com>
Date:   Sun Jun 9 00:32:49 2024 -0700

    Test custom op import with symbolic shapes (#3431)

    Tests the basic constructs of registering a custom op and its abstract
    implementations (with FakeTensors) in python, going through TorchDynamo
    export, followed by importing the shape expressions in the Torch
    dialect.

    Also fixes the importer where previously the symbolic bind op insertion
    was not gated in one place.

python ./run.py --torchmlirbuild ../../torch-mlir/build --tolerance 0.001 0.001 --cachedir ./huggingface_cache --ireebuild ../../iree-build -f onnx -g models --mode onnx --report --torchtolinalg --tests onnx/models/opt-125M-awq

opt-125M-awq.default.torch-onnx.mlir:1123:13: error: 'tensor.expand_shape' op expected number of static shape dims to be equal to the output rank (3) but found 0 inputs instead
    %1119 = torch.operator "onnx.Gather"(%0, %1113) : (!torch.vtensor<[50272,768],f32>, !torch.vtensor<[?,?],si64>) -> !torch.vtensor<[?,?,768],f32> 
            ^
opt-125M-awq.default.torch-onnx.mlir:1123:13: note: see current operation: %1049 = "tensor.expand_shape"(%1046) <{reassociation = [[0, 1], [2]], static_output_shape = array<i64>}> : (tensor<?x768xf32>) -> tensor<?x?x768xf32>