tenstorrent / tt-forge-fe

The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their performance and efficiency.
https://docs.tenstorrent.com/tt-forge-fe/
Apache License 2.0
21 stars 4 forks source link

Stack op failing for test case with dim(-2 and 3) for 4d inputs #615

Open meenakshiramanathan1 opened 4 weeks ago

meenakshiramanathan1 commented 4 weeks ago

Description

Stack op sanity test cases with dim =-2 and dim = 3 is failing with RuntimeError: TT_ASSERT @ /proj_sw/user_dev/mramanathan/OCT23_forge/tt-forge-fe/forge/csrc/graph_lib/shape.cpp:140: (i >= 0) && (i < (int)dims_.size()) E info: E Trying to access element outside of dimensions: 4 in optimised graph stage

"params",
    [
        ([(1, 30, 30, 16), (1, 30, 30, 16)], -2),
        ([(1, 30, 30, 16), (1, 30, 30, 16)], 3),
    ],

Above issue was arising from erase_inverse_ops.cpp, more specifically from commute_through_concat function in commute_utils.cpp. In this line, clone_shape = [1, 30, 30, 1, 16] and commute_shape = [1, 30, 30, 16] and matching_in_commute_shape = 3 and concat dim is calculated here as 3 (op->shape().size() =5 and current dimsenion = -2) , since both these are equal concat_commute_up is set to true here.

new_dim is being calculated in can_commute_through_dim which further calls can_commute_reshape_through_dim ,where input shape is [1, 30, 30, 16] and output shape is [1, 30, 30, 1, 16], since concat_commute_up is true, input and output shapes gets swapped.

After swapping, new_dim is being set to 4 and can_commute is set to true after volumes are checked with these conditions.

This condition is evaluated to true since can_commute is true and new_dim is not equal to -1 and when updated_commute_shape[new_dim] = op->shape()[concat_dim]; is trying to be accessed inside this if condition, out of bounds error is thrown as new_dim = 4 and updated_commute_shape = [1, 30, 30, 16]

To skip out of bounds test case, this condition is added in can_commute_reshape_through_dim function.

if ((volume_above(input_shape_vec, i) == volume_above(output_shape_vec, dim)) and
     (volume_below(input_shape_vec, i) == volume_below(output_shape_vec, dim)) and (i < output_shape_vec.size()))

Reproduce

git checkout mramanathan/stack_op_sanity
git submodule update --recursive
cmake --build build -- install_ttforge
pytest forge/test/mlir/test_ops.py::test_stack -vss

Observed Behaviour

RuntimeError: TT_ASSERT @ /proj_sw/user_dev/mramanathan/OCT23_forge/tt-forge-fe/forge/csrc/graph_lib/shape.cpp:140: (i >= 0) && (i < (int)dims_.size())
E       info:
E       Trying to access element outside of dimensions: 4` in optimised graph 
nvukobratTT commented 4 weeks ago

By this PR:

And this issue seems like we have more bugs with the erase inverse ops pass.

What happens when you just disable erase inverse op pass?

meenakshiramanathan1 commented 4 weeks ago

By this PR:

And this issue seems like we have more bugs with the erase inverse ops pass.

What happens when you just disable erase inverse op pass?

@nvukobratTT , after disabling erase inverse op pass, the same out of bounds issue is coming from the next pass i.e insert_inverse_on_outputs

E       RuntimeError: TT_ASSERT @ /proj_sw/user_dev/mramanathan/lab89_nov3_forge/tt-forge-fe/forge/csrc/graph_lib/shape.cpp:135: (i >= 0) && (i < (int)dims_.size())
E       info:
E       Trying to access element outside of dimensions: 4
E       backtrace:
E        --- tt::graphlib::Shape::operator[](int)
E        --- tt::passes::commute_through_concat(tt::graphlib::Graph*, tt::graphlib::OpNode*, tt::graphlib::OpNode*, tt::graphlib::Node*, tt::graphlib::Shape*, tt::graphlib::Shape*, bool, bool*, std::pair<int, int>*, tt::graphlib::OpType*, bool)
E        --- tt::passes::can_commute_past_op(tt::graphlib::OpNode*, tt::graphlib::OpNode*, tt::graphlib::Graph*, tt::graphlib::Shape*, tt::graphlib::Shape*, bool, tt::graphlib::Node*)
E        --- tt::passes::find_commutable_output_edge(tt::graphlib::Graph*, tt::graphlib::OpNode*, tt::graphlib::Shape, tt::graphlib::OpNode*, tt::graphlib::OpNode*)
E        --- tt::passes::insert_inverse_on_outputs(tt::graphlib::Graph*)
E        --- tt::run_optimization_graph_passes(tt::graphlib::Graph*)