In the sequencer, we recently added code that removes 'identity' operations (i.e. those that only change metadata of data, not the data itself, such as Reshape, Identity, etc.). The code does this by looking at the next op and replacing the input it receives from the identity op with the input the identity op receives itself: input_a -> A -> output_a -> B -> output_b becomes input_a -> B -> output_b by telling B to use input_a instead of output_a.
However, the next op we consider is not always the next one in the chain: a model such as A -> B, C -> D, B + D -> E can have order A C B D E. Assuming B is an identity op, when our code considers removing node B it should change node E (to point at the output of A) but it will instead look at node D.
Below is an example from BERT-Squad that shows this behaviour:
A solution could be to look at all nodes ahead to find the one that uses. However, I think this requires a rethinking of the sequencer's fundamental assumption that the node sequence is the right unit of analysis...
Describe the bug
In the sequencer, we recently added code that removes 'identity' operations (i.e. those that only change metadata of data, not the data itself, such as
Reshape
,Identity
, etc.). The code does this by looking at the next op and replacing the input it receives from the identity op with the input the identity op receives itself:input_a -> A -> output_a -> B -> output_b
becomesinput_a -> B -> output_b
by tellingB
to useinput_a
instead ofoutput_a
.However, the next op we consider is not always the next one in the chain: a model such as
A -> B, C -> D, B + D -> E
can have orderA C B D E
. Assuming B is an identity op, when our code considers removing node B it should change node E (to point at the output of A) but it will instead look at nodeD
.Below is an example from BERT-Squad that shows this behaviour:
A solution could be to look at all nodes ahead to find the one that uses. However, I think this requires a rethinking of the sequencer's fundamental assumption that the node sequence is the right unit of analysis...