In matching big subgraphs like attention, it is usually the case that we only care about re-routing the in-coming and out-going edges of the subgraph. Therefore, if we have a method to identify these edges, we can proceed to replacement without having to match the exact content of the subgraph.
This strategy can be seen as matching only the "interface" of the subgraph. It would be possible to just specify (a) sub-patterns the characterize the input and output edges, (b) maybe leveraging metadata as well, for this pattern matcher so we limit the search space (aka do match bounded by a known subgraph, e.g. nn.Module).
E.g.
def pattern(a, b):
c = op.Mul(a, b)
d = op.DontCare(c)
e = op.Relu(d)
return e
DontCare can be greedy or something different that can be specified.
In matching big subgraphs like attention, it is usually the case that we only care about re-routing the in-coming and out-going edges of the subgraph. Therefore, if we have a method to identify these edges, we can proceed to replacement without having to match the exact content of the subgraph.
This strategy can be seen as matching only the "interface" of the subgraph. It would be possible to just specify (a) sub-patterns the characterize the input and output edges, (b) maybe leveraging metadata as well, for this pattern matcher so we limit the search space (aka do match bounded by a known subgraph, e.g. nn.Module).
E.g.
DontCare
can be greedy or something different that can be specified.