intel / intel-extension-for-tensorflow

Intel® Extension for TensorFlow*

FusedBatchMatMul precedence over FusedMatMul #62

Closed SriAlavandar closed 7 months ago

SriAlavandar commented 9 months ago

I am running the Hugging Face OPT 350M model with ITEX 2.14.

As part of the intermediate block, when the Relu/Gelu fusion is triggered, I noticed that the resulting op type is FusedBatchMatMul instead of FusedMatMul (where the post-ops would be Gelu Approximate/Exact/Relu).

Input Graph:

[image: input graph]

Resultant Graph:

[image: resultant graph]

Attaching a reference to where this graph-level fusion takes place: https://github.com/intel/intel-extension-for-tensorflow/blob/d8fe3daa49f81767c1dd783325c330a145d945bd/itex/core/graph/remapper/remapper.cc#L912
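For context, here is a minimal sketch of the pattern being remapped (assuming a Dense-style intermediate block with a GELU post-op; the shapes and names below are illustrative, not taken from the actual model):

```python
import tensorflow as tf

@tf.function
def mlp_block(x, w, bias):
    # MatMul + BiasAdd + Gelu: the candidate pattern for the remapper fusion
    y = tf.matmul(x, w) + bias
    return tf.nn.gelu(y, approximate=True)

x = tf.random.normal([8, 128, 512])   # rank-3 input: [batch, seq, hidden]
w = tf.random.normal([512, 2048])
b = tf.zeros([2048])
print(mlp_block(x, w, b).shape)       # (8, 128, 2048)
```

Note that with a rank-3 activation and a rank-2 kernel, `tf.matmul` already lowers to a BatchMatMul node in the graph, which is presumably why the fused result is FusedBatchMatMul.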

Is there a specific reason why FusedBatchMatMul takes precedence over FusedMatMul?

jianyizh commented 9 months ago

In older versions of Keras, dense layers are implemented with tensordot. For tensors with rank > 2, it reshapes and uses MatMul, i.e. [a,b,c] -> reshape -> [a*b,c] -> MatMul -> [a*b,d] -> reshape -> [a,b,d] -> bias -> activation. We use BMM (BatchMatMul) so that we do not need the reshape ops. It also makes it easier to handle other post-ops in the future. I think current Keras 3.0 also uses BMM.
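To make the equivalence concrete, here is a small sketch (shapes are illustrative; `tf.matmul` on a rank-3 activation and a rank-2 kernel lowers to BatchMatMulV2 with batch-dimension broadcasting):

```python
import tensorflow as tf

a, b, c, d = 2, 3, 4, 5
x = tf.random.normal([a, b, c])  # rank-3 activations, e.g. [batch, seq, features]
w = tf.random.normal([c, d])     # dense kernel

# Older Keras Dense path: tensordot, which lowers to reshape -> MatMul -> reshape
y_tensordot = tf.tensordot(x, w, axes=1)                                # [a, b, d]
y_manual = tf.reshape(tf.matmul(tf.reshape(x, [-1, c]), w), [a, b, d])  # same thing

# BatchMatMul path: the rank-2 kernel broadcasts over the batch dimension,
# so no reshapes are needed around the contraction
y_bmm = tf.matmul(x, w)                                                 # [a, b, d]

print(float(tf.reduce_max(tf.abs(y_tensordot - y_bmm))))  # ~0.0
```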