NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

SdpaFwdOp::toString doesn't print all outputs. #3037

Open wujingyue opened 3 weeks ago

wujingyue commented 3 weeks ago

https://github.com/NVIDIA/Fuser/blob/29a59a0a7bd67c83b0d430871351994af53a1858/csrc/ir/nodes.cpp#L4348

This makes debugging harder: I was seeing a fusion output TensorView that appeared to be defined by nothing.

wujingyue commented 3 weeks ago

For example,

T55_g___bfloat[ bS163{1}, iS164{96}, iS165{2048}, iS166{128} ]
   = sdpa(T52_l___bfloat[ bS149{1}, iS151{96}, iS150{2048}, iS152{( ceilDiv(12288, 96) )} ],
            T50_l___bfloat[ bS139{1}, iS141{96}, iS140{2048}, iS142{( ceilDiv(12288, 96) )} ],
            T54_l___bfloat[ bS159{1}, iS161{96}, iS160{2048}, iS162{( ceilDiv(12288, 96) )} ],
            dropout_p = double(0.10000000000000001),
            is_causal = true  )

T56 is also defined by this Expr but is not printed.
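
A minimal sketch of what printing every output could look like. This is a standalone mock, not nvFuser code: `MockExpr` and `toStringAllOutputs` are hypothetical names, and the tensors are reduced to plain strings. The point is only that the printer loops over all outputs rather than printing the first one.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an expression node; NOT nvFuser's real class.
struct MockExpr {
  std::string op_name;
  std::vector<std::string> outputs;
  std::vector<std::string> inputs;
};

// Prints every output (one per line, comma-separated) before "= op(...)",
// so no output appears to be "defined by nothing".
std::string toStringAllOutputs(const MockExpr& e) {
  std::ostringstream ss;
  for (std::size_t i = 0; i < e.outputs.size(); ++i) {
    ss << e.outputs[i] << (i + 1 < e.outputs.size() ? ",\n" : "\n");
  }
  ss << "   = " << e.op_name << "(";
  for (std::size_t i = 0; i < e.inputs.size(); ++i) {
    ss << (i == 0 ? "" : ", ") << e.inputs[i];
  }
  ss << ")\n";
  return ss.str();
}
```

With outputs `T55` and `T56` and inputs `T52`, `T50`, `T54`, the mock prints both `T55` and `T56` ahead of `= sdpa(...)`.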