Closed kamalrajkannan78 closed 14 hours ago
@kamalrajkannan78 @nvukobratTT Can you share why we decompose avgpool3d? Since tt-metal has avgpool2d, can we expect it to have avgpool3d support, or ask for it? Do we know if they had the need for avgpool3d themselves? This decomposition is logical, but whenever I see a decomposition lately, I start thinking about how performant the decomposed subgraph is compared to the native op in ttnn. So if we do decompose, we are overfitting less to ttnn, and if we don't and require ttnn support to lower directly, we are maybe overfitting to ttnn.
What is your take in this case, @nvukobratTT?
Thanks for the review @dgolubovicTT, @nvukobratTT!
I hit the "Encountered unsupported op node type: nn.avg_pool3d, on device: tt" issue. I looked into the nn.avg_pool2d decomposition and followed a similar way to decompose nn.avg_pool3d.
Few notes before more detailed CR:
- I'm okay with doing these decompositions as TTNN lacks support for these ops, mostly because it's questionable when (if) they are going to add those. That said, @kamalrajkannan78, can you open two issues on the tt-metal repo with an op support request and link them here? Ideally, we would like to have those ops on TTNN as well.
- Can we decompose avg pool 3d using avg pool 2d? The current solution looks mathematically correct; however, it will cause confusion when someone starts to debug what is happening here. From my side, it seems cleaner to decompose conv3d using conv2d, and do something similar for avg pool 3d using avg pool 2d. What are your thoughts, guys?
For avg pool 3d we can use a similar decomposition approach (see the sketch after this list):
- Apply avg pool 2d on each depth slice (keep intermediate results for each slice)
- Stack those results along depth dim
- Average those results across depth dim
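Here is a minimal PyTorch sketch of that decomposition, just to make the steps concrete. The function name, the no-padding/no-dilation assumption, and the example sizes are mine for illustration only; this is not the actual forge implementation:

```python
import torch
import torch.nn.functional as F

def avg_pool3d_via_2d(x, kernel_size, stride=None):
    # Sketch only: assumes no padding and no dilation.
    kD, kH, kW = kernel_size
    sD, sH, sW = stride if stride is not None else kernel_size
    N, C, D, H, W = x.shape

    # 1) avg pool 2d over (H, W) on every depth slice, keeping each result.
    slices = [F.avg_pool2d(x[:, :, d], (kH, kW), (sH, sW)) for d in range(D)]
    pooled = torch.stack(slices, dim=2)              # (N, C, D, H_out, W_out)

    # 2) "Stack + average across depth": mean over each window of kD slices.
    out_d = (D - kD) // sD + 1
    outs = [pooled[:, :, d * sD : d * sD + kD].mean(dim=2) for d in range(out_d)]
    return torch.stack(outs, dim=2)                  # (N, C, D_out, H_out, W_out)

# Quick numerical check against the native op (example sizes are arbitrary).
x = torch.randn(1, 2, 6, 8, 8)
ref = F.avg_pool3d(x, kernel_size=(2, 3, 3), stride=(2, 2, 2))
dec = avg_pool3d_via_2d(x, kernel_size=(2, 3, 3), stride=(2, 2, 2))
assert torch.allclose(ref, dec, atol=1e-6)
```

Because every pooling window has the same number of elements, the mean of the per-slice means equals the full 3d mean, which is why the two-step averaging is exact here.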
Sure @nvukobratTT, I will try to decompose avgpool3d in the suggested way!
Decomposing avgpool3d to avgpool2d like @nvukobratTT laid out sounds great. Let's do it like that!
Thanks for the issues @kamalrajkannan78!
FYI, I updated labels and added project information to reflect more precisely what our request is :))
Thanks for adding those details @nvukobratTT
Summary:
This PR fixes #713
The core idea behind this transformation is to use Conv3d as a replacement for AvgPool3d by setting the convolution kernel's weights to a constant value. Specifically, the weight for each element in the kernel is set to 1 / (kD × kH × kW), i.e., the reciprocal of the number of elements in the kernel.
This approach effectively turns the convolution into an average pooling operation: since the weights are identical, the weighted sum over the receptive field produces the average. (Note: conv3d is not yet supported in MLIR and TTNN.)
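For reference, a small PyTorch check of that equivalence. The shapes and kernel size below are arbitrary, and using groups=C is just one way to keep channels independent in this illustration; it is not necessarily how the forge decomposition builds its weights:

```python
import torch
import torch.nn.functional as F

N, C, D, H, W = 1, 4, 6, 8, 8
kD, kH, kW = 2, 3, 3
x = torch.randn(N, C, D, H, W)

# One constant-weight filter per channel; every weight is 1 / (kD * kH * kW),
# and groups=C keeps channels from mixing, so each channel is pooled on its own.
weight = torch.full((C, 1, kD, kH, kW), 1.0 / (kD * kH * kW))

ref = F.avg_pool3d(x, kernel_size=(kD, kH, kW), stride=1)
via_conv = F.conv3d(x, weight, stride=1, groups=C)
assert torch.allclose(ref, via_conv, atol=1e-6)
```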
eval, shape, and decompose functions and sanity tests are added for conv3d and avgpool3d.
Logs