tenstorrent / tt-forge-fe

The TT-Forge FE is a graph compiler designed to optimize and transform computational graphs for deep learning models, enhancing their performance and efficiency.
https://docs.tenstorrent.com/tt-forge-fe/
Apache License 2.0

Decompose AvgPool3d Op to Conv3d for Support #714

Closed kamalrajkannan78 closed 14 hours ago

kamalrajkannan78 commented 1 week ago

Summary:

AvgPool3d is decomposed into a Conv3d whose kernel is constant, with every entry equal to the reciprocal of the pooling window size:

```python
weight_value = 1.0 / (kD * kH * kW)
weight = torch.ones(1, 1, kD, kH, kW) * weight_value
```
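The weight above has a single channel; for multi-channel input the same idea needs one uniform kernel per channel with a grouped convolution. A minimal sketch of the decomposition (the variable names here are illustrative, not from the PR), checked against PyTorch's reference op:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 4, 8, 8)  # (N, C, D, H, W)
kD, kH, kW = 2, 2, 2
C = x.shape[1]

# One uniform averaging kernel per channel; groups=C keeps channels separate.
weight = torch.ones(C, 1, kD, kH, kW) / (kD * kH * kW)
out = F.conv3d(x, weight, stride=(kD, kH, kW), groups=C)

ref = F.avg_pool3d(x, (kD, kH, kW))
assert torch.allclose(out, ref, atol=1e-6)
```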

Logs

dgolubovicTT commented 1 week ago

@kamalrajkannan78 @nvukobratTT Can you share why we decompose avgpool3d? Since tt-metal has avgpool2d, can we expect it to have avgpool3d support, or ask for it? Do we know if they had the need for avgpool3d themselves? This decomposition is logical, but whenever I see some decomposition lately, I start thinking about how performant the decomposed subgraph is compared to the native op in ttnn. So if we do decompose, we are overfitting less to ttnn, and if we don't do it and require ttnn support to lower directly, we are maybe overfitting to ttnn.

What is your take in this case, @nvukobratTT?

nvukobratTT commented 1 week ago


A few notes before a more detailed CR:

  1. I'm okay with doing these decompositions as TTNN lacks support for these ops, mostly because it's questionable when (if) they are going to add those. That said, @kamalrajkannan78 can you open two issues on the tt-metal repo with an op support request and link them here? Ideally, we would like to have those ops on TTNN as well.
  2. Can we decompose avg pool 3d using avg pool 2d? The current solution looks mathematically correct; however, it will cause confusion when someone starts to debug what is happening here. From my side, it seems cleaner to decompose conv3d using conv2d, and to do the same for avg pool 3d using avg pool 2d. What are your thoughts, guys?
  3. For avg pool 3d we can use a similar decomposition approach:
    • Apply avg pool 2d on each depth slice (keep intermediate results for each slice)
    • Stack those results along depth dim
    • Average those results across depth dim
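The three steps above can be sketched roughly as follows; for a general kernel depth kD the averaging runs over each sliding depth window rather than the whole depth axis (the function name and shapes are illustrative, not from the PR):

```python
import torch
import torch.nn.functional as F

def avg_pool3d_via_2d(x, kernel, stride=None):
    # x: (N, C, D, H, W); stride defaults to kernel, as in F.avg_pool3d
    kD, kH, kW = kernel
    sD, sH, sW = stride or kernel
    depth_outs = []
    for d0 in range(0, x.shape[2] - kD + 1, sD):
        # Step 1: avg pool 2d on each depth slice of the window
        slices = [F.avg_pool2d(x[:, :, d0 + i], (kH, kW), (sH, sW))
                  for i in range(kD)]
        # Steps 2-3: stack along the depth dim, then average across it
        depth_outs.append(torch.stack(slices, dim=2).mean(dim=2))
    return torch.stack(depth_outs, dim=2)

x = torch.randn(1, 2, 4, 8, 8)
assert torch.allclose(avg_pool3d_via_2d(x, (2, 2, 2)),
                      F.avg_pool3d(x, (2, 2, 2)), atol=1e-6)
```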
kamalrajkannan78 commented 1 week ago


Thanks for the review @dgolubovicTT, @nvukobratTT

kamalrajkannan78 commented 1 week ago


Sure @nvukobratTT, I will try to decompose avgpool3d in the suggested way!

dgolubovicTT commented 1 week ago

Decomposing avgpool3d to avgpool2d like @nvukobratTT laid out sounds great. Let's do it like that!

kamalrajkannan78 commented 1 week ago

nvukobratTT commented 4 days ago

Thanks for the issues @kamalrajkannan78 !

FYI, I updated labels and added project information to reflect our request more precisely :))

kamalrajkannan78 commented 3 days ago


Thanks for adding those details @nvukobratTT

kamalrajkannan78 commented 14 hours ago


https://github.com/tenstorrent/tt-forge-fe/pull/741