Open HariniMohan0102 opened 1 month ago
Maxpool supports only bfloat16, so any bfloat8 input must be converted to bfloat16. My guess is that when this conversion happens, there isn't enough memory to hold both the input and output tensors at the same time.
Yeah, basically you can provide bfp8_b input to the maxpool op. Internally the halo op converts it to row-major (RM), i.e. BFP16. If the input is already BFP16, it should need less memory.
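A rough back-of-the-envelope sketch of the memory pressure described above. The numbers here are assumptions for illustration only: bfp8_b is taken as a block format with a shared 1-byte exponent per 16-element block (~17/16 bytes per element), bfloat16 as 2 bytes per element, and the channel count C is a made-up placeholder (the issue does not state it).

```python
# Hypothetical peak-memory estimate for converting a bfp8_b maxpool input
# to bfloat16 inside the halo op. All sizes are assumptions, not values
# taken from the issue.
def tensor_bytes(h, w, c, bytes_per_elem):
    """Flat byte count for an h x w x c activation tensor."""
    return h * w * c * bytes_per_elem

C = 32  # hypothetical channel count, for illustration only

bfp8_in  = tensor_bytes(4094, 510, C, 17 / 16)  # assumed bfp8_b footprint
bf16_tmp = tensor_bytes(4094, 510, C, 2)        # bfloat16 copy made by halo

# While converting, both tensors are live, so peak usage is the sum.
# Supplying bfloat16 input directly avoids the extra live copy.
peak_with_convert = bfp8_in + bf16_tmp
print(int(bf16_tmp), int(peak_with_convert))
```

The point is only that the conversion path roughly doubles the live activation footprint for the duration of the copy, which matches the out-of-memory guess above.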
Describe the bug
Here are the unit tests for the failing maxpool op at each resolution:
When Maxpooling=True, Encoder res: 4094x510
pytest tests/ttnn/unit_tests/operations/test_max_pool2d.py::test_model_net_max_pool_4094x510
When Maxpooling=True, Encoder res: 2047x255
pytest tests/ttnn/unit_tests/operations/test_max_pool2d.py::test_model_net_max_pool_2047x255
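As a sanity check on the two resolutions above: assuming a 2x2 kernel with stride 2 and no padding (the issue does not state the pooling parameters, so this is a guess), the larger failing input maps exactly onto the smaller one, consistent with consecutive encoder stages.

```python
# Standard pooling output-size formula, applied to the two failing
# resolutions. Kernel/stride/padding values are assumptions.
def pool_out(dim, kernel=2, stride=2, padding=0):
    return (dim + 2 * padding - kernel) // stride + 1

h, w = pool_out(4094), pool_out(510)
print(h, w)  # 2047 255
```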
To Reproduce
Steps to reproduce the behavior: run the pytest commands listed above.
Expected behavior
The op runs without error for the given input configurations.
Please complete the following environment information: