Closed yzhang93 closed 1 month ago
Additionally, most tests should keep using double buffering by default as this is better stress testing the codegen as it leads to more complex code. NOTE: I don't think we have any single buffer tests, which would be good to add to make sure that's also tested.
What tests are you referring to? Do you think it's worth to make single/double buffering a flag and add some e2e tests?
Because the better results you're seeing shouldn't be there in principal, I think we should instead get to the bottom of this instead of merging for marginal gains (with respect to the total gain we still need).
Okay, agreed. I'm going to close this PR.
What tests are you referring to? Do you think it's worth to make single/double buffering a flag and add some e2e tests?
The main e2e tests in run_matmul_tests.sh
. Yes, I think it would be useful to have a flag for that and add some tests for it.
Also, this would still be useful as well to get optimal tile sizes for both double buffer and single buffer:
We should still take the initType into account to be safe. To improve the tile sizes safely, while also making them as large as possible, we should compute it based on all three (lhs, rhs, init) and whether or not double buffering is enabled for L1.
Temporarily set default buffer depth as 1 for l1/l2 space, and use a larger tile size. This will give much better performance. Will enable it after optimizing double buffering and pipelining.