Need to understand what 'count' does here: -v -> args.verbose == 1, while -vv -> args.verbose == 2.
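For reference, a minimal standalone sketch of how argparse's action='count' behaves (nothing below is specific to the script under review; the flag name just mirrors the one being discussed):

```python
import argparse

parser = argparse.ArgumentParser()
# action="count" stores how many times the flag appears on the command line;
# default=0 keeps args.verbose an int even when -v is never passed.
parser.add_argument("-v", "--verbose", action="count", default=0)

print(parser.parse_args([]).verbose)       # 0
print(parser.parse_args(["-v"]).verbose)   # 1
print(parser.parse_args(["-vv"]).verbose)  # 2
```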
In general, what I think of when I see us fiddling with this kind of stuff is that we just need to port to pytest. The only reason I haven't done it yet is because pytest doesn't handle segfaults well, despite claims to the contrary. At least, so I recall - it's worth trying out for us. But actually https://github.com/pytest-dev/pytest-xdist/issues/126 gives us the solution: just use catchsegv.sh (or write our own signal handler).
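For what it's worth, a minimal sketch of the 'write our own signal handler' route, done the usual way for Python: run the crash-prone step in a child process and detect SIGSEGV from its return code, so the segfault can't take down the pytest process itself. The command and test name below are hypothetical; in-process, the stdlib faulthandler module can also be enabled to at least dump a traceback on SIGSEGV.

```python
import signal
import subprocess

import pytest


def run_crashy(cmd):
    # Run the crash-prone step out of process so a segfault kills the child,
    # not the process collecting test results.
    return subprocess.run(cmd, capture_output=True, text=True)


def test_does_not_segfault():
    # Hypothetical command; the real one would come from the test's config.
    proc = run_crashy(["./run_kernel", "--input", "matmul.mlir"])
    if proc.returncode == -signal.SIGSEGV:
        pytest.fail("child process segfaulted:\n" + proc.stderr)
    assert proc.returncode == 0, proc.stderr
```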
Ok. I'm not sure what 'this kind of stuff' is that you're referring to here. I'm not going to do this anytime soon, and I would like to maintain this script for as long as we need it.
We've been fiddling with this script a lot lately, right? Flags, shell-out adjustments, changing how the tests are layered. It signals to me that the script is not a substitute for a full-fledged testing framework that gracefully handles these things. In addition, a real testing framework could give us other useful features, like perf regression tracking via some kind of standardized logging (compile times, runtimes, etc.).
In IREE itself I believe they're revamping the testing and benchmarking framework so I'm waiting until then to see whether we can use it or if we'll need our own.
Just for comparison, this is what pytest looks like:
https://github.com/makslevental/mlir-python-extras/tree/main/tests
and here's what the test output looks like when everything passes.
And when something fails:
https://github.com/makslevental/mlir-python-extras/actions/runs/10448085182/job/28927992854
We could even be leveraging the iree/mlir python bindings to do runtime tests rather than shelling out:
https://github.com/makslevental/mlir-python-extras/blob/main/tests/test_runtime.py
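As a rough illustration of the bindings-instead-of-shelling-out idea, here is a sketch assuming the upstream MLIR Python bindings are importable as mlir.ir (the package that ships with IREE may expose them under a different name):

```python
import pytest

# Skip cleanly if the bindings aren't installed in this environment.
mlir_ir = pytest.importorskip("mlir.ir")


def test_module_verifies_in_process():
    # Parse and verify IR directly in the test process instead of shelling
    # out to a tool and scraping its output.
    with mlir_ir.Context():
        module = mlir_ir.Module.parse("module { }")
        assert module.operation.verify()
```

Runtime tests would go further and actually execute compiled code through the bindings, which is what the linked test_runtime.py demonstrates.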
Anyway, it's second-order stuff, but I think it's inevitable that we'll need a more serious testing framework.
Ok. I agree that we can do better by using pytest. For now though, can you please accept the PRs I make improving this script? I'm not planning on adding any serious bells and whistles, but this PR fixes a broken test.
Bit the bullet and am doing this in parallel: https://github.com/nod-ai/iree-amd-aie/pull/752
My bad, I didn't realize you were actually waiting on me to approve. Sorry!
It'd actually be nice to have a line printed for each test that runs. Next step: set up a few levels of verbosity:
0: nothing
1: a line for each test run
2: default, sensible
3: debug
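A minimal sketch of one way to wire those levels up; the explicit integer flag (rather than counting -v occurrences) and the example messages are just assumptions for illustration:

```python
import argparse

parser = argparse.ArgumentParser()
# A single explicit knob; the default stays at level 2 ("default, sensible").
parser.add_argument(
    "--verbosity", type=int, choices=[0, 1, 2, 3], default=2,
    help="0: nothing, 1: a line per test run, 2: default, 3: debug",
)
args = parser.parse_args()


def log(msg, at_level):
    # Emit the message only when the configured verbosity reaches its level.
    if args.verbosity >= at_level:
        print(msg)


log("running test: matmul_i32_small", at_level=1)
log("full command line and environment dump", at_level=3)
```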