elliottslaughter closed this 7 months ago
Does this actually get the wrong answer or is it only Legion Spy complaining? I think there might be a bug in Legion Spy's verification of the application of multiple reduction epochs.
Aside: does the fuzzer even know when we get the wrong answer?
This version of the fuzzer does not do any checking. The initial plan was to use the fuzzer purely to generate "interesting" traces, and rely entirely on Legion Spy to validate whether those traces are correct.
I could add some validation, but it would be inherently incomplete. The most interesting cases are the ones that are hardest to verify, and conversely, the ones I can easily verify are cases it would be shocking for Legion to get wrong, so I honestly don't know if this is worth the effort. Writing a complete validation would amount to a new implementation of Legion Spy, which I am not going to do.
I think there's an easy way to allow the fuzzer to verify the outcome without relying on Legion Spy. Since all the regions being used are small, you can just make shadow versions of them, inline map the shadow regions, and then inline execute the tasks on the shadow regions. Then you can diff the test regions against the shadow regions at the end.
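A minimal sketch of the shadow-region idea, written as a plain Python model rather than against the Legion API. The names `fill`, `reduce_add`, `run_shadow`, and `diff_regions` are hypothetical, regions are modeled as lists of integers, and the "test" regions are assumed to be whatever the runtime under test produced for the same task sequence.

```python
from typing import Callable, Dict, List, Tuple

Region = List[int]
# A hypothetical "task": mutates the named regions it is given, in place.
Task = Callable[[Dict[str, Region]], None]


def fill(name: str, value: int) -> Task:
    """Task that overwrites every element of region `name` with `value`."""
    def task(regions: Dict[str, Region]) -> None:
        r = regions[name]
        for i in range(len(r)):
            r[i] = value
    return task


def reduce_add(name: str, value: int) -> Task:
    """Task that applies a sum reduction of `value` to region `name`."""
    def task(regions: Dict[str, Region]) -> None:
        r = regions[name]
        for i in range(len(r)):
            r[i] += value
    return task


def run_shadow(initial: Dict[str, Region], tasks: List[Task]) -> Dict[str, Region]:
    """Copy the initial regions and run every task inline, in program order."""
    shadow = {name: list(data) for name, data in initial.items()}
    for task in tasks:
        task(shadow)
    return shadow


def diff_regions(
    test: Dict[str, Region], shadow: Dict[str, Region]
) -> List[Tuple[str, int, int, int]]:
    """Return (region, index, got, expected) for every mismatching element.

    Assumes each test region has the same length as its shadow.
    """
    mismatches = []
    for name in sorted(shadow):
        for i, (got, want) in enumerate(zip(test[name], shadow[name])):
            if got != want:
                mismatches.append((name, i, got, want))
    return mismatches
```

Because the shadow copy runs sequentially in program order, it serves as the reference result; an empty list from `diff_regions` means the test regions match it, and any entry points at the region and element where the two diverge.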
Good idea. I'll work on that.
It will help distinguish cases where Legion Spy itself has a bug from cases where the runtime actually produced a wrong answer.
FWIW: there's actually already a comment in Legion Spy about why this test case is failing to verify: https://gitlab.com/StanfordLegion/legion/-/blob/master/tools/legion_spy.py?ref_type=heads#L4963-4971
For what it's worth, I added validation and have been running extensive tests with it, and nothing I've run so far has produced an incorrect answer.
I actually think the runtime analysis that we're testing here is pretty solid. Most of what we're finding are little idiosyncrasies in some of the state machines inside the runtime, but they aren't exactly essential for correctness, more for performance. There are more corners of the runtime to explore, but this particular corner is well explored.
This Legion Spy bug should be fixed with: https://gitlab.com/StanfordLegion/legion/-/merge_requests/1199
That merge request has been merged, and this test now validates.
The fuzzer is generating programs that fail Legion Spy. Example:
Spy log: spy_0.log
Reproduce with this exact version of the Fuzzer: https://github.com/StanfordLegion/fuzzer/commit/5507f40ce17045c6c43d702d7cd8cc01dc72894b