Closed by cmelone 1 month ago
Make me a reproducer on sapling. You guys had two months to test this. Why are you just reporting it now?
It doesn't reproduce on sapling, i.e., ChannelFlow, 8x2x2, 4 nodes, GPUs, debug mode, using HTR Develop branch commit fbaf5141 and legion commit 12d5a56fe.
I also cannot reproduce on either sapling or Lassen. The cluster where I found the error is down this week, so I will need to check again once it's back up.
This error should be deterministic when it does occur. Are we sure we are running exactly the same configuration on both machines?
No longer able to reproduce
Running at 4 nodes, debug mode, with GPUs. This regression was introduced by https://gitlab.com/StanfordLegion/legion/-/commit/12d5a56fe5b07975c2f1d70b4df156fb9c684949
backtrace:
@lightsighter