as of https://github.com/m-a-d-n-e-s-s/madness/commit/3c3c2ba6a71c80a56a68f6e44c6f76f5543d18f5#diff-a10540bf42d111c837a8df49188e5ddb09cc03a68ecf0d444a55c10b09797985 RMI message counters have been 16 bits long; recent apps using MADWorld via https://github.com/ValeevGroup/tiledarray exceed this limit, which breaks in-order queue processing and causes hangs due to unprocessed messages.
the current solution still uses 16-bit counters, with custom sorting logic that tolerates counter overflow/wraparound. a cleaner solution would extend the counters to 64 bits, which would be enough for apps with up to 2^64 tasks per rank-to-rank pair; however, that would require a more significant redesign of the code.
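for illustration only, a minimal sketch of how wraparound-tolerant ordering of 16-bit counters can work. this is not the code in this PR: `counter_before` and `sort_pending` are hypothetical names, and the sketch assumes that all pending counters lie within 2^15 of the next expected counter.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the MADNESS sources).
// True if counter a precedes counter b, assuming the two are fewer than
// 2^15 steps apart in their true (unwrapped) order.
inline bool counter_before(std::uint16_t a, std::uint16_t b) {
    // Distance from a forward to b, modulo 2^16.
    const std::uint16_t d = static_cast<std::uint16_t>(b - a);
    return d != 0 && d < 0x8000u;
}

// Hypothetical helper: sorts pending counters into processing order.
// Keys are distances from next_expected modulo 2^16, so the comparison is
// a proper total order even when the counters straddle the 65535 -> 0 wrap.
inline void sort_pending(std::vector<std::uint16_t>& pending,
                         std::uint16_t next_expected) {
    std::sort(pending.begin(), pending.end(),
              [next_expected](std::uint16_t x, std::uint16_t y) {
                  return static_cast<std::uint16_t>(x - next_expected) <
                         static_cast<std::uint16_t>(y - next_expected);
              });
}
```

sorting by distance from a known base, rather than passing the pairwise `counter_before` to `std::sort`, avoids relying on a comparator that is not transitive over the full 16-bit range.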
includes misc cleanup to improve usability/debuggability