Determinism (Message Order)

ptheywood commented 3 years ago

Completely deterministic simulations are probably required by some users.

Currently, atomics are only used during message data structure construction for non-brute-force messages (i.e. PBM construction for bucket messaging, spatial messaging, etc.).

This is mainly because the counting sort constructs the PBM as a side effect (the scan), so it is cheaper overall than a non-counting sort plus explicit PBM construction.

It should be possible to make this deterministic without too much cost, via a stable counting sort. This should also have a side effect of improving the memory access pattern in the scatter (writes) on average.

Might be worth keeping both stable and non-stable options and letting the user decide based on their requirements (i.e. a simulation parameter for exact determinism?)

It should be possible to still use the atomics for bin counts, but the position within the bin, which is currently the return value from the atomicInc, will need to be adjusted to make it stable. This would be easiest in a separate kernel, but it should be possible to fuse it into another kernel (maybe the prefix sum if we rolled our own?).
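To make the idea concrete, here is a minimal sequential C++ sketch of a stable counting sort that produces the PBM from the scan. It is not the FLAME GPU implementation: `BinnedMessages`, `stableCountingSort` and `binOfMessage` are hypothetical names, and a GPU version would have to compute the within-bin rank in parallel rather than with the serial loop in step 3.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical container for the result of binning messages.
struct BinnedMessages {
    // pbm[b] is the sorted index of the first message in bin b;
    // pbm.back() is the total message count.
    std::vector<std::size_t> pbm;
    // order[i] is the original index of the message at sorted position i.
    std::vector<std::size_t> order;
};

// Sequential sketch of a stable counting sort (assumption: one bin index per message).
BinnedMessages stableCountingSort(const std::vector<unsigned>& binOfMessage,
                                  std::size_t binCount) {
    BinnedMessages out;
    out.pbm.assign(binCount + 1, 0);

    // 1) Histogram. The counts do not depend on the order of the increments,
    //    so on a GPU this step could keep using atomics.
    for (unsigned b : binOfMessage) {
        assert(b < binCount);
        ++out.pbm[b + 1];
    }

    // 2) Prefix sum over the shifted counts gives each bin's start index (the PBM).
    std::partial_sum(out.pbm.begin(), out.pbm.end(), out.pbm.begin());

    // 3) Scatter in original message order, so the offset within a bin equals
    //    the number of earlier messages in the same bin. This replaces the
    //    order-dependent value that an atomic increment would return.
    std::vector<std::size_t> next(out.pbm.begin(), out.pbm.end() - 1);
    out.order.resize(binOfMessage.size());
    for (std::size_t i = 0; i < binOfMessage.size(); ++i) {
        out.order[next[binOfMessage[i]]++] = i;
    }
    return out;
}
```

For bin indices `{2, 0, 2, 1, 0}` and three bins, this yields a PBM of `{0, 2, 3, 5}` and an order of `{1, 4, 3, 0, 2}` on every run, because ties within a bin are always resolved by original message index.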

Edit (2023-10-05):

Potential sources of non-determinism at the time of writing this edit (see below for more details):

Atomic counting sort
Atomics in Macro Environment Properties
RNG state not associated with agents (not a contributor? but will cause cascade differences)

ptheywood commented 1 year ago

Recently encountered a subtle bug in a message iteration loop which was hard and time-consuming to debug due to the non-determinism of the message loop, so this feature would definitely be useful.

For it to be available in distributed pyflamegpu, though, it would need to be runtime configurable somehow. Perhaps a flag on the message list which then triggers runtime dispatch of the correct path, or templated instantiation of an object which performs the PBM construction.

Something to consider implementing in the future at least.
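One way the two suggestions could combine is sketched below in C++: both variants are compiled as template instantiations, and a runtime flag selects between them, so a prebuilt Python package would not need recompiling. Everything here is hypothetical (`MessageOrder`, `scatter`, `scatterDispatch`); the non-stable branch simply visits messages in reverse as a stand-in for "whatever order the atomics happen to produce".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-message-list setting; not an existing FLAME GPU option.
enum class MessageOrder { Unordered, Deterministic };

// Compile-time variant of the scatter step. `cursor` holds the start index of
// each bin and is taken by value so it can be advanced locally.
template <bool Stable>
std::vector<std::size_t> scatter(const std::vector<unsigned>& bin,
                                 std::vector<std::size_t> cursor) {
    const std::size_t n = bin.size();
    std::vector<std::size_t> order(n);
    for (std::size_t k = 0; k < n; ++k) {
        // Stable: visit messages in their original order.
        // Otherwise: any visiting order is valid; reverse is used as a stand-in.
        const std::size_t i = Stable ? k : n - 1 - k;
        assert(bin[i] < cursor.size());
        order[cursor[bin[i]]++] = i;
    }
    return order;
}

// Runtime dispatch: both instantiations exist in the compiled library and the
// flag picks one when the message list is built.
std::vector<std::size_t> scatterDispatch(MessageOrder mode,
                                         const std::vector<unsigned>& bin,
                                         const std::vector<std::size_t>& binStart) {
    return mode == MessageOrder::Deterministic ? scatter<true>(bin, binStart)
                                               : scatter<false>(bin, binStart);
}
```

The cost of this design is one branch per message list build plus the extra compiled code for the second instantiation; the per-message work stays free of runtime checks.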

ptheywood commented 9 months ago

Macro Environment Properties also rely on atomics, so their use will likely result in non-deterministic simulations.

Not aware of a sensible alternative that would be usable from within an agent function for mutating these properties in a deterministic way due to undefined order of operation (Would probably need to use agents and messages instead).
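As an illustration of why the order of operations matters, the C++ sketch below accumulates the same floating-point contributions in two different orders, as a sequence of atomic adds might if threads were served differently between runs. This is only a generic example of floating-point non-associativity, with a hypothetical `sumInOrder` helper; it makes no claim about which operations or types a given model uses on its macro properties.

```cpp
#include <cassert>
#include <vector>

// Accumulate contributions strictly in the order given, mimicking one possible
// interleaving of atomic adds onto a single float property.
float sumInOrder(const std::vector<float>& contributions) {
    float total = 0.0f;
    for (float c : contributions) {
        total += c;  // each step rounds to float, so the order affects the result
    }
    return total;
}
```

With IEEE single precision, `sumInOrder({1e8f, 1.0f, -1e8f})` gives `0.0f` because the `1.0f` is lost when added to `1e8f`, whereas `sumInOrder({1e8f, -1e8f, 1.0f})` gives `1.0f`. The same three contributions therefore produce different results depending only on arrival order.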

RNG states also do not follow agents: the states are in a fixed order, and the first N are used for all agents currently in flight. True determinism might require RNG states to be fixed to an agent. This in itself shouldn't be non-deterministic, but might exacerbate the issue if agents change order / state due to a non-deterministic outcome.

This is probably worth its own separate, but potentially related, issue.
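The cascade effect can be sketched in C++ as follows. Both helpers are hypothetical, and seeding a `std::mt19937` with `seed + index` is a simplification chosen for brevity, not how FLAME GPU manages its RNG states. The point is only that slot-indexed streams give an agent different random numbers when its position changes, while agent-indexed streams do not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <random>
#include <vector>

using AgentId = std::uint32_t;

// Stream chosen by slot: the agent at position p uses stream p.
std::map<AgentId, std::uint32_t> drawBySlot(const std::vector<AgentId>& agents,
                                            std::uint32_t seed) {
    std::map<AgentId, std::uint32_t> draws;
    for (std::size_t slot = 0; slot < agents.size(); ++slot) {
        std::mt19937 stream(seed + static_cast<std::uint32_t>(slot));
        draws[agents[slot]] = static_cast<std::uint32_t>(stream());
    }
    return draws;
}

// Stream chosen by agent: an agent keeps its stream wherever it sits in the list.
std::map<AgentId, std::uint32_t> drawByAgent(const std::vector<AgentId>& agents,
                                             std::uint32_t seed) {
    std::map<AgentId, std::uint32_t> draws;
    for (AgentId id : agents) {
        std::mt19937 stream(seed + id);
        draws[id] = static_cast<std::uint32_t>(stream());
    }
    return draws;
}
```

Calling each helper on `{10, 20}` and on `{20, 10}` with the same seed shows the difference: `drawByAgent` returns the same value per agent for both orderings, while `drawBySlot` swaps which agent receives which stream.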

Robadob commented 9 months ago

This itself shouldn't be non-deterministic, but might exacerbate the issue if agents change order / state due to a non-deterministic outcome.

Agents are now sorted for spatial messaging, is a stable sort used?

Edit: It uses HostAPI, which uses cub/thrust sort, which I think is stable.

ptheywood commented 9 months ago

Agents are now sorted for spatial messaging, is a stable sort used?

HostAgentAPI::sort_async uses thrust::stable_sort_by_key
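For readers unfamiliar with the distinction, the C++ sketch below shows what stability guarantees, using `std::stable_sort` on the CPU as a stand-in for a stable sort by key. The helper name `stableOrderByKey` is hypothetical and the code does not reproduce `HostAgentAPI::sort_async`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the permutation that sorts agents by key. Agents with equal keys
// keep their original relative order, which is what makes the sort stable.
std::vector<std::size_t> stableOrderByKey(const std::vector<unsigned>& keys) {
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return order;
}
```

For keys `{3, 1, 3, 1, 2}` the result is always `{1, 3, 4, 0, 2}`. A non-stable sort would be free to return the two agents with key 1 (or key 3) in either order, which is where run-to-run differences could enter.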

Robadob commented 9 months ago

Note: thrust::sort is not guaranteed to be stable.

https://thrust.github.io/doc/group__sorting_ga3f47925d80f4970d5730051dba1c5603.html

FLAMEGPU / FLAMEGPU2

Determinism (Message Order) #417