codes-org / codes

The Co-Design of Exascale Storage Architectures (CODES) simulation framework builds upon the ROSS parallel discrete event simulation engine to provide high-performance simulation utilities and models for building scalable distributed systems simulations.

DFDally Synthetic: Random Permutation Traffic Determinism and Behavior Fixes #211

Closed nmcglo closed 1 year ago

nmcglo commented 3 years ago

This PR fixes a behavioral issue with the random permutation traffic pattern and makes it deterministic.

It brings the random permutation pattern (--traffic=2) in line with what is expected of random permutation traffic. The previous implementation did not actually perform a random permutation: there was no counter tracking how much data had been transmitted to the current destination, and consequently no conditional that checked such a counter to switch to a new destination. Each workload rank picked its own random destination once and sent all of its data to it.

The new implementation adds a command line argument, --rperm_threshold=, which lets the user set how much data a sending workload rank transmits to a specific rank before it chooses a new rperm destination. Ideally, this value should be a multiple of the payload size.
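
The threshold logic described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this PR: the `rperm_state` struct, the function names, and the small `next_rand` generator (a stand-in for the simulator's own random number streams) are all assumptions made for the example. Each rank keeps a byte counter for its current destination and draws a new destination once the counter reaches the threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-rank state; names are illustrative, not CODES identifiers. */
typedef struct {
    int      rank;          /* id of the sending workload rank */
    int      num_ranks;     /* total number of workload ranks */
    int      dest;          /* current rperm destination, -1 if none chosen yet */
    uint64_t bytes_to_dest; /* bytes sent to the current destination so far */
    uint64_t rng_state;     /* per-rank generator state, fixed by the seed */
} rperm_state;

/* Stand-in generator (splitmix64); the real model would use the simulator's RNG. */
static uint64_t next_rand(uint64_t *s)
{
    uint64_t z = (*s += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Pick a destination among the other ranks, never the sender itself. */
static int pick_dest(rperm_state *st)
{
    int d = (int)(next_rand(&st->rng_state) % (uint64_t)(st->num_ranks - 1));
    return (d >= st->rank) ? d + 1 : d;
}

void rperm_init(rperm_state *st, int rank, int num_ranks, uint64_t seed)
{
    assert(num_ranks > 1 && rank >= 0 && rank < num_ranks);
    st->rank = rank;
    st->num_ranks = num_ranks;
    st->dest = -1;
    st->bytes_to_dest = 0;
    /* Same seed and rank always give the same destination sequence. */
    st->rng_state = seed ^ ((uint64_t)rank * 0xD6E8FEB86659FD93ULL);
}

/* Return the destination for the next payload, switching to a new one once
 * the bytes sent to the current destination have reached the threshold. */
int rperm_next_dest(rperm_state *st, uint64_t payload_size, uint64_t threshold)
{
    if (st->dest < 0 || st->bytes_to_dest >= threshold) {
        st->dest = pick_dest(st);
        st->bytes_to_dest = 0;
    }
    st->bytes_to_dest += payload_size;
    return st->dest;
}
```

In this sketch, a threshold of 4096 bytes with 1024-byte payloads sends exactly four payloads to a destination before a new one is drawn. With a threshold that is not a multiple of the payload size (say 2500), the switch only happens after the third payload, at 3072 bytes, so the sender overshoots the threshold.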