Open SeanNijjar opened 8 months ago
FYI @jliangTT @pgkeller - not sure where the right ownership is for this, but I figured you guys would be a good starting point. This is for improved testing methodology, but it requires some lower-level improvements to get the benefit.
I don't really know which project board to add this to, but multi-device seems to be a good place for this to start.
To expand test variability and increase the likelihood of catching hangs during testing (particularly when running determinism tests), allow the NoC APIs to introduce artificial delays under the hood. These delays should be lightly configurable from the host side.
For example, the host can provide a fixed delay value, a delay per API entrypoint, or a small set of random delays (maybe it can store this sequence of delays in L1 and loop through it over time).
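As a rough sketch of what those three configuration options could look like as a single config block (all names here, such as `NocDelayConfig`, `NocDelayMode`, `NocEntrypoint`, and `next_delay_cycles`, are hypothetical and not existing APIs; a plain struct stands in for whatever the host would actually write into L1):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout: nothing here exists in the codebase today.
enum class NocDelayMode : std::uint8_t { None, Fixed, PerEntrypoint, Sequence };

// Hypothetical list of NoC API entrypoints that could carry a delay.
enum class NocEntrypoint : std::uint8_t { AsyncRead, AsyncWrite, SemaphoreInc, Count };

constexpr std::size_t kNumEntrypoints = static_cast<std::size_t>(NocEntrypoint::Count);
constexpr std::size_t kMaxSequenceLen = 8;  // assumed small so it fits easily in L1

// Config block the host would fill in before launch; on device it could live in L1.
struct NocDelayConfig {
    NocDelayMode mode = NocDelayMode::None;
    std::uint32_t fixed_cycles = 0;                                  // Fixed mode
    std::array<std::uint32_t, kNumEntrypoints> per_entrypoint_cycles{};  // PerEntrypoint mode
    std::array<std::uint32_t, kMaxSequenceLen> sequence{};           // Sequence mode (host-generated random values)
    std::uint32_t sequence_len = 0;                                  // number of valid entries in `sequence`
    std::uint32_t sequence_pos = 0;                                  // cursor into `sequence`, wraps around
};

// Returns the number of cycles the calling entrypoint should stall for,
// advancing the cursor when a looping sequence is in use.
inline std::uint32_t next_delay_cycles(NocDelayConfig& cfg, NocEntrypoint ep) {
    switch (cfg.mode) {
        case NocDelayMode::Fixed:
            return cfg.fixed_cycles;
        case NocDelayMode::PerEntrypoint:
            return cfg.per_entrypoint_cycles[static_cast<std::size_t>(ep)];
        case NocDelayMode::Sequence: {
            // Clamp the length so a bad host value cannot index out of bounds.
            const std::uint32_t len = cfg.sequence_len > kMaxSequenceLen
                                          ? static_cast<std::uint32_t>(kMaxSequenceLen)
                                          : cfg.sequence_len;
            if (len == 0) {
                return 0;
            }
            const std::uint32_t delay = cfg.sequence[cfg.sequence_pos % len];
            cfg.sequence_pos = (cfg.sequence_pos + 1) % len;
            return delay;
        }
        case NocDelayMode::None:
        default:
            return 0;
    }
}
```

The idea is that each NoC API would call something like `next_delay_cycles` and spin for that many cycles. Keeping the random values host-generated means the device side needs no RNG, and a failing run can be replayed with the same sequence.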
Here's an example to convey the idea (note that I put the delay only at the start, but I think there are use cases for having it at both the beginning and the end of the function):