asynchronics / asynchronix

High-performance asynchronous computation framework for system simulation
Apache License 2.0

Cycle detection #28

Open joshburkart opened 1 month ago

joshburkart commented 1 month ago

Hello, thank you for developing this very interesting project! I am currently trying to evaluate whether it can be used for a set of simulators I am designing. There could eventually be many constituent interconnected models that these simulators would run, which would be developed by a community of engineers, so one concern I have centers on preventing subtle model wiring issues like accidental cycles.

In particular, say there are three models, A, B, and C. What if an engineer accidentally wires them up so that, all within the same timestamp, A sends a message to B, then B to C, and then C to A? I assume this would simply spin indefinitely, right? The obvious way to avoid this is to delay one of the messages, but I worry this could become a common "gotcha" when dealing with complex simulations. Ideally I'd like a way to error out early and alert the engineer to the problem. Are there any elegant ways to approach this?

Thanks so much!

sbarral commented 1 month ago

Note that it is generally not desirable to prevent cyclic graphs at bench-building time, as connecting models in a cycle is most often legitimate. Cross-connecting model A and model B may not lead to an actual loop during execution, because a message sent by A to B may not trigger another message to A, or may trigger a message without causing the cycle to repeat. Sometimes, loops are also used in a controlled manner to model an iterative process.
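To make the "controlled loop" case concrete, here is a minimal sketch in plain Rust. It does not use the asynchronix API: the two models, the halving rule, and the function name `run_controlled_loop` are hypothetical stand-ins, and `tolerance` is assumed to be positive. A and B are cross-connected, yet the exchange terminates because B only replies while the value is still above the tolerance.

```rust
/// Counts the messages exchanged between two cross-connected, hypothetical
/// models A and B. Assumes `tolerance > 0.0` so that the loop terminates.
fn run_controlled_loop(initial: f64, tolerance: f64) -> u32 {
    let mut messages = 0;
    let mut value = initial;
    loop {
        // A -> B: A sends its current value; B refines it (here: halves it).
        messages += 1;
        let refined = value / 2.0;

        // B replies to A only while the result is still above the tolerance,
        // so the wiring is cyclic but the execution is not an endless loop.
        if refined <= tolerance {
            return messages;
        }

        // B -> A: the refined value goes back to A for another iteration.
        messages += 1;
        value = refined;
    }
}
```

With `run_controlled_loop(8.0, 1.0)` the exchange stops after 5 messages, even though the connection graph contains a cycle.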

Whether a cyclic graph will actually cause an unintended loop depends on the detailed implementation of the involved models so there is sadly no practical way to prevent this at compile time, but this could be done at run time. At the moment our ability to catch such loops at run time is not that great admittedly, but this will hopefully change soon.

One of the planned mitigations is deadlock detection. Indeed, in most practical cases, an unintended loop will generate a deadlock. For instance: 1) A sends a request to B, which sends a request back to A: this will deadlock since A will wait for the response to its request, but will not be able to respond to the request from B until its own request returns. 2) A sends an event to B, causing B to send 2 or more events to A, and A responds in a way that causes an infinite loop. In this case, A's mailbox will soon saturate at maximum capacity, causing a deadlock.
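Scenario 2 can be sketched with a bounded channel from the Rust standard library standing in for A's mailbox. This is only an illustration under stated assumptions: it does not use asynchronix's mailbox type, the function name `rounds_until_saturation` is hypothetical, and `capacity` is assumed to be at least 1. Where a real sender would wait on a full mailbox (hence the deadlock), the sketch uses `try_send` so that it can report saturation instead of hanging.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Returns how many events the hypothetical model A handles before its
/// bounded mailbox is found full. Assumes `capacity >= 1`.
fn rounds_until_saturation(capacity: usize) -> usize {
    // Bounded std channel used as a stand-in for A's mailbox.
    let (to_a, a_mailbox) = sync_channel::<u32>(capacity);

    // Seed the loop with one initial event for A.
    to_a.try_send(0).expect("an empty mailbox accepts the first event");

    let mut handled = 0;
    loop {
        // A takes one event from its mailbox and forwards it to B.
        let event = a_mailbox.try_recv().expect("mailbox is never empty here");
        handled += 1;

        // B answers every event with two events for A, so the mailbox
        // gains one entry per round.
        for _ in 0..2 {
            if let Err(TrySendError::Full(_)) = to_a.try_send(event + 1) {
                // A blocking send would wait here forever: the deadlock.
                return handled;
            }
        }
    }
}
```

In this sketch the mailbox grows by one event per round, so `rounds_until_saturation(4)` returns 4: saturation is reached after a number of rounds equal to the capacity.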

At the moment, such deadlocks are not detected: the step* methods will return on deadlock without any error. But we think we have figured out an efficient deadlock-detection algorithm, and the plan is to include it in the next version.

That being said, some infinite loops will not cause a deadlock. To catch these, we could introduce an optional timeout parameter to the simulator that would cause the step* methods to return with an error if the simulation is still running after some deadline.
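The timeout idea could look roughly like the sketch below. It is not the asynchronix API: `step_with_timeout`, `StepError`, and the event-handler closure are hypothetical names used for illustration, and the event queue is a plain `VecDeque`. The step processes events for the current timestamp and gives up with an error once a wall-clock deadline has passed.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical error type for a step that exceeded its deadline.
#[derive(Debug, PartialEq)]
enum StepError {
    Timeout,
}

/// Hypothetical stand-in for a simulator step. Processes every pending event,
/// where `handler` may produce a follow-up event for the same timestamp.
/// Returns the number of processed events, or an error if the deadline passes.
fn step_with_timeout(
    mut pending: VecDeque<u32>,
    handler: impl Fn(u32) -> Option<u32>,
    timeout: Duration,
) -> Result<u64, StepError> {
    let deadline = Instant::now() + timeout;
    let mut processed = 0u64;

    while let Some(event) = pending.pop_front() {
        // Check the deadline before handling each event.
        if Instant::now() >= deadline {
            return Err(StepError::Timeout);
        }
        if let Some(follow_up) = handler(event) {
            pending.push_back(follow_up);
        }
        processed += 1;
    }
    Ok(processed)
}
```

A handler such as `|model| Some((model + 1) % 3)` mimics the A to B to C to A cycle from the original question: it never runs out of events, so the step returns `Err(StepError::Timeout)`. A handler that eventually returns `None` lets the step finish with `Ok`. Checking a wall-clock deadline on every event keeps the sketch simple; an event-count budget would be a deterministic alternative.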

Would deadlock detection and timeout handling be satisfactory for your intended use?

joshburkart commented 1 month ago

Appreciate your response and the detailed context. Yes, deadlock detection and timeout handling both sound like very desirable features. Thanks!

sbarral commented 1 month ago

Perfect, added to the v0.3 milestone. Let's leave it open until then.