lf-lang / lingua-franca

Intuitive concurrent programming in any language
https://www.lf-lang.org
Other
220 stars 57 forks source link

Timeout in SpuriousDependency in TS #2293

Open petervdonovan opened 1 month ago

petervdonovan commented 1 month ago

A bug is suspected in the level assignment for federated execution. This is the level assignment which is based on TPO. See https://github.com/lf-lang/lingua-franca/actions/runs/9148420154/job/25150943808?pr=2262. This link will go stale, so here is the relevant part of the logs:

    INFO (reactor-ts): 'id (federationID)' property overridden by command line argument.
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> Initializing
    INFO (reactor-ts): 'id (federationID)' property overridden by command line argument.
    INFO (reactor-ts): >>> Spent (0 secs; 6705920 nsecs) checking the precedence graph.
    INFO (reactor-ts): Connecting to RTI on port: 15045
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): Scheduling federate start for (1716130121 secs; 804353024 nsecs)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> Initializing
    INFO (reactor-ts): >>> Spent (0 secs; 14012160 nsecs) checking the precedence graph.
    INFO (reactor-ts): Connecting to RTI on port: 15045
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): Scheduling federate start for (1716130121 secs; 804353024 nsecs)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): 'id (federationID)' property overridden by command line argument.
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> Initializing
    INFO (reactor-ts): >>> Spent (0 secs; 5295872 nsecs) checking the precedence graph.
    INFO (reactor-ts): Connecting to RTI on port: 15045
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): Scheduling federate start for (1716130121 secs; 804353024 nsecs)
    INFO (reactor-ts): ================================================================================
    Hello from passthrough 1
    Hello from passthrough 0
    count is now 1
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> Start of execution: ((1716130121 secs; 804353024 nsecs), 0)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> Start of execution: ((1716130121 secs; 804353024 nsecs), 0)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> Start of execution: ((1716130121 secs; 804353024 nsecs), 0)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> End of execution at (logical) time: (1716130122 secs; 804353024 nsecs)
    INFO (reactor-ts): >>> Elapsed physical time: (1 secs; 3052032 nsecs)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): RTI socket has closed.
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): >>> End of execution at (logical) time: (1716130122 secs; 804353024 nsecs)
    INFO (reactor-ts): >>> Elapsed physical time: (1 secs; 3648000 nsecs)
    INFO (reactor-ts): ================================================================================
    INFO (reactor-ts): RTI socket has closed.

This could happen because two mutually dependent network reactions were somehow scheduled with the same level. One reason why this would only manifest in the TS target would be because in the C target there are multiple cores, so reactions on the same level are less likely to block each other.

However, it looks like this is a deadlock during shutdown, so it could be a bug in the coordination of shutdown instead of a problem with level assignment. It is possible that the main reason to suspect the level assignment is merely that the purpose of the SpuriousDependency test is to check TPO.