Open domichae-amazon opened 1 year ago
When an ExchangeContext is ShutDown() while the ExchangeContext is in the middle of sending a message (SendMessage)
I'm not sure what you mean by "ShutDown()" (there is no such API on ExchangeContext), but what exact thing nulled out mExchangeMgr
and how did it get called in the middle of SendMessage
?
Or is the real issue that SendMessage
is being called after the exchange manager has been shut down? @domichae-amazon
That's correct - the issue is that the Exchange Manager was shut down, which in turn clears out some of the state of the Exchange Context which may still be alive.
the issue is that the Exchange Manager was shut down, which in turn clears out some of the state of the Exchange Context which may still be alive.
Why is the exchange context still alive? All exchanges should be closed before the exchange manager shuts down, precisely because of this problem. Who is not doing that, and why not?
What is the exact sequence of events that got us here?
We discussed this issue more over on the pull request. I did some more digging based on the information that @bzbarsky-apple shared and found the root cause: the darwin
variant of the tv-casting-app
example project does not take the Matter stack lock before any interactions with the Matter SDK. This doesn't match the expectations of the SDK and could theoretically result in crashes in a few different places. The specific event leading to a crash here is that the sample iOS app calls Server::GetInstance().Shutdown()
in a non-thread-safe way (from the application's main DispatchQueue). I'll update the content of the bug to indicate the real issue and create a fresh pull request that addresses it.
Reproduction steps
Problem
The
darwin
version of thetv-casting-app
doesn't take the SDK lock prior to interacting with the SDK. When thetv-casting-app
is backgrounded, it attempts toShutdown
theServer
in a non-thread-safe manner. This eventually results in a crash if theExchangeContext
is in the middle of sending a message (SendMessage
), when this happens due to a segmentation fault on accessingGetSessionManager()->SystemLayer()
, whereGetSessionManager()
returnsnullptr
.Impact
The
Darwin
version of the TV casting example crashes.Bug prevalence
Sometimes
GitHub hash of the SDK that was being used
02bc67e245dbc0e85b3ab772c1f81eb27b953420
Platform
darwin
Platform Version(s)
N/A
Anything else?
ExchangeContext fault location