Closed enricogior closed 7 years ago
About running only one connection per peer, in that case, we could control it by running only one NotificationAction per peer, so that shouldn't be a big deal, in fact, we did this using custom peer pool policy manager. Or we could also queue them, as Yaron said.
About this issue, I am little bit confused, because we receive only one peerAvailabilityChanged
event, so we should start only one NotificationAction
, but we call multiConnect
4 times. I will try to debug it later on.
@yaronyg According to your answer in #1941, should we then handle it on Node layer or ThaliCore?
@yaronyg to add context to @mlesnic question: in the native layer, both 'multiConnectToPeer()' and 'BrowserManager.connectToPeer()' don't have any mechanism to detect if multiple parallel connections to the same peer have been issued by the caller. So right now the result is unpredictable, it may work or not based on timing. I'm more than willing to add detection and error handling if we decide we should do so.
@enricogior Per this we are explicitly prohibited from ever having multiple outstanding multi-connect calls to the same PeerID. We currently enforce this in a really ham fisted way here. What we are doing is enquing each multiConnect request and then running them one at a time. We will only go to the next one after the first one has finished. In other words, unless there is some horrible bug, there should NEVER be more than one multiConnect outstanding, period. Not just one multiConnect per the same peerID. No, this is more draconian. There should only be one and only one multiConnect outstanding at once.
So it doesn't matter that we first 1000 multiConnects for the same peer (or different peers). One and only one multiConnect request will hit the native layer and the next one won't show up until the first one gets a response.
So what am I missing?
@yaronyg Yess, you're right, no idea how I missed it.. In that case, the issue is different here. @enricogior If you will be able to reproduce it, please provide full logs from all devices to understand what is actually happening here.
@yaronyg @mlesnic you guys are right, I got confused reading the log, the calls to connectToPeer() are done after each one is completed, so the problem is somewhere else.
Turned out the issue was testThaliNotificationClient
, where in few places we create instance of NotificationClient
, start it and do not stop after test. After that we had additional listeners for peerAvailabilityChanged
and instead of firing one NotificationAction
, we fired four. The fix is already on master_fix_ios_replication
.
Using this branch of Thali and this branch of ThaliCore. Running ThaliTest on two devices.
Multiple
Issuing multiConnect for 8A0345A5-97B2-4955-A983-4FB2A1BB75A4
fired in parallel without waiting for the previous call to complete.