latos / wave-protocol

Automatically exported from code.google.com/p/wave-protocol
0 stars 0 forks source link

WIAB client's network connection logic is racy #126

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago

The behaviour of auto-open of the first wave in a wave list (perhaps we should 
just remove that behaviour, once someone can reverse engineer the index wave 
code) can trigger socket communication before the connection is established.

I have a simple fix ready, which just ignores list-selection events if the 
socket is not connected, but there are a number of fixes required.  The proper 
fix involves:
 * being clear that the network's DISCONNECTED state means transient disconnection; i.e., the client will try to reconnect if that state occurs.  A terminal CLOSED state should be added if we ever want the client not to reconnect.
 * adding in backoff for reconnection attempts.
 * client's socket layer should queue client messages (rather than die) when in a transient disconnected state.  Should still die if the client tries to send a messsage in a CLOSED state, if that state gets added.

Original issue reported on code.google.com by hearn...@google.com on 27 Oct 2010 at 5:27

GoogleCodeExporter commented 8 years ago
Hi David. 
If I understand it correctly, the index wave code was already removed. Does it 
affect somehow this issue?

Original comment by vega113 on 25 Apr 2011 at 11:02

GoogleCodeExporter commented 8 years ago
The problem was that the communication channel was flaky (RPCs issued while not 
in a connected state caused errors), so I don't think the index-wave removal 
would not have fixed that.

The problem was revealed to me in the example above, where the old list panel 
would try to open the first wave in the list before the [web]socket[.io] was 
connected.

If the new search panel doesn't trigger wave-open on load, then the problem is 
lessened, but I think the underlying socket issue is still there.

Original comment by david.he...@gmail.com on 26 Apr 2011 at 8:42

GoogleCodeExporter commented 8 years ago
But what about the TODO at line 110 in StageTwoProvider? Can it be addressed 
now? I mean, can the delay of 200 ms in the wave loading be removed now? It 
seems to me that it can.

Original comment by vega113 on 26 Apr 2011 at 10:48

GoogleCodeExporter commented 8 years ago
Actually, looks like this bug was probably fixed by patch 691:73cdebfab016

"Adds message queuing in WaveWebSocketClient so that rpcs issues while 
disconnected do not crash."

So yes, my guess is that the TODO can now be addressed, and the delay removed.

The true test is Firefox, which uses socket.io, where the connection 
establishment can take a few seconds.  After removing that delay, then if is it 
possible to load WIAB in Firefox, refresh over and over, clicking on waves as 
soon as possible (before the socket.io is connected) and no shiny occurs, then 
all is well.

Original comment by hearn...@google.com on 27 Apr 2011 at 1:49