Glass cyclically disconnects in NT4 mode

chauser commented 1 year ago

Transferring this from Discussion #5262. We observed several times over the season that Glass would enter a mode where it would connect, disconnect within a second, reconnect, and so on. When it occurred, it happened on multiple Glass clients (2) connected to the robot at the same time. Switching to NT3 mode ended the cycle. This continued even with the latest WPILib release.

At the time there was no opportunity to gather more data about what was happening and I do not know how to reproduce the problem.

I'm putting it here as a discussion rather than an issue to find out if others have noticed the behavior. I'll dig deeper if it happens again, but we are going to be interacting with the robot a lot less so it may not come up again for us this year.

@PeterJohnson replied:

This has been reported a few times. In general it is more likely to happen when there are a large number of topics (~1000), or a large amount of data per topic, and a constrained network environment (poor wireless). It was more common in earlier releases, as we've made a number of changes to try to address this throughout the season, but it's been too high risk to make the more significant changes required to completely address the underlying issue mid-season.

Fundamentally, the cause is that the amount of data that needs to be sent in response to the initial subscription creates a backlog in the network connection, and the server terminates the connection if the backlog doesn't clear (and there's more data to be sent) within ~1 second. Glass creates a bigger challenge for this than other dashboards, because it subscribes not only to the topics, but also to all the "meta" topics (these are topics that describe the clients, publishers, and subscribers for each of the "real" topics), which roughly translates into 3x the number of topics and initial data being sent.
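To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. The ~1 second window and the roughly 3x meta-topic multiplier come from the explanation above; the per-topic size and link throughput are made-up illustrative numbers, and the function names are hypothetical, not part of ntcore.

```python
# Hypothetical estimate of the initial subscription burst; not ntcore code.

DISCONNECT_WINDOW_S = 1.0  # "~1 second" backlog limit described above


def initial_burst_bytes(num_topics, bytes_per_topic, include_meta):
    """Estimate the data sent in response to the initial subscription."""
    # Subscribing to meta topics roughly triples the topic count (per the comment above).
    multiplier = 3 if include_meta else 1
    return num_topics * multiplier * bytes_per_topic


def drain_seconds(burst_bytes, link_bytes_per_second):
    """Time needed to clear the backlog at a given effective throughput."""
    return burst_bytes / link_bytes_per_second


# Assumed numbers: 1000 topics, 200 bytes each, 250 kB/s effective wireless throughput.
with_meta = drain_seconds(initial_burst_bytes(1000, 200, True), 250_000)
without_meta = drain_seconds(initial_burst_bytes(1000, 200, False), 250_000)
print(with_meta > DISCONNECT_WINDOW_S, without_meta > DISCONNECT_WINDOW_S)
```

Under these assumed numbers, the same set of topics clears within the window for a client that skips the meta topics but not for one that subscribes to them, which matches the observation that Glass is hit harder than other dashboards.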

I have a few strategies in mind for addressing the issue; the first one is the real solution.

1. Rate limit the initial burst of subscription data (e.g. space the transmits out rather than sending them as one big burst, so that other updates have a chance to make it through). This is a little tricky to do because of ordering concerns: we can't send values for a topic until the publish message is sent, we need to make sure that the current value is sent "eventually" if it's not sent due to some other change, and we also don't want to send the current value if a "newer" value is sent in the interim. It's certainly possible to do with the right flags etc., but the complexity of this is why we didn't make a mid-season change.
2. Change Glass to only subscribe to meta-topics if that information is actually being shown.
3. Add a publisher option to not keep the last value (this then won't send any value until a new value is published).
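As an illustration of the ordering concerns in the rate-limiting strategy, here is a minimal sketch of a paced sender. It is not how ntcore implements this; the class name `PacedInitialSender`, the tuple message format, and the `batch_size` parameter are all hypothetical. It only demonstrates the three constraints listed above: the publish message precedes any value, every topic's current value is sent eventually, and a stale initial value is skipped once a newer one has gone out.

```python
from collections import OrderedDict


class PacedInitialSender:
    """Hypothetical sketch of pacing the initial subscription burst; not ntcore code."""

    def __init__(self, topics, send, batch_size):
        # topics: mapping of topic name -> current value at subscribe time.
        self._pending = OrderedDict(topics)  # initial values still owed to the client
        self._published = set()              # topics whose publish message was sent
        self._send = send                    # callable that transmits one message
        self._batch_size = batch_size        # max initial values sent per tick

    def _publish(self, name):
        # Constraint 1: the publish message must go out before any value.
        if name not in self._published:
            self._send(("publish", name))
            self._published.add(name)

    def on_value_change(self, name, value):
        # Constraint 3: a newer value replaces the owed initial value,
        # so the stale one is never sent afterwards.
        self._pending.pop(name, None)
        self._publish(name)
        self._send(("value", name, value))

    def tick(self):
        # Constraint 2: called periodically, so every owed value goes out eventually,
        # but only batch_size at a time to leave room for other updates.
        for _ in range(min(self._batch_size, len(self._pending))):
            name, value = self._pending.popitem(last=False)
            self._publish(name)
            self._send(("value", name, value))
        return len(self._pending)  # number of initial values still owed


sent = []
sender = PacedInitialSender({"a": 1, "b": 2, "c": 3}, sent.append, batch_size=1)
sender.on_value_change("b", 20)  # live update arrives before b's initial value is sent
remaining_after_first = sender.tick()
remaining_after_second = sender.tick()
```

In this run the update for `b` goes out immediately, and the two later ticks send only `a` and `c`; the superseded initial value of `b` is never transmitted.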

PeterJohnson commented 1 year ago

Fixed in #5659.

AngleSideAngle commented 9 months ago

Similarly to #5817, my team and several others (@Bankst @shueja) continue to experience this issue in Glass and OutlineViewer 2024.1.1 on Windows 10.

PeterJohnson commented 9 months ago

Yes, this came back due to a late change. Reopening until that is fixed.