akhenry / openmct-yamcs

Open MCT YAMCS plugin

Open MCT is client rate limiting too eagerly on load #424

Closed akhenry closed 5 months ago

akhenry commented 7 months ago

Summary

The recent implementation of client rate limiting is being triggered fairly consistently on first load in all of our deployment environments. My suspicion is that the WebSocket and remote clock subscriptions are established early enough that, when UI loading blocks for more than 1s, the telemetry buffer overflows and client rate limiting kicks in.
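To make the suspected mechanism concrete, here is a minimal sketch of a bounded buffer that drops messages once the consumer stops draining it. The class name, capacity, and message rate are all hypothetical and chosen only for illustration; this is not the openmct-yamcs implementation.

```typescript
// Hypothetical sketch: names and numbers are illustrative, not taken from openmct-yamcs.
class BoundedBatchBuffer<T> {
  private items: T[] = [];
  private dropped = 0;
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Called for every incoming message; returns false when the message is dropped.
  push(item: T): boolean {
    if (this.items.length >= this.capacity) {
      this.dropped += 1;
      return false;
    }
    this.items.push(item);
    return true;
  }

  // Called by the consumer when it is free to process a batch.
  // Returns the buffered items and how many were dropped since the last flush.
  flush(): { batch: T[]; dropped: number } {
    const result = { batch: this.items, dropped: this.dropped };
    this.items = [];
    this.dropped = 0;
    return result;
  }
}

// Simulate messages arriving at a steady rate while the consumer is blocked
// for `blockedMs`, then report how many messages were dropped.
function simulateBlockedLoad(
  capacity: number,
  messagesPerSecond: number,
  blockedMs: number
): number {
  const buffer = new BoundedBatchBuffer<number>(capacity);
  const total = Math.floor((messagesPerSecond * blockedMs) / 1000);
  for (let i = 0; i < total; i++) {
    buffer.push(i);
  }
  return buffer.flush().dropped;
}

// Consumer drains within 1 s: 500 messages fit in a buffer of 1000.
const droppedWhenResponsive = simulateBlockedLoad(1000, 500, 1000);
// Consumer blocked for 3 s: 1500 messages arrive, so 500 overflow the buffer.
const droppedWhenBlocked = simulateBlockedLoad(1000, 500, 3000);
```

Under these assumed numbers, nothing is dropped while the consumer keeps up, but a 3 s stall during load is enough to overflow the buffer, which would be reported as rate limiting even though the steady-state load is fine.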

Although the warning is somewhat innocuous, it is firing so regularly that users will learn to ignore it and miss genuine rate limiting events.

Expected vs Current Behavior

The Client Rate Limiting warning notification should only appear when the client is under unexpected CPU load, and not during normal application initialization.
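One possible way to express this expectation in code is a gate that ignores drops during an initial grace period. This is only a sketch of the desired behavior: `RateLimitWarningGate` and the grace period value are hypothetical, and the eventual fix may well take a different approach.

```typescript
// Hypothetical sketch: decides whether a drop should surface a user-facing warning.
class RateLimitWarningGate {
  private readonly startedAtMs: number;
  private readonly gracePeriodMs: number;

  constructor(startedAtMs: number, gracePeriodMs: number) {
    this.startedAtMs = startedAtMs;
    this.gracePeriodMs = gracePeriodMs;
  }

  // Warn only if something was dropped and the application is past its startup window.
  // Timestamps are passed in so the logic stays deterministic and easy to test.
  shouldWarn(droppedCount: number, nowMs: number): boolean {
    if (droppedCount <= 0) {
      return false;
    }
    return nowMs - this.startedAtMs >= this.gracePeriodMs;
  }
}

// Assumed values: application started at t = 0 ms with a 5 s grace period.
const gate = new RateLimitWarningGate(0, 5000);
const warnDuringLoad = gate.shouldWarn(200, 1500);  // drop during initialization
const warnAfterLoad = gate.shouldWarn(200, 60000);  // drop during normal operation
const warnNoDrops = gate.shouldWarn(0, 60000);      // nothing dropped
```

With these assumed values, a drop 1.5 s after startup is suppressed, while the same drop a minute later produces a warning.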

Impact Check List

Steps to Reproduce

  1. Load the application in one of our test environments
  2. Observe that a yellow notification appears warning the user about Client Rate Limiting

Environment

• Open MCT Version:
• Deployment Type:
• OS:
• Browser:

Additional Information

akhenry commented 6 months ago

Testing notes

Confirm that client rate limiting warning does not appear on first load

  1. Navigate to a complex layout
  2. Reload the application
  3. Confirm that the warning "Telemetry dropped due to client rate limiting" is not displayed

Confirm that all telemetry views are showing real-time telemetry

  1. Create a new display layout
  2. Select a parameter from the tree and add it to the display layout as a:
    1. Alpha-numeric Telemetry View
    2. Plot
    3. Table
    4. LAD Table
    5. Gauge
  3. Confirm that all of the above views of the telemetry value are updating in real-time and showing the same telemetry.

Confirm that Open MCT telemetry resumes after loss of connectivity

  1. Navigate to a complex layout
  2. Confirm that telemetry is flowing
  3. Disable your wifi connection
  4. Enable your wifi connection
  5. Confirm that telemetry resumes flowing with no user input once internet connection is re-established.

Confirm that telemetry continues to flow when tab is inactive

  1. Navigate to a complex layout
  2. Open the same complex layout a second time, but in a separate window. Keep this window in view the entire time.
  3. Confirm that telemetry is flowing in both
  4. In the first window, switch to a different browser tab and wait for 10 seconds or so.
  5. Switch back to the previous tab.
  6. Confirm that:
    1. Telemetry is still flowing
    2. The same telemetry is shown in both the tab and the separate window
    3. No warnings about client rate limiting have appeared in either window.

ozyx commented 5 months ago

Verified -- Testathon 3/14/24 🥧

All the above scenarios passed with flying colors on a very complex display.

davetsay commented 5 months ago

verified testing instructions

I did observe that, with two windows showing the same view, the time conductor clocks were not ticking on exactly the same timestamps. This may be a result of the rendering time in each window.

akhenry commented 5 months ago

@davetsay Good catch, thank you. I hadn't anticipated this problem. Basically what's happening is that the two windows currently maintain independent workers and subscriptions to Yamcs, so they are both updating the screen on slightly different cycles. While this was always the case, the new 1Hz batching exacerbates the phenomenon and makes it visible to the user, however briefly.
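The effect can be illustrated with a small model of two independent batchers that flush once per second but started at different moments. The function names, the 100 ms sample period, and the start offsets are assumptions made for this sketch, not details of the actual workers.

```typescript
// Most recent flush time (ms) at or before `nowMs` for a batcher that
// started at `startMs` and flushes every `intervalMs`.
function lastFlushTime(startMs: number, nowMs: number, intervalMs: number): number {
  const elapsed = nowMs - startMs;
  if (elapsed < 0) {
    throw new Error("nowMs precedes startMs");
  }
  return startMs + Math.floor(elapsed / intervalMs) * intervalMs;
}

// Timestamp shown on screen at `nowMs`: the newest sample that existed at the
// last flush, assuming samples are produced at multiples of `samplePeriodMs`.
function displayedTimestamp(
  startMs: number,
  nowMs: number,
  intervalMs: number,
  samplePeriodMs: number
): number {
  const flushedAt = lastFlushTime(startMs, nowMs, intervalMs);
  return Math.floor(flushedAt / samplePeriodMs) * samplePeriodMs;
}

// Two windows with 1 Hz batching whose workers started 400 ms apart,
// both observed at the same wall-clock time (5500 ms).
const windowA = displayedTimestamp(0, 5500, 1000, 100);
const windowB = displayedTimestamp(400, 5500, 1000, 100);
const skewMs = windowB - windowA;
```

In this model the two windows show timestamps 400 ms apart at the same instant, purely because their flush cycles have different phases. With faster updates the same skew exists but is too small to notice.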

It might be possible to fix this with some effort, but it's probably not trivial. I'd like to treat this as an enhancement, and get some user feedback before we do anything.

davetsay commented 5 months ago

verified.

@akhenry, one observation on the step "Confirm that Open MCT telemetry resumes after loss of connectivity": I did get the "Telemetry dropped due to client rate limiting" message on reconnection.