FlowFuse / flowfuse

Build bespoke, flexible, and resilient manufacturing low-code applications with FlowFuse and Node-RED
https://flowfuse.com

Clearer CTA with "High Memory Usage" warnings #4152

Open joepavitt opened 2 months ago

joepavitt commented 2 months ago

Description

See example UX: https://eu.posthog.com/project/2209/replay/01909b5a-3398-7d2e-ad13-4df51078012c

We warn users if they're exceeding 75% CPU utilization on instances, but we don't make it clear what they can do to remedy the situation.

The user here clicks "Update" next to the instance size/NR version, and presumably thinks this will resolve the problem, but that button updates the Node-RED version, not the instance size.

Ideally, we'd point them towards the "Upgrade Instance Size" option, or at least steer them in that direction in the messaging.

Which customers would this be available to

Everyone - CE/Starter/Team/Enterprise

Have you provided an initial effort estimate for this issue?

I have provided an initial effort estimate

### Tasks
- [ ] https://github.com/FlowFuse/flowfuse/issues/4188
- [ ] https://github.com/FlowFuse/flowfuse/issues/4193

ZJvandeWeg commented 2 months ago

cc @gstout52

hardillb commented 2 months ago

We do have more data available, so we could include some charting to indicate whether it was a point event or steady growth over time. The nr-launcher should be collecting some history of the memory values.
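
To make the "point event vs. steady growth" distinction concrete, here is a minimal sketch of how a series of memory readings could be labelled. It is only an illustration: the function name, the thresholds (`spike_factor`, `growth_ratio`) and the half-vs-half comparison are assumptions, not anything taken from the nr-launcher or the FlowFuse code base.

```python
from statistics import median


def classify_memory_trend(values, spike_factor=1.5, growth_ratio=1.2):
    """Label memory readings as 'steady growth', 'spike' or 'stable'.

    Hypothetical helper: thresholds are illustrative defaults only.
    """
    if len(values) < 4:
        raise ValueError("need at least 4 samples")

    half = len(values) // 2
    first, second = values[:half], values[half:]

    # Steady growth: the typical level of the later half is clearly
    # higher than the typical level of the earlier half.
    if median(second) >= growth_ratio * median(first):
        return "steady growth"

    # Point event: the peak stands well above the typical level of the
    # whole series, but the typical level itself has not moved.
    if max(values) >= spike_factor * median(values):
        return "spike"

    return "stable"
```

Using medians rather than means keeps a single outlier from being mistaken for growth, which is the case the warning message would want to treat differently.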

joepavitt commented 2 months ago

Is that a history of data, @hardillb? My thinking is:

hardillb commented 2 months ago

Each instance has its own Prometheus data endpoint we can poll.

joepavitt commented 2 months ago

> Each instance has its own Prometheus data endpoint we can poll.

What extent of data do we get here out of interest?

hardillb commented 2 months ago

Looking at the code, the nr-launcher keeps a rolling average of the memory and CPU usage over the last 5 minutes (sampling every 10 seconds, keeping 30 samples). This is what it uses to trigger the audit log entries.
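
As a rough sketch of that mechanism (not the nr-launcher's actual code), a fixed-size window of 30 samples taken every 10 seconds covers 5 minutes, and a warning could be raised when the window average exceeds a threshold. The class and function names are hypothetical, the 75% threshold is taken from the issue description, and waiting for a full window before warning is an assumption.

```python
from collections import deque

WINDOW_SIZE = 30        # samples kept (from the thread)
INTERVAL_SECONDS = 10   # sampling interval (from the thread)
THRESHOLD_PERCENT = 75  # warning level from the issue description


class RollingAverage:
    """Keeps the most recent `size` values and reports their mean."""

    def __init__(self, size=WINDOW_SIZE):
        # deque(maxlen=...) silently drops the oldest value when full.
        self.samples = deque(maxlen=size)

    def add(self, value):
        self.samples.append(value)

    def average(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def is_full(self):
        return len(self.samples) == self.samples.maxlen


def should_warn(window, threshold=THRESHOLD_PERCENT):
    # Assumption: only warn once a full 5-minute window has been collected,
    # so a single early reading cannot trigger an audit log entry.
    return window.is_full() and window.average() > threshold
```

Averaging over the window means a brief burst does not trigger the warning on its own; usage has to stay high across most of the 5 minutes.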

We can poll the nr-launcher for the last 1000 samples (~2.8 hours at one sample every 10 seconds); we could increase this if helpful.

The samples are as follows:

{
    "cpu": 0.3374200000007477,
    "ps": 83.44921875,
    "ela": 0.010179761270875764,
    "el99": 0.011206655,
    "hs": 1048576,
    "hu": 248272,
    "ts": 1721203420046
}
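
For anyone consuming these samples, a small sketch of turning one into display-friendly values. The thread does not define the fields, so the readings used here are assumptions: `ts` as a Unix timestamp in milliseconds, `hs` and `hu` as heap size and heap used, and `cpu` as a CPU usage figure. The `summarise` helper is hypothetical.

```python
import json
from datetime import datetime, timezone

# The sample from the thread, verbatim.
RAW_SAMPLE = """
{
    "cpu": 0.3374200000007477,
    "ps": 83.44921875,
    "ela": 0.010179761270875764,
    "el99": 0.011206655,
    "hs": 1048576,
    "hu": 248272,
    "ts": 1721203420046
}
"""


def summarise(sample):
    """Reduce a raw sample to a few chart-friendly values.

    Assumed field meanings (not confirmed in the thread):
    ts = epoch milliseconds, hs = heap size, hu = heap used.
    """
    taken_at = datetime.fromtimestamp(sample["ts"] / 1000, tz=timezone.utc)
    return {
        "time": taken_at.isoformat(timespec="seconds"),
        "cpu": round(sample["cpu"], 2),
        "heap_used_fraction": round(sample["hu"] / sample["hs"], 3),
    }


if __name__ == "__main__":
    print(summarise(json.loads(RAW_SAMPLE)))
```

If the assumed meanings hold, a list of such summaries is already close to what a chart would need: one time axis and one value per series.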

joepavitt commented 2 months ago

Thanks for the details Ben - a "Performance" tab for the Instance, with graphical insight into the performance feels like an obvious win here?

hardillb commented 2 months ago

Rough design:

joepavitt commented 2 months ago

Thanks Ben - I'll open a new issue (as part of the tasklist for this item) and add it to the planning board for Nick and me to discuss.