sammachin opened this issue 2 years ago
@sammachin The first iteration probably only needs to track OOM events. The rest isn't important for this release.
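For reference, a minimal sketch of how an OOM event could be detected on the Docker side, assuming dockerode and a known container ID (illustrative only, not the actual driver code):

```js
const Docker = require('dockerode')
const docker = new Docker() // assumes the default /var/run/docker.sock socket

// State.OOMKilled is part of Docker's standard container inspect response
async function wasOOMKilled (containerId) {
    const info = await docker.getContainer(containerId).inspect()
    return info.State.OOMKilled === true
}
```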
I disagree with the scheduling/timing of this one. We should invest in features that are in one tier and not the other, so we're selling a use-case rather than resources. Further: Node-RED is a low-code platform and FlowForge should abstract away from CPU/Memory insights; our customers aren't running hardware, they're integrating software.
This should be an extension to the container driver API so it can be implemented independently for each backend.
Dockerode has support for the Docker Stats endpoint, which should be the starting point: https://docs.docker.com/engine/api/v1.37/#tag/Container/operation/ContainerStats
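A rough sketch of pulling a one-shot memory sample via dockerode; the field names come from the Docker stats response, but the function and wiring are illustrative only:

```js
const Docker = require('dockerode')
const docker = new Docker() // assumes the default /var/run/docker.sock socket

async function getMemoryStats (containerId) {
    const container = docker.getContainer(containerId)
    // stream: false requests a single stats sample rather than a live stream
    const stats = await container.stats({ stream: false })
    return {
        used: stats.memory_stats.usage,  // bytes currently in use
        limit: stats.memory_stats.limit  // memory limit in bytes
    }
}
```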
Will look for K8s and localfs equivalents.
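For localfs there is no container runtime to query, so one option, purely as an assumption (the pidusage package is not something the driver is confirmed to use), would be sampling the Node-RED process by PID:

```js
const pidusage = require('pidusage')

async function getLocalUsage (pid) {
    const stats = await pidusage(pid)
    return {
        used: stats.memory, // resident set size in bytes
        cpu: stats.cpu      // percentage of a single core
    }
}
```

Note that a bare process has no hard memory limit, so localfs would need some other notion of a limit before usage can be shown as a percentage.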
The K8s metrics-server might have enough information: https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/
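A hedged sketch of reading that via @kubernetes/client-node, assuming metrics-server is installed in the cluster; the namespace/pod lookup is illustrative:

```js
const k8s = require('@kubernetes/client-node')

const kc = new k8s.KubeConfig()
kc.loadFromDefault()
const metricsClient = new k8s.Metrics(kc)

async function getPodUsage (namespace, podName) {
    const podMetrics = await metricsClient.getPodMetrics(namespace)
    const pod = podMetrics.items.find(p => p.metadata.name === podName)
    if (!pod) return null
    // metrics-server reports per-container usage as Kubernetes quantities, e.g. "123456Ki"
    return pod.containers.map(c => ({
        name: c.name,
        cpu: c.usage.cpu,
        memory: c.usage.memory
    }))
}
```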
If they're only seeing OOM events, then I think it's too vague for the user. Whilst we should flag these, I also don't see any harm in making the "Capacity/Usage" visible to a user, albeit we may disguise the technical (CPU/Memory) terminology.
@joepavitt Given this is a story and not an epic, I'm cautious about the breadth of the scope in one iteration. A fully fledged dashboard with at least 30 days of data retention doesn't feel appealing in terms of scope. The linked APIs from @hardillb all provide insights into current usage, so there's normalisation needed as well?
Taken from our values in the handbook:
Ship the minimum viable change possible. Small changes allow fast feedback loops which in turn can aid in deciding the next minimal viable iteration. Further, iteration naturally splits big problems into small steps, creates positive momentum, and allows us to capture value quicker.
Yes, I'm just trying to get a good handle on what the platforms can actually provide. But they will need normalising to be 0-100% of what the stack (in combination with the driver) allows.
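A trivial sketch of that normalisation step; the handling of drivers that report no limit is an assumption:

```js
// Express raw usage as a percentage of the limit the driver reports
function toPercent (used, limit) {
    if (!limit || limit <= 0) return null // no meaningful limit known for this stack/driver
    return Math.min(100, Math.round((used / limit) * 100))
}

// e.g. toPercent(134217728, 268435456) === 50 (128MB used of a 256MB limit)
```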
"Given this is a story and not an epic I'm cautious on the breadth of the scope in one iteration" - very fair.
I do think it's worth recording the CPU/Memory usage as an Epic then; I could see that being in the product at some point and being of value, and just because it doesn't fit into a 0.8 - 0.1, I don't think we should throw it away entirely.
This story is specifically about the back-end APIs to get this data from the various containers and does not go into how this may be presented in the UI, although I do think abstracting away the raw numbers in favour of a % of the allowed limit would be my preferred approach.
In addition, this API/data will be needed to display the info to admins in order to run the service, hence the v1 flag.
Building a prototype on Docker.
Looking to add the following to the driver's details method:

```js
{
    ...,
    memory: {
        used: <value in bytes>,
        limit: <value in bytes>
    }
}
```
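To tie this to the Docker Stats sketch earlier, the memory block could be populated roughly like this; the function and field names are illustrative, not the actual driver code:

```js
const Docker = require('dockerode')
const docker = new Docker()

// Illustrative only: populating the proposed memory block on the Docker backend
async function details (project) {
    const container = docker.getContainer(project.containerId) // assumed field/lookup
    const stats = await container.stats({ stream: false })
    return {
        // ...existing details fields stay as they are
        memory: {
            used: stats.memory_stats.usage,
            limit: stats.memory_stats.limit
        }
    }
}
```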
Got it working on Docker and the start of it working on k8s.
The JavaScript K8s client we are using will have better support for this in its next release: https://github.com/kubernetes-client/javascript/pull/848
Draft PRs raised for Docker and K8s. Will need to look at the UI for it.
Key points

Epic: #223

Description

As a: project owner
I want to: know my project's CPU and Memory use over the last 24hrs/7days/30days
So that: I can judge if I need to upgrade to a higher capacity project type

Acceptance Criteria