operate-first / blueprint

This is the blueprint for the Operate First Initiative
GNU General Public License v3.0

CPU pressure - Shared resources guide #83

Closed tumido closed 2 years ago

tumido commented 3 years ago

Create a guide on how to be respectful and mindful of shared resources, and on what CPU request and CPU limit mean. We're in a constant state of CPU pressure, even though utilization is never above 20%.
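To illustrate how a cluster can report CPU pressure while utilization stays low, here is a minimal Python sketch. It assumes the pressure comes from CPU requests reserving most of the allocatable capacity: the Kubernetes scheduler places pods based on requests, not on actual usage, while limits only cap how much a container may consume before it is throttled. The workload names, the `Workload` class, the helper functions and all numbers are hypothetical and are not measurements from this cluster.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    request_m: int  # CPU request in millicores: capacity reserved at scheduling time
    limit_m: int    # CPU limit in millicores: ceiling above which the container is throttled
    usage_m: int    # observed average usage in millicores


def summarize(workloads, allocatable_m):
    """Return (fraction of capacity reserved by requests, fraction actually used)."""
    requested = sum(w.request_m for w in workloads)
    used = sum(w.usage_m for w in workloads)
    return requested / allocatable_m, used / allocatable_m


def over_requested(workloads, headroom=2.0):
    """Names of workloads whose request exceeds `headroom` times their usage."""
    return [w.name for w in workloads if w.request_m > headroom * w.usage_m]


# Hypothetical numbers for a node with 16 allocatable cores (16000m).
workloads = [
    Workload("notebook-a", request_m=4000, limit_m=8000, usage_m=300),
    Workload("notebook-b", request_m=4000, limit_m=8000, usage_m=500),
    Workload("pipeline", request_m=4000, limit_m=4000, usage_m=2200),
]

reserved, utilized = summarize(workloads, allocatable_m=16000)
print(f"reserved by requests: {reserved:.0%}")
print(f"actually utilized:    {utilized:.0%}")
print("over-requested:", over_requested(workloads))
```

In this made-up example the requests reserve three quarters of the node while less than 20% of it is actually used, so new pods can fail to schedule even though most of the CPU sits idle. Lowering requests toward observed usage is one way to free capacity for other users.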

tumido commented 3 years ago

Based on investigation:

New user-focused dashboards (convert to declarative and provide upstream):

oindrillac commented 3 years ago

@tumido I get a 403 permission denied error on these dashboards when I try to log in through moc-sso. Can you please provide access?

HumairAK commented 3 years ago

Currently, only people with `get` access to opf-monitoring can access Grafana (i.e. people in the operate-first OCP group), due to the change here. We should expand this to additional users (subject to ongoing discussions regarding user policies, of course).

tumido commented 3 years ago

@HumairAK can we extend that to the data-science group then? 🙂

oindrillac commented 3 years ago

Thanks! If these are the user-oriented dashboards that @tumido presented, we (data science users) would very much benefit from having access to them 🙂

tumido commented 3 years ago

Yes, these are the user-oriented dashboards (the two linked at the end). We'll convert them to permanent dashboards soon; I still have some fiddling to do. Right now they are just prototypes, stored in the Grafana runtime.

sesheta commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

HumairAK commented 3 years ago

/remove-lifecycle stale

sesheta commented 2 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sesheta commented 2 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

HumairAK commented 2 years ago

/remove-lifecycle rotten

sesheta commented 2 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

sesheta commented 2 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

sesheta commented 2 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close

sesheta commented 2 years ago

@sesheta: Closing this issue.

In response to [this](https://github.com/operate-first/blueprint/issues/83#issuecomment-1207527151):

>Rotten issues close after 30d of inactivity.
>Reopen the issue with `/reopen`.
>Mark the issue as fresh with `/remove-lifecycle rotten`.
>
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.