bcgov / DITP-DevOps

Digital Identity and Trust Program Team's DevOps Documentation Repository
Apache License 2.0

Add SysDig monitoring to Traction namespaces #92

Status: Closed (esune closed 10 months ago)

esune commented 11 months ago

Add SysDig monitoring to the bc0192 namespace set. Use what was already done for this ticket as a draft: https://github.com/bcgov/DITP-DevOps/issues/63

Acceptance Criteria:

rajpalc7 commented 11 months ago

Hi @esune - Can you please let me know who will require access to the bc0192 namespace?

rajpalc7 commented 11 months ago

Assuming the Acceptance Criteria are to add dashboard monitoring in the dev, test and prod environments, and also to add alerts for them.

esune commented 11 months ago

Please note: bc0192-dev includes PR-based deployments that are temporary. They include pr- in the name and should be excluded from the dashboard.
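As a rough illustration of the exclusion rule described above, here is a minimal Python sketch that drops workloads whose names contain `pr-`. The workload names and the helper functions are hypothetical and are not taken from the bc0192 namespaces; the actual filter would be configured in the Sysdig dashboard scope rather than in code.

```python
def is_pr_deployment(name: str) -> bool:
    """Return True when a workload name contains the 'pr-' marker."""
    return "pr-" in name


def permanent_workloads(names):
    """Keep only the workloads that are not temporary PR-based deployments."""
    return [name for name in names if not is_pr_deployment(name)]


# Hypothetical workload names, for illustration only.
workloads = [
    "traction-api",
    "pr-123-traction-api",
    "traction-pr-45-tenant-ui",
    "tenant-ui",
]

print(permanent_workloads(workloads))
```

A plain substring match also excludes any permanent workload that happens to contain `pr-`, so a stricter pattern may be needed if such names exist.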

rajpalc7 commented 11 months ago

@hiteshgh - This ticket was completed on Friday, 11th Aug, and was moved to review for Wade.

WadeBarnes commented 10 months ago

@rajpalc7, I don't see a sysdigteam file for bc0192 here; https://github.com/bcgov/DITP-DevOps/tree/main/sysdig. No PRs for one either.

WadeBarnes commented 10 months ago

@rajpalc7, Where would I find the links to the dashboards? I don't see them listed here with the rest of them; https://trello.com/c/Mj9EqIzq/88-sysdig-dashboards

WadeBarnes commented 10 months ago

@rajpalc7, Were you going to do backups for the dashboards you created for bc0192? I don't see a PR for them or any here; https://github.com/bcgov/DITP-DevOps/tree/main/sysdig/dashboard%20backups

WadeBarnes commented 10 months ago

@rajpalc7, Can you provide a summary and links to the dashboards and alerts so I'm not having to search for and guess at what you did? Thanks.

rajpalc7 commented 10 months ago

@WadeBarnes - I just followed the ticket's acceptance criteria and did what was asked there. I can create the links for the dashboards and alerts shortly.

rajpalc7 commented 10 months ago

Links for the dashboards and alerts are now available at https://trello.com/c/Mj9EqIzq/88-sysdig-dashboards

rajpalc7 commented 10 months ago

https://github.com/bcgov/DITP-DevOps/pull/105/commits - The PR for the dashboard backups and sysdig team is ready now.

WadeBarnes commented 10 months ago

@rajpalc7, All of the alerts for the bc0192-team are looking at the wrong namespace. They all contain a99fd4 in their queries rather than the expected bc0192 namespace. Please review and fix.
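A check like the one described above could be scripted. The following is only a sketch: the alert dictionaries, their `name` and `query` fields, and the query syntax are assumptions made for illustration, not the actual Sysdig alert export format.

```python
EXPECTED_NAMESPACE = "bc0192"
UNEXPECTED_NAMESPACE = "a99fd4"


def alerts_with_wrong_namespace(alerts, expected=EXPECTED_NAMESPACE,
                                unexpected=UNEXPECTED_NAMESPACE):
    """Return the names of alerts whose query mentions the unexpected
    namespace or does not mention the expected one."""
    flagged = []
    for alert in alerts:
        query = alert.get("query", "")
        if unexpected in query or expected not in query:
            flagged.append(alert["name"])
    return flagged


# Hypothetical alert definitions, for illustration only.
alerts = [
    {"name": "High CPU", "query": 'kube_namespace_name = "a99fd4-prod"'},
    {"name": "Pod restarts", "query": 'kube_namespace_name = "bc0192-prod"'},
]

print(alerts_with_wrong_namespace(alerts))
```

Running a check of this kind against the backed-up alert definitions before review would surface queries copied from another namespace set.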

WadeBarnes commented 10 months ago

Dashboards:

rajpalc7 commented 10 months ago

Hi @WadeBarnes - We are seeing these issues because it looks like they have made changes to the pod names. I think it would be better for me to work on these dashboards once we have something permanent. What do you suggest?

esune commented 10 months ago

What do you mean by changes in the pod names, exactly? Is this about the Tenant UI pods slightly changing the naming pattern, or something else? Can you use k8s labels as selectors instead of pod names? They will likely be more consistent.
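To illustrate why labels tend to be more stable than pod names, here is a minimal Python sketch of equality-based label selection. The pod names, the label key `app.kubernetes.io/name` as applied to these workloads, and the label values are all hypothetical; they are not taken from the Traction deployments.

```python
def matches_selector(labels, selector):
    """Return True when every key/value pair in the selector is present
    in the pod's labels (equality-based selection)."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(pods, selector):
    """Return the names of the pods whose labels satisfy the selector."""
    return [pod["name"] for pod in pods if matches_selector(pod["labels"], selector)]


# Hypothetical pods: the naming pattern differs, but the label is the same.
pods = [
    {"name": "tenant-ui-7d9f8-abcde",
     "labels": {"app.kubernetes.io/name": "tenant-ui"}},
    {"name": "traction-tenant-ui-5c6b7-fghij",
     "labels": {"app.kubernetes.io/name": "tenant-ui"}},
    {"name": "traction-acapy-66d4c-klmno",
     "labels": {"app.kubernetes.io/name": "acapy"}},
]

selector = {"app.kubernetes.io/name": "tenant-ui"}
print(select_pods(pods, selector))
```

Both Tenant UI pods are selected even though their names follow different patterns, which is the property a dashboard or alert scope would rely on.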

rajpalc7 commented 10 months ago

I mean the deployment label names, but I did talk to Lucas about it and he says they will remain the same from now on. Hopefully!

rajpalc7 commented 10 months ago

@WadeBarnes - I have now made all the necessary changes for the new deployment label and the PVC dashboards. Ready for review.

WadeBarnes commented 10 months ago

Alerts seem to be working now - thanks.

The dashboards are all showing data now and the links to the PVC Dashboard are working - thanks.

The PVC Dashboard needs a few updates though:

image

rajpalc7 commented 10 months ago

Thanks @WadeBarnes, the PVC dashboard is fixed now. Ready for review.

WadeBarnes commented 10 months ago

Looks good. Thanks