Closed mitovskaol closed 3 years ago
Deploy Sysdig Operator to KLAB and ARO (add ticket numbers here ) and test by Jan 4, 2021
Which ARO cluster is the target (platform-services or pathfinder)? There will also be work needed to deploy the Sysdig collector agent to one (or both?) ARO clusters before rolling out the Sysdig Access Operator, as Sysdig is not collecting data on those clusters yet.
As the intent is to deploy the Sysdig access operator to more than 1 production cluster, when is it expected that the additional clusters will have namespace creation managed by the Project Registry? The risk of duplicate namespace collision is lower with pathfinder, silver and klab, but does increase with every cluster we add (until the project license-plates are managed by 1 registry across all clusters.)
@jefkel The duplicate check function in the Registry has been documented here. At this moment the intent is to have Sysdig rolled out in Silver first; we will start planning the rollout to the other prod OCP 4 clusters some time after that.
@stewartshea What is the advantage of installing the Sysdig operators in ARO vs. in CLAB? Should we have it installed in KLAB and CLAB before Silver, or in KLAB and ARO-LAB, or in KLAB and ARO-Pathfinder? Once you respond, I will update the plan in the ticket.
The operator needs to exist in every cluster that is monitored by Sysdig. So the order doesn't really matter, with the exception that it should be deployed in all [C/L/ARO]LAB environments before ARO-PATHFINDER or SILVER.
Hey Folks!
(Dianadec) Diana De Cotiis and her team of teams need Sysdig to be in play before they can complete their app migrations. I talked with her moments ago, and she's putting up her hand to be one of the first early access teams for Sysdig in the Silver cluster, as per your Implementation Plan.
"Get 2-3 early access teams to try it out for a 2-3 weeks and confirm it is no issues." in the item above.
For some reason I couldn't @ her here, but I got @matthieu-foucault here to keep tabs on this ticket. We're looking at the Week of the 4th piece @stewartshea. She and her team are ideal candidates.
@Dianadec should work to tag her, it just won't appear in the suggestions for some reason
Thanks @juhewitt. Further to our conversation, are the dates outlined in this ticket on track? Specifically, I'm wondering if the week of January 4 is when the team should expect to have access to Sysdig. We need some confirmation on the timing given that our dev/test/prod will be out of sync for a bit. cc @matthieu-foucault
To elaborate on Diana's comment re. dev/test/prod being out of sync: the plan was always to have a short period where that would be the case as we migrate to Silver, which is fine. Having an extended period with our environments out of sync while we wait for Sysdig's availability is a higher risk, as it effectively blocks our ability to deploy fixes or features to production during that time. There are ways to mitigate that risk on our end, which we can look into (if needed) once we get confirmation on the timing.
Hi @juhewitt! Just following up on this item. Are there any updates on this issue? Thanks in advance. cc @matthieu-foucault
nothing yet @Maralsotoudehnia. we'll know more as the week progresses during the ending of this sprint. keep an eye on this one with @stewartshea and @mitovskaol
Thanks @juhewitt, @mitovskaol, and @stewartshea! Grateful, as always, for your assistance. This is a significant impediment for our team and business area. We'd appreciate any support to get this unblocked as quickly as possible. Please don't hesitate to reach out with any questions.
I'm actively working on this now and should have a good idea of an estimated timeline by tomorrow; there are a few moving pieces here to get this rolled out at scale.
@Maralsotoudehnia @matthieu-foucault We are targeting Jan 20 for rolling Sysdig out in Silver. If the service becomes available earlier than that, @stewartshea will ping you directly. Thank you for your patience 🙏
Thank you for the update, @mitovskaol! And big thanks to you and @stewartshea for your work on this!
@stewartshea do you have an estimate for this issue?
I've added an estimate of 13 @lukegonis - While this is an epic, I will try to create the corresponding tickets as per the description as well.
PR# https://github.com/bcgov/devhub-app-web/pull/1548 has been opened to add the developer guide to devhub.
@stewartshea I let the CAS and ServiceBC teams know that Sysdig is now available in Silver and encouraged them to connect with each other (and you) in the #devops-sysdig channel. I've also shared the link to the docs in DevHub to help them get started.
This epic documents a plan for making the Sysdig service available to the product teams in the Silver cluster, including some of the design decisions that will apply to all clusters in the OCP 4 Platform.
High Level Architecture Description:
The Sysdig Monitor Service in the Silver cluster comprises 3 components:
1) Sysdig agent that collects all information about the cluster and integrated apps. Installed in Silver on Dec 11, 2020.
2) Sysdig operator that monitors custom resources created in app namespaces to provision user access to the Sysdig cloud service. Needs to be installed.
3) Sysdig SaaS cloud service provided by the Sysdig vendor that includes tools for building rich dashboards.
Access model for OCP 4 Sysdig Service:
Access to the Sysdig service is controlled via a custom resource object (defined by a Custom Resource Definition) that must be created in the project's tools namespace. This YAML includes a list of user emails that will be given access to the team space in Sysdig Monitor SaaS (the team name is based on the project license plate). The team space covers all 4 namespaces in the project set and the performance data collected for those namespaces. Sysdig uses KeyCloak SSO for user authentication, so the user emails associated with their KeyCloak accounts must be used in the Sysdig custom resource. Provisioning of the team and limiting visibility to only the team's namespaces is handled by the Sysdig custom operator. If the YAML is deleted from the tools namespace, user access to the Sysdig team space is revoked, and the users' dashboards and the team's space in Sysdig SaaS are deleted. This is why teams are encouraged to back up their dashboards (how-to doc is needed).
Implementation Plan:
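To make the access model more concrete, here is a rough Python sketch that builds the kind of custom resource a team might create in its tools namespace. Everything specific in it is an assumption for illustration and is not confirmed by this ticket: the `apiVersion` and `kind` values, the `spec.team.users` layout, the `ROLE_TEAM_EDIT` role name, the `<license-plate>-tools` namespace convention, and the helper name `build_sysdig_team_manifest`. Check the developer guide in DevHub for the actual schema.

```python
import json


def build_sysdig_team_manifest(license_plate, user_emails, role="ROLE_TEAM_EDIT"):
    """Build a hypothetical Sysdig team custom resource as a plain dict.

    The field names below are assumptions, not the operator's confirmed schema.
    """
    if not user_emails:
        raise ValueError("at least one user email is required")
    for email in user_emails:
        # Emails must match the users' KeyCloak accounts; this is only a sanity check.
        if "@" not in email:
            raise ValueError(f"not an email address: {email!r}")

    return {
        "apiVersion": "ops.gov.bc.ca/v1alpha1",  # assumed API group/version
        "kind": "SysdigTeam",                    # assumed kind name
        "metadata": {
            "name": f"{license_plate}-sysdigteam",
            # Assumed convention: the tools namespace is "<license-plate>-tools".
            "namespace": f"{license_plate}-tools",
        },
        "spec": {
            "team": {
                "description": f"Sysdig team for project {license_plate}",
                "users": [{"name": email, "role": role} for email in user_emails],
            }
        },
    }


if __name__ == "__main__":
    manifest = build_sysdig_team_manifest("abc123", ["dev.one@example.com"])
    # JSON is valid input for `oc apply -f -` / `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Because deleting this object also removes the team space and its dashboards, keeping the manifest in source control alongside dashboard backups is one way to make it easy to recreate.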