opendatahub-io / kubeflow

Machine Learning Toolkit for Kubernetes
Apache License 2.0

Review KCS Draft SOPs for Jupyter #166

Open maryfrances01 opened 11 months ago

maryfrances01 commented 11 months ago

We are in the process of migrating the RHODS Standard Operating Procedures (SOPs) from our internal GitLab repository to the KCS solutions platform. This migration aims to consolidate all SOPs into one accessible location, promoting collaboration across teams including engineering, SRE, and CEE.

This consolidation ensures that anyone working with a customer, and potentially the customers themselves once solutions mature, will have access to this centralized knowledge repository.

As part of this effort, draft KCS SOPs have been created for the existing SOPs. I kindly request the expertise of an engineer to review the draft KCS SOPs and provide feedback on the following two items:

Accuracy of Commands: Some SOPs in GitLab lacked precise steps, which have been added to the KCS drafts based on my testing. Please verify the accuracy of commands in the KCS drafts and let me know if any of the steps should be changed/improved.

Context, Symptoms, and Impact of Alerts Firing: Provide additional context, symptoms, and impact insights for alerts. If possible, identify dependencies on other components and suggest pre-escalation checks for SRE and CEE teams. Are there any additional checks that SRE or CEE could perform before requesting engineering support, which are not currently outlined in the SOPs?

The KCS drafts for Jupyter can be reviewed here:

7025305 7025182 7025176

Feel free to reach out to me with any questions.

alexcreasy commented 11 months ago

@andrewballantyne could you take a look at this issue please?

lucferbux commented 11 months ago

@maryfrances01 @andrewballantyne I think you might need to log this issue in the Notebook Controller repo. cc @harshad16 @atheo89

maryfrances01 commented 11 months ago

Thanks @lucferbux! I think there is a way to just transfer it to another repo, but I don't see that option myself. @andrewballantyne, let me know if I should just close this one and open a new one under Notebook Controller.

harshad16 commented 11 months ago

/transfer kubeflow

I think one needs admin rights on the org, or on both repos specifically, to transfer the issue. We can coordinate with the org admins to do that, or hope the Prow command works. Thanks for the issue; we will plan for this on the Notebook team's end.

maryfrances01 commented 8 months ago

@harshad16 Would it be possible to get this into the next sprint so I can wrap this up this quarter?