tumido closed this issue 2 years ago.
My current dev environment configuration uses:
```yaml
# kfdef.yaml
---
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
spec:
  applications:
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: odh-common
      name: odh-common
    - kustomizeConfig:
        parameters:
          - name: s3_endpoint_url
            value: s3.odh.com
        repoRef:
          name: manifests
          path: jupyterhub/jupyterhub
      name: jupyterhub
    - kustomizeConfig:
        overlays:
          - additional
        repoRef:
          name: manifests
          path: jupyterhub/notebook-images
      name: notebook-images
  repos:
    - name: manifests
      uri: "https://github.com/opendatahub-io/odh-manifests/tarball/v1.1.1"
```

```yaml
# kustomization.yaml
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - https://raw.githubusercontent.com/operate-first/apps/master/cluster-scope/base/operators.coreos.com/subscriptions/opendatahub-operator/subscription.yaml
  - kfdef.yaml
  - https://raw.githubusercontent.com/thoth-station/helm-charts/main/charts/meteor-pipelines/templates/byon-validate-jupyterhub-image.yaml
  - https://raw.githubusercontent.com/tumido/helm-charts/byon-import-image/charts/meteor-pipelines/templates/byon-import-jupyterhub-image.yaml
  - https://raw.githubusercontent.com/tumido/helm-charts/byon-import-image/charts/meteor-pipelines/templates/byon-noop.yaml
```
@oindrillac @harshad16 task: request a stage env for BYON from @open-services-group/wg-devsecops-leads.
ack, will discuss with the WG DevSecOps team and will respond soon.
Updates from WG meeting 03/09/2022:
@harshad16 can you please add the WG members to the Slack chat where this discussion is taking place.
@harshad16:
We need:
@tumido thanks for sharing the information:
@dlabaj @LaVLaS how can we set up CI/CD builds for the odh-dashboard@BYON branch so we have a usable image to deploy along with BYON?
@harshad16:

> we would be using the osc-cl1 cluster, on which ODH is deployed with the help of this odh-manifest branch.

Why this branch and not v1.1.2?

> For getting an ODH dashboard deployed with a custom image, please update the custom image details here; this will be picked up and deployed on the cluster.

We should be able to override this in `kustomization.yaml`, correct?

> For BYON pipelines. If I understand correctly, it has a pre-requirement of openshift-pipelines.

Yes. Should we install the pipelines separately via the apps repo or via the kfdef (it's an ODH component as well)?

> The desired way would be that the pre-requirements and the BYON are put under a new `BYON` folder here

Is this desired? What about putting it as an overlay to the `jupyterhub` component? What would be preferred, @LaVLaS?

> and then the kfdef is updated for this feature to be installed in the osc-cl1 cluster, by updating it here

Yup, will do.
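For context, overriding the image from `kustomization.yaml` can be done with kustomize's built-in `images` transformer. A minimal sketch, assuming the dashboard Deployment is among the resources kustomize itself renders (the image name and tag below are illustrative, not the actual values used here):

```yaml
# kustomization.yaml -- hypothetical image override via the kustomize
# images transformer; only affects resources rendered by this build
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - kfdef.yaml
images:
  - name: quay.io/opendatahub/odh-dashboard  # assumed original image name
    newTag: byon-latest                      # assumed replacement tag
```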
> Why this branch and not v1.1.2?

You are correct; at the time of my comment there was only branch v1.1.0. v1.1.2 is now available here.

> We should be able to override this in `kustomization.yaml`, correct?

Yes, we can do that :)

> Should we install the pipelines separately via the apps repo or via the kfdef (it's an ODH component as well)?

As this is an ODH component, we want it to be installed via ODH, so it can also get added to ODH without many changes. However, in the previous call it was mentioned that openshift-pipelines is already installed in ODH, so maybe we can skip this.

> Is this desired? What about putting it as an overlay to the `jupyterhub` component? What would be preferred, @LaVLaS?

I thought about this as well; if we can have it in `jupyterhub`, that is great too.
The cluster is all set up:
- console URL: https://console-openshift-console.apps.odh-cl1.apps.os-climate.org/k8s/ns/opf-jupyterhub-stage/pods
- namespace: opf-jupyterhub-stage
- JupyterHub URL: https://jupyterhub-opf-jupyterhub-stage.apps.odh-cl1.apps.os-climate.org/hub/spawn
Also, the deployment PR has been opened: https://github.com/operate-first/odh-manifests/pull/10
Once we have the odh-dashboard image, we are good to merge the pending PR. That would be it, and BYON would be in its dev environment.
> @dlabaj @LaVLaS how can we set up CI/CD builds for the odh-dashboard@BYON branch so we have a usable image to deploy along with BYON?

I was discussing this with @harshad16 and we will make sure there is a `byon-latest` image built from the `BYON` branch. I have a BuildConfig manifest I am working on that should allow an in-cluster build and deployment of a development image.
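For illustration, an in-cluster build of the BYON branch could be expressed as an OpenShift BuildConfig along these lines. This is a sketch, not the actual manifest mentioned above; the Dockerfile path, object names, and output tag are assumptions:

```yaml
# Hypothetical BuildConfig building the odh-dashboard BYON branch in-cluster
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: odh-dashboard-byon        # assumed name
spec:
  source:
    type: Git
    git:
      uri: https://github.com/opendatahub-io/odh-dashboard
      ref: BYON                   # the branch we want built
  strategy:
    type: Docker
    dockerStrategy:
      dockerfilePath: Dockerfile  # assumed location in the repo
  output:
    to:
      kind: ImageStreamTag
      name: odh-dashboard:byon-latest  # assumed output tag
```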
> As this is an ODH component, we want it to be installed via ODH, so it can also get added to ODH without many changes. However, in the previous call it was mentioned that openshift-pipelines is already installed in ODH, so maybe we can skip this.

I think ODH will be making OpenShift Pipelines a "core" product soon, and it will always be included in an ODH deployment. For right now, we will have to make sure pipelines are included in the BYON kfdef.
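Including pipelines in the BYON kfdef would mean adding another application entry, similar to the ones shown earlier. A hedged sketch; the manifest path and application name below are assumptions about the odh-manifests layout, not verified values:

```yaml
# Hypothetical kfdef application entry pulling in OpenShift Pipelines
# alongside the existing odh-common/jupyterhub entries
- kustomizeConfig:
    repoRef:
      name: manifests
      path: openshift-pipelines   # assumed path within odh-manifests
  name: openshift-pipelines
```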
> Is this desired? What about putting it as an overlay to the `jupyterhub` component? What would be preferred, @LaVLaS?

I agree that it should be in an overlay for JupyterHub.
Seems like the dashboard has issues: https://console-openshift-console.apps.odh-cl1.apps.os-climate.org/k8s/ns/opf-dashboard/pods/odh-dashboard-76bdcd5f4c-xtlvl/logs
I'm unable to access that, sorry. I'm getting
However, comparing the manifests in https://github.com/operate-first/odh-manifests/tree/osc-cl1-byon/odh-dashboard/base against https://github.com/opendatahub-io/odh-dashboard/tree/BYON/install/odh/base, I still see some differences. Might that be the issue?
```diff
34c34
< - build.openshift.io
---
> - ""
36,38c36,37
< - builds
< - buildconfigs
< - buildconfigs/instantiate
---
> - configmaps
> - secrets
39a39,40
> - create
> - delete
41a43,44
> - patch
> - update
44c47
< - rbac.authorization.k8s.io
---
> - batch
46c49,51
< - rolebindings
---
> - cronjobs
> - jobs
> - jobs/status
47a53,55
> - create
> - delete
> - get
48a57,59
> - patch
> - update
> - watch
50c61
< - apps.openshift.io
---
> - image.openshift.io
52c63
< - deploymentconfigs
---
> - imagestreams
53a65
> - create
56,58d67
< - watch
< - create
< - update
60c69,77
< - delete
---
> - apiGroups:
> - build.openshift.io
> resources:
> - builds
> - buildconfigs
> verbs:
> - get
> - list
> - watch
148,155d164
< - apiGroups:
< - user.openshift.io
< resources:
< - groups
< verbs:
< - get
< - list
< - watch
203a213
> type: LoadBalancer
226,237d235
< affinity:
< podAntiAffinity:
< preferredDuringSchedulingIgnoredDuringExecution:
< - podAffinityTerm:
< labelSelector:
< matchExpressions:
< - key: app
< operator: In
< values:
< - odh-dashboard
< topologyKey: topology.kubernetes.io/zone
< weight: 100
239c237
< - image: quay.io/opendatahub/odh-dashboard:latest-byon
---
> - image: quay.io/modh/odh-dashboard:v1.0.11
266,267c264,265
< cpu: 400m
< memory: 400Mi
---
> cpu: 500m
> memory: 1Gi
269,270c267,270
< cpu: 200m
< memory: 100Mi
---
> cpu: 300m
> memory: 500Mi
> imagePullSecrets:
> - name: addon-managed-odh-pullsecret
277d276
< haproxy.router.openshift.io/hsts_header: max-age=31536000;includeSubDomains;preloa
```
ack, will update the manifests based on the changes you suggest.
about access, you are cluster-admin: https://github.com/operate-first/apps/blob/13b504ab0c2525e9d620d33cfa9c381caa2fa9ae/cluster-scope/overlays/prod/osc/osc-cl1/groups/cluster-admins.yaml#L10 so you should have access to all the logs; please re-check.
Well, it's not about permissions; it's rather an auth issue... I've opened https://github.com/operate-first/support/issues/552
ack. Another issue with the BYON deployment: the operator is not able to resolve the kfdef with the BYON manifest. The error from the logs:

```
level=error msg="Error evaluating kustomization manifest for byon: accumulating resources: recursed accumulation of path 'base': accumulating resources: accumulating resources from 'https://raw.githubusercontent.com/thoth-station/helm-charts/master/charts/meteor-pipelines/templates/byon-validate-jupyterhub-image.yaml': open /tmp/opf-jupyterhub-stage/jupyterhub/kustomize/byon/base/https:/raw.githubusercontent.com/thoth-station/helm-charts/master/charts/meteor-pipelines/templates/byon-validate-jupyterhub-image.yaml: no such file or directory"
```

Seems like the kfdef resolver is not able to read remote URLs and instead expects the resource to be in the directory. cc: @tumido @LaVLaS
This may be due to the kustomize version bundled with kfctl, but I can't remember whether remote resources are supported or not.
I don't see any other option than to work around it: I've cherry-picked all the pipeline/task manifests into odh-manifests in Operate First. https://github.com/operate-first/odh-manifests/pull/19
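After the cherry-pick, the base kustomization can reference the vendored files by relative path, sidestepping the remote-URL limitation. A sketch; the file names mirror the ones from the error message but the exact layout is an assumption:

```yaml
# Hypothetical base kustomization after vendoring the pipeline manifests
# locally (workaround for kfctl's kustomize not resolving remote URLs)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - byon-validate-jupyterhub-image.yaml  # vendored from thoth-station/helm-charts
  - byon-import-jupyterhub-image.yaml
  - byon-noop.yaml
```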
In addition to that further fixes to the kfdef are needed, see https://github.com/operate-first/apps/pull/1892
@harshad16 what can I do to trigger build for ODH Dashboard for BYON?
https://github.com/opendatahub-io/odh-dashboard/tree/BYON has its latest commit 4 days ago, while the Quay image is 13 days old.
> https://github.com/opendatahub-io/odh-dashboard/tree/BYON has latest commit 4 days ago

I checked with the ODH team; they do manual builds, so we would have to contact @LaVLaS.
I've changed things around so we can use our own manual build in the operate-first Quay for now, to speed things up.
Another showstopper appeared: OSC-CL1 doesn't have OpenShift Pipelines, which is a blocker. It had plain Tekton deployed instead, which is not enough for us. We need `ClusterTask`s available, namely `openshift-client`, because we're not going to reimplement that; we want to use what's provided and version-matched by the cluster itself.
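To illustrate what depends on this: our pipeline wants to reference the `openshift-client` ClusterTask that ships with OpenShift Pipelines but not with plain Tekton. A sketch; the task name and surrounding pipeline structure are illustrative, not our actual pipeline:

```yaml
# Hypothetical PipelineTask referencing the bundled openshift-client
# ClusterTask (resolved cluster-wide, version-matched by the operator)
tasks:
  - name: apply-manifests        # illustrative task name
    taskRef:
      kind: ClusterTask          # only available with OpenShift Pipelines
      name: openshift-client
    params:
      - name: SCRIPT
        value: oc apply -f manifests/
```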
So, it seems OpenShift Pipelines are fighting with CertManager or something installed on the cluster...
```
2022/04/27 04:35:40 http: TLS handshake error from 10.130.0.1:42582: remote error: tls: bad certificate
2022/04/27 04:35:40 http: TLS handshake error from 10.129.0.1:60790: remote error: tls: bad certificate
2022/04/27 04:35:40 http: TLS handshake error from 10.129.0.1:60788: remote error: tls: bad certificate
2022/04/27 04:35:40 http: TLS handshake error from 10.128.0.1:37686: remote error: tls: bad certificate
2022/04/27 04:35:40 http: TLS handshake error from 10.128.0.1:37692: remote error: tls: bad certificate
2022/04/27 04:35:41 http: TLS handshake error from 10.130.0.1:42594: remote error: tls: bad certificate
2022/04/27 04:35:41 http: TLS handshake error from 10.128.0.1:37696: remote error: tls: bad certificate
```
/cc @harshad16
The issue is fixed now; the operator was jammed earlier. Please use it. :+1:
Resolved, environment is verified to be available and working at https://odh-dashboard-opf-jupyterhub-stage.apps.odh-cl1.apps.os-climate.org/
**Is your feature request related to a problem? Please describe.**
I want:

**Describe the solution you'd like**
For the stage environment: provide a namespace or cluster in Operate First with a BYON deployment from thoth-station/helm-charts master.
For the dev environment: provide a guide/steps and a kustomization to apply to get a dev environment up and running in no time. A dev file for CodeReady Containers, maybe?

**Describe alternatives you've considered**
n/a

**Additional context**
n/a