splunk / jupyterhub-istio-proxy

JupyterHub proxy implementation for Kubernetes clusters running the Istio service mesh
Apache License 2.0

Virtual service not being created #38

Open · dlcrista opened this issue 3 years ago

dlcrista commented 3 years ago

I followed the guide at https://medium.com/swlh/running-jupyterhub-with-istio-service-mesh-on-kubernetes-a-troubleshooting-journey-707039f36a7b, but the virtual service is not being created for me. When I run the following command, I get "No resources found":

[screenshot]

harsimranmaan commented 3 years ago

Can you check the hub logs as well as the proxy container logs?

dlcrista commented 3 years ago

What should I be looking for?

By the way, I don't see the virtual service YAML anywhere. Did you forget to include it in the tutorial?

harsimranmaan commented 3 years ago

Virtual services are not created from YAML; they are created programmatically at https://github.com/splunk/jupyterhub-istio-proxy/blob/main/proxy/create.go#L50. To check the logs, I'd recommend following https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-running-pods.

Check the logs for things like misconfiguration, network errors, etc.
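
For context, the proxy creates standard Istio VirtualService objects through the Kubernetes API. A hand-written equivalent for a user route would look roughly like the sketch below; the name, gateway, host, and port values are illustrative, not the exact output of create.go:

    apiVersion: networking.istio.io/v1beta1
    kind: VirtualService
    metadata:
      name: jupyter-user-example          # illustrative; the proxy generates its own names
      namespace: jupyter
    spec:
      gateways:
        - jupyterhub-gateway              # whichever gateway the proxy is configured to use
      hosts:
        - "*"
      http:
        - match:
            - uri:
                prefix: /user/example/    # the routespec registered by the hub
          route:
            - destination:
                host: jupyter-example.jupyter.svc.cluster.local   # backend service for the route
                port:
                  number: 8888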

dlcrista commented 3 years ago

It turns out that the serviceaccount "default" in the jupyter namespace wasn't able to create virtualservices. I modified the RBAC rules to allow it to create, list, get, and delete virtualservices, and it is now able to do so.
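
For anyone else hitting this, a minimal Role/RoleBinding along these lines covers it (a sketch; the names are illustrative and the namespace/service account should match your deployment):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: jupyterhub-istio-proxy        # illustrative name
      namespace: jupyter
    rules:
      - apiGroups: ["networking.istio.io"]
        resources: ["virtualservices"]
        verbs: ["get", "list", "create", "delete"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: jupyterhub-istio-proxy
      namespace: jupyter
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: jupyterhub-istio-proxy
    subjects:
      - kind: ServiceAccount
        name: default                     # the service account the proxy deployment runs as
        namespace: jupyter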

However, I am now getting this upon logging in:

[screenshot]

[screenshot]

I think there may be something wrong with my clusterrolebinding definition for the hub serviceaccount, because I had to make several modifications to it just to get this progress bar to show after logging in (the hub pod logs were showing errors such as):

    HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Tue, 18 May 2021 23:49:08 GMT', 'Content-Length': '281'})
    HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"events is forbidden: User \\"system:serviceaccount:jupyterhub:hub\\" cannot watch resource \\"events\\" in API group \\"\\" in the namespace \\"jupyter\\"","reason":"Forbidden","details":{"kind":"events"},"code":403}\n'

Now there are no more visible errors in the logs; spawning the pod just ends up timing out. I didn't have any errors with JupyterHub when I used the out-of-the-box Helm chart.
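
For anyone hitting the same 403: the message says the hub service account (hub in the jupyterhub namespace) cannot watch events in the jupyter namespace, so a Role/RoleBinding roughly like the following is needed (a sketch; names are illustrative):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: hub-events-reader             # illustrative name
      namespace: jupyter                  # the namespace named in the error
    rules:
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: hub-events-reader
      namespace: jupyter
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: hub-events-reader
    subjects:
      - kind: ServiceAccount
        name: hub
        namespace: jupyterhub             # the service account named in the error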

harsimranmaan commented 3 years ago

Also, don't forget to assign roles appropriately. From the README.md:

> The service account used for deploying the jupyterhub-istio-proxy should have the ability to list, get, create and delete Istio virtual services in the namespace where the deployment is done. Refer to Kubernetes RBAC for details.

dlcrista commented 3 years ago

Please review my previous comment:

[screenshot]

I've already done this, and virtual services are now being created. However, I am now getting an error that is unrelated to virtualservices: pods are timing out (and not being created) when I log into the hub. I can close this issue and open a new one, or we can continue discussing the new issue here.

harsimranmaan commented 3 years ago

It is hard to point out anything else without learning more details about your setup. There are many ways of debugging k8s issues: logs, events, API server audit logs, even the pod description of the failing pod.

harsimranmaan commented 3 years ago

You can try turning on debug logs in JupyterHub, exec into the pod, and try curling the proxy API to verify connectivity.

dlcrista commented 3 years ago

I've been doing a lot of debugging, and it appears that the reason pods were timing out was that the user-scheduler service account also didn't have the correct RBAC permissions. I modified the permissions so user-scheduler is able to create pods, and pods are now being created.

I also had to set c.KubeSpawner.service_account = "hub", otherwise the spawned user pod would go into CrashLoopBackOff (I didn't have to do this when I do helm install...).
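
If you drive this through the zero-to-jupyterhub chart values instead of raw KubeSpawner config, I believe the equivalent is singleuser.serviceAccountName (treat the exact key as an assumption and check it against your chart version):

    # values.yaml fragment (assumed z2jh option; verify against your chart version)
    singleuser:
      serviceAccountName: hub   # run spawned user pods under the hub service account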

I didn't have to do any of this RBAC configuration when I install with helm install jupyterhub jupyterhub/jupyterhub --namespace jupyterhub; from my investigation, it seems to be because the helm template approach you're using to install JupyterHub is misconfigured somehow. (I'm not able to do helm install when istio-injection is enabled; the helm install command always times out. Is that why you use helm template instead?)

Now that the pod (jupyterhub-asdf) is being created when I log in (2/2 containers ready), I am getting a different error:

[screenshot]

I think you briefly mention this in the article. Will this solve the problem?

[screenshot]

dlcrista commented 3 years ago

It's been a while since I last heard from you, so I wanted to check whether this is still on your radar.

By the way, the PR you made is quite out of date, since there have been several changes to the kubespawner code.

FYI, these variables are no longer in the latest code:

    kubectl -n jupyterhub get cm/hub-config -o yaml \
      | sed "s/os\.environ\['HUB_SERVICE_HOST'\]/'hub'/g" \
      | sed "s/os\.environ\['PROXY_PUBLIC_SERVICE_HOST'\]/'localhost'/g" \
      | sed "s/os\.environ\['PROXY_PUBLIC_SERVICE_PORT'\]/'80'/g" \
      | kubectl -n jupyterhub apply -f -

geoffreychalk commented 3 years ago

Hi, I have managed to deploy with the proxy replaced by jupyterhub-istio-proxy. It's all up and running, and it creates the VirtualService for the hub without issues, but when I start the notebook it creates a virtual service with the pod's IP address, not the service name/pod name. Have I done something wrong? I have a workaround: manually creating a VirtualService and Service.

zhuzhuzhenbang commented 2 years ago

> It turns out that the serviceaccount "default" in the jupyter namespace wasn't able to create virtualservices. I modified the RBAC rules to allow it to create, list, get, and delete virtualservices, and it is now able to do so.
>
> However, I am now getting this upon logging in:
>
> [screenshot]
>
> [screenshot]
>
> I think there may be something wrong with my clusterrolebinding definition for the hub serviceaccount, because I had to make several modifications to it just to get this progress bar to show after logging in (the hub pod logs were showing errors such as):
>
>     HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Tue, 18 May 2021 23:49:08 GMT', 'Content-Length': '281'})
>     HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"events is forbidden: User \\"system:serviceaccount:jupyterhub:hub\\" cannot watch resource \\"events\\" in API group \\"\\" in the namespace \\"jupyter\\"","reason":"Forbidden","details":{"kind":"events"},"code":403}\n'
>
> Now there are no more visible errors in the logs; spawning the pod just ends up timing out. I didn't have any errors with JupyterHub when I used the out-of-the-box Helm chart.

I'm having the same problem. Could you post the detailed RBAC settings, please? Thank you very much!

frobones commented 1 year ago

Adding this single headless service to the cluster allowed me to work with Istio out of the box, with no need to switch to jupyterhub-istio-proxy:

apiVersion: v1
kind: Service
metadata:
  name: single-user
spec:
  type: ClusterIP
  clusterIP: None          # headless service: no virtual IP, just DNS/endpoints for the pods
  selector:
    app: jupyterhub        # label carried by the JupyterHub pods, including spawned single-user servers
  ports:
    - port: 8888           # default port of the single-user notebook server