canonical / data-science-stack

Stack with machine learning tools needed for local development.
Apache License 2.0

Question: How are DSS ClusterIP services available outside the Kubernetes cluster? #138

Open guybrush007 opened 4 months ago

guybrush007 commented 4 months ago

Hi,

I am interested in DSS, and especially in offering it as a service to my team (data science in physics) on a shared host.

I have noticed that the notebooks and MLflow are exposed through Kubernetes ClusterIP services, whose IPs should only be reachable from inside the cluster. How are you able to reach these IPs from outside the cluster?

As a follow-up question, do you plan to support other service types, such as NodePort or LoadBalancer, or even an Ingress, so that DSS can be offered to remote hosts? For example, provisioning an AWS EC2 instance with a powerful GPU and sharing it among several team members.

Note: Sorry if it is not the right channel to ask this kind of question.

carlosbravoa commented 2 months ago

While running the same test, I changed the exposed notebook service from ClusterIP to NodePort, but it is very hacky. It would be great if we could create a notebook with dss and specify that we want it exposed as a NodePort, so that it can be accessed from outside the machine.

sudo microk8s kubectl patch svc my-tensorflow-notebook --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]' --namespace dss

I then connected to the machine over SSH, with a tunnel to the assigned port: ssh -L 30633:localhost:30633 -i <my KP pem file> ubuntu@<the machine's external IP>
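
The two steps above can be combined into a small helper. This is only a sketch of the workaround, not something dss provides: the function names are hypothetical, the kubectl client is assumed to be `sudo microk8s kubectl` (override `KUBECTL` otherwise), and the notebook is assumed to be the first port listed on the service. The node port is read back from the service rather than hard-coded, since Kubernetes picks it when the type changes.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; service name, namespace,
# key file and host are placeholders supplied by the caller.

# kubectl client to use (assumption: MicroK8s, as in the command above).
KUBECTL="${KUBECTL:-sudo microk8s kubectl}"

# Build the JSON patch that switches a Service to the given type.
service_type_patch() {
  printf '[{"op":"replace","path":"/spec/type","value":"%s"}]' "$1"
}

# Patch the service to NodePort, then print the node port Kubernetes assigned.
# Assumes the notebook is the first entry in .spec.ports.
expose_as_nodeport() {
  local svc="$1" ns="$2"
  $KUBECTL patch svc "$svc" --namespace "$ns" --type=json \
    -p "$(service_type_patch NodePort)" >/dev/null
  $KUBECTL get svc "$svc" --namespace "$ns" \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Print (do not run) the SSH command that forwards the node port locally.
tunnel_command() {
  local port="$1" key="$2" host="$3"
  printf 'ssh -L %s:localhost:%s -i %s ubuntu@%s\n' "$port" "$port" "$key" "$host"
}

# Example, run on the DSS host:
#   port="$(expose_as_nodeport my-tensorflow-notebook dss)"
#   tunnel_command "$port" <my KP pem file> <the machine's external IP>
```

The printed ssh command is meant to be run from the remote workstation. The tunnel is only needed when the node port is not reachable directly, for example because a firewall or security group blocks it.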