Equanox closed this pull request 4 years ago
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
Awesome, thanks for submitting a PR on K8s. It has been something that I have been wanting to get working for a while now.
Overall this PR looks good, and I absolutely love that you added the README.md file for it.
Scaling XMiDT inside of K8s is, in general, a hard problem that I haven't found a good solution for, because an outside client needs to be able to talk to any talaria server at any given time. It appears that StatefulSets solve this problem. As for the other services, round-robin load balancing between the pods will be totally fine.
The config file can be overwritten with environment variables. For example, if you want to change talaria's primary address port to 7777, you can set the environment variable TALARIA_PRIMARY_ADDRESS=":7777".
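In a K8s manifest, that override could be expressed like this (a sketch; the image name is an assumption, only the env var name comes from the comment above):

```yaml
# Hypothetical excerpt from a talaria container spec:
# environment variables override keys from the mounted config file.
containers:
  - name: talaria
    image: xmidt/talaria:latest   # assumed image name
    env:
      - name: TALARIA_PRIMARY_ADDRESS
        value: ":7777"
```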
Sorry for the ambiguous pull requests. Closing this due to #14. @kcajmagic can you repost your comment at #14?
This Pull Request is WIP and hopefully prevents someone from doing duplicate work. It should also start a discussion about possible scaling implications when deploying XMiDT with k8s here at DTAG. The added k8s deployment is heavily based on the provided docker-compose deployment. A Helm chart is added to express dependencies on Consul and Prometheus (WIP). Most config files (from ./deploy/docker-compose/docFiles) are added to ConfigMaps in their respective files under ./deploy/kubernetes/xmidt-cloud/templates.
As k8s needs access to a Docker image registry, it would be nice to add public repositories on hub.docker.com for each service (petasos, talaria, etc.). Take a look at ./deploy/kubernetes/xmidt-cloud/values.yaml. It would be even better to automate this with a CI process. Maybe GitHub Actions?
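Such a CI process could be sketched roughly like this (a hypothetical workflow, not existing project config; the image name and secret names are placeholders):

```yaml
# .github/workflows/docker-publish.yml (hypothetical sketch)
name: publish-images
on:
  push:
    branches: [master]
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build and push talaria image
        run: |
          # Tag with the short commit SHA so deployments are reproducible.
          docker build -t xmidt/talaria:${GITHUB_SHA::8} .
          echo "${{ secrets.DOCKERHUB_TOKEN }}" | \
            docker login -u "${{ secrets.DOCKERHUB_USER }}" --password-stdin
          docker push xmidt/talaria:${GITHUB_SHA::8}
```

A matrix job could repeat the same steps for petasos and the other services.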
Scaling: Right now, only one talaria instance is deployed, due to implications with service discovery. The issue is that each talaria instance registers itself in Consul with a hardcoded address at which it is reachable by others.
Excerpt from talaria config (./deploy/kubernetes/xmidt-cloud/templates/talaria.yaml)
This doesn't play well with K8s' scaling feature (replicas), as traffic is normally load-balanced across a ReplicaSet. In our case it is important to route requests to the correct talaria instance, as only one specific talaria instance holds the websocket connection to a device. One solution could be to use the K8s StatefulSets promise of "Stable, unique network identifiers." https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-network-id, but I'm not yet sure how to pass this information down to talaria's config. This needs some further investigation on my side. Or maybe you have a better solution for tackling this.
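One possible sketch of that approach (assuming talaria's registration address can be overridden via an environment variable, which still needs verification; the image and env var names are assumptions): a headless Service plus a StatefulSet, with the Downward API injecting each pod's stable name into the config.

```yaml
# Hypothetical sketch: stable per-pod DNS names for talaria.
apiVersion: v1
kind: Service
metadata:
  name: talaria
spec:
  clusterIP: None          # headless: gives each pod a stable DNS name
  selector:
    app: talaria
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: talaria
spec:
  serviceName: talaria     # pods resolve as talaria-0.talaria, talaria-1.talaria, ...
  replicas: 3
  selector:
    matchLabels:
      app: talaria
  template:
    metadata:
      labels:
        app: talaria
    spec:
      containers:
        - name: talaria
          image: xmidt/talaria:latest   # assumed image name
          env:
            # Downward API: expose the pod's own name to the container.
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # Assumption: talaria accepts its Consul registration
            # address via an env var like this (name is illustrative).
            - name: TALARIA_REGISTRATION_ADDRESS
              value: "$(POD_NAME).talaria"
```

Each pod would then register a distinct, stable address in Consul instead of a single hardcoded one, so petasos could still route to the specific instance holding a device's websocket.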
I don't have much insight into the other services, so there might be scaling implications there as well. That said, for experimenting with a k8s deployment, one talaria instance seems to be enough.