consideRatio closed this issue 3 years ago
One downside: z2jh is deployed in a multitude of environments; having everyone use identical dev and test environments means bugs may remain hidden until they hit production.
I acknowledge the benefit of having reliable, portable environments, especially for new contributors (vagrant may not be this, since it doesn't work for me to set up a basic VM today). However, I think that the downside of recommending a VM as the way to work outweighs the upside. What makes me uncomfortable recommending a VM for development is that I would never use it, and I have a hard time recommending tools that I would never consider using. Now, it's super useful for Windows folks who have a really hard time setting up an environment, where everything is so different from those of us in posix-land. But it also severely disadvantages anyone who has a nontrivially configured environment (non-bash shell for instance—zsh will be default on macOS next month—or well configured bash, kubectl, etc.). I personally can't stand using kubectl without all my customizations at this point. Completion's not enough.
Note that I'm separating running the test cluster in the VM from actually doing development there. I don't have a problem with running the cluster in a VM (we already recommend this with minikube), only with adding the friction of using the VM as a development environment. I don't see any need to set up a VM to run tests, kubectl, lint, etc. as long as the host environment has access to the cluster. minikube basically exists to put a cluster in a VM and expose it to the host. If we decide that vagrant+kind is better, we can do the same, but I think the goal should also be the same: expose the cluster in the VM to the host. Having the commands exist in the VM is fine, as long as they can also be run from the host without manual `vagrant ssh`.
> One downside: z2jh is deployed in a multitude of environments, having everyone use identical dev and test environments means bugs may remain hidden until they hit production.
Hmmm, perhaps, but I'm not sure. The actual product we produce is a Helm chart. I think the key issue here is not having a production-similar flora of Kubernetes clusters; we simply rely on `kind`, which currently uses `kubeadm`.
@minrk regarding:
> However, I think that the downside of recommending a VM as the way to work outweighs the upside.
I'm very happy to rephrase and suggest there are alternatives to the VM way of development, especially for someone experienced, but I'm currently not happy about trying to maintain instructions for alternative ways of setting up local development and testing.
When you wrote the sentence above, did you refer to maintaining instructions only for one way, or suggesting there was only one good way to do things, or both?
> When you wrote the sentence above, did you refer to maintaining instructions only for one way, or suggesting there was only one good way to do things, or both?
I think both. If we use kind, I'd like our tooling to work for kind "as advertised" i.e. working on mac/linux/windows. I've no objection to having a vagrant configuration as an available and "recommended if you don't want to think about installation" shortcut to getting up and running, but I think we should support folks working with these tools installed on their system. i.e. separate the vagrant stuff as one version of the "get your environment set up" stage and then the while-you-are-working tasks assume you have a satisfactory environment, but do not assume that it's the VM. Some will assume that you have kind, some only kubernetes+helm, etc.
There are a couple of levels of assumptions about the environment, with trade-offs about how much freedom users have (freedom to make choices, but also freedom to make things not work!)
As much as possible, we should have instructions that are at level 1 and few that are at level 3.
If the current instructions are separated into "setting up the environment" and "development workflow", where the development workflow does not assume the VM but works if you've set it up, and you don't want to write docs for getting set up with kind natively, that's okay. I'd be happy to take a stab at the "native" version of getting set up with kind.
To be clear, I'm 100% okay if folks feel that assuming the cluster in the VM is the right approach, I just want to make sure that if we do that it's more like minikube where commands, etc. are issued from the host and not via `vagrant ssh`. That was my only objection to the VM setup.
I'm happy using vagrant at the moment. I had hoped `kind` would make it possible to avoid it, so 👍 if the instructions can be split into:
`kind` works fine without being in a VM, on my Ubuntu at least. But any automation to set up kind/kubectl/helm/kubeval/python3 + dependencies, and any safeguard against using kubectl/helm on the wrong cluster, goes away.
> I just want to make sure that if we do that it's more like minikube where commands, etc. are issued from the host and not via vagrant ssh.
I think that this means never leaving the native terminal, but instead wrapping `vagrant ssh` with something whenever it is to be used. At that point, I think it makes great sense to not use a VM at all: a wrapper around in-VM interactions would introduce a lot of magic and complexity that is unsustainable and provides little value.
I'm trying to evaluate what it would mean to support developers on Linux, Mac, and Windows without a VM, where one could run CI tests and debug the cluster after test failures, while minimizing the risk of the user mistakenly working on an unrelated cluster, and while not making something too complicated to maintain. I'd like to avoid having much logic specifically for the CI and other logic specifically for local development, such as `dev-requirements.txt`. My thoughts are still in flux.
I'm trying to come up with alternative ways that work on multiple platforms, in a customizable way, where it is hard to misuse both `kubectl` and `helm`, and that are easy to get started with, while having something that is sustainable to maintain, given our experience of failing to properly maintain the latest development instructions.
- Install required binaries: `kubectl` / `helm` / `kind` / `kubeval`
- Set up a k8s cluster (`kind`, `minikube`, custom, ...)
- Configure `kubectl` for use with the k8s cluster, via one of:
  - `export KUBECONFIG=<config path>` (Mac / Linux) or `set KUBECONFIG=<config path>` (Windows?) (ref: official docs)
  - `--kubeconfig` for both `kubectl` and `helm`
  - `--context` for `kubectl` and `--kube-context` for `helm`
  - `HELM_HOME` or `--home` for `helm`
- A `.kube/config` file that we reference. The `KUBECONFIG` paths will be used relative to the working directory when executing `kubectl`.
- Run `chartpress` with `--commit-range master..HEAD`
- Run `helm upgrade`
- Use `kubectl exec` to run code on pods
- Run `helm upgrade` to go from one version to another
- Run `helm test` in a pod/container. It is a practice followed by the helm/charts repository's charts to have such tests, and it would also allow chart users to verify functionality after installation. But it would also add some complexity... It would be best if we could do both, but it would also be hard to support it all. Upgrades could not be run from within a `helm test` pod I think, as the pod would need to modify its own helm release and things could end up being quite weird. I don't think it is important that this test can be run locally during development though.

It has been a challenge to develop kubespawner in interaction with Z2JH, for example; this could perhaps be figured out properly in the contribution section as well.

Further thoughts:

- A `.env` file where `KUBECONFIG` and `HELM_HOME` can be set, for example to `.kube/kind-config-<kind cluster name>`, then use python-dotenv from Python scripts wanting to use `kubectl`, optionally like Flask does it.
- `--commit-range master..HEAD` is good if developing on another branch, but not if working directly on master. One could add `origin/master`, though that would suddenly assume we have a remote named `origin` that is relevant to compare with; it should preferably be the upstream.
- There are some things that become a bit much and would benefit from automation, such as rebuilding the images when using kind.
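The `.env` idea above can be sketched with a few lines of stdlib Python. This is not actual repo code, just a dependency-free stand-in for what python-dotenv does; the file name `.env` and the keys `KUBECONFIG`/`HELM_HOME` follow the discussion, everything else is illustrative:

```python
import os

def load_env_file(path=".env"):
    """Minimal stand-in for python-dotenv: read KEY=VALUE lines
    (skipping blanks and comments) into os.environ."""
    loaded = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                loaded[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # no .env file: leave the environment untouched
    os.environ.update(loaded)
    return loaded

# After loading, subprocess calls to kubectl/helm inherit KUBECONFIG and
# HELM_HOME from os.environ, so they target the explicitly chosen cluster.
```

Any Python script wanting to shell out to `kubectl` would call `load_env_file()` first, so the cluster choice lives in one explicit, git-ignorable file.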
(read below the horizontal line for the two points I want to make, up here for some backstory)
A comment (somewhat from the sidelines, because I haven't digested all of the above): in another project I have to start a minishift and deploy to it before I can start development. This is because there are several services that depend on each other, and the simplest way to get all of them up and behaving roughly like in prod is minishift + `helm deploy`. To work on one of the services you then have to use a tool like telepresence to perform "magic" that teleports a local (outside minishift) process into the cluster. Why a local process? Because I want to use my favourite editor to edit code, people want to attach a debugger to the process, and writing code + creating a container + redeploying is too tedious.
Two things I dislike: minishift takes ~5 minutes to start, and having to use telepresence massively increases developer complexity. This means there are strong disincentives in place to do quick fixes. You start minishift and get a coffee, then work. telepresence mostly just works, but sometimes it does weird stuff. I have to pay the 5min startup penalty every time I start work because I can't leave stuff set up, ready to go (I would need a laptop per project).
The first point for this discussion: we should try and work on keeping the time-to-active-developing as low as possible for repeat developers. It isn't just about complexity/number of steps/automation. To make up a hypothetical scenario:
As someone who works on the project frequently I prefer to invest 20min one time so that the more frequent action of starting work takes only 30s.
-> one time costs are fine if they reduce repeat developer cost.
The second point: I have spent a lot of time recovering from "let us auto setup stuff for you" scripts that some projects use and advertise as the way to get developing. These scripts are great in achieving the goal of forcing the environment to be compliant to the needs of that particular project. They are also great at breaking the environment for all the other projects I work on. This takes the form of overwriting configuration files, installing tools, modifying environment variables, etc. In theory these are great scripts to have, in practice they find themselves in a carefully tuned environment that isn't quite like what they expected. The result is that they are like a bull in a china shop, not a back country hiker (leave no trace).
-> explicit instructions are better than implicit instructions.
> using wrong cluster, configure `kubectl` to use the right cluster
I use (a minimally modified version of) https://github.com/ahmetb/kubectx to switch between contexts. I think it doesn't create any additional config files and relies mostly on `~/.kube/config`. When I start `minikube` the right thing happens (it switches to the minikube context). `oc login` also works and switches context (this is part of OpenShift tooling).
It is great because it manages to co-exist with several (otherwise pretty opinionated) tools. It also means I don't have to specify `--context` or the like on the command line. To list all pods on the GKE mybinder cluster: `kctx binder-prod && kubectl get pods`; `kctx binder-ovh && kubectl get pods` to work on the OVH deployment; `kctx minikube && kubectl get pods` to get back on minikube.
One thing that is annoying is that it switches context globally for all your terminals.
A while back Min linked me to his snippets but I've not found time to translate them to ZSH/my weird shell config yet. (which also tells you that the global switch is annoying but not that annoying).
-> maybe adding call-outs to the docs at points where we know pain exists, or where we have found good helpers to reduce pain, is a way forward. Instructions like "Now check the list of pods with `kubectl get pods --namespace foo` to see that the hub pod is running and well. If not, use `kubectl logs <hub-pod-name> --namespace foo` to look at the logs." could have a call-out: "To avoid having to type out the namespace each time, check out [this little helper]() for Linux and another helper for OSX."
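Such a "little helper" could be as small as the Python sketch below. This is purely illustrative (the script name, the function name, and the default namespace `foo` are made up to match the example above): it just injects `--namespace` unless the caller already gave one.

```python
import subprocess
import sys

DEFAULT_NAMESPACE = "foo"  # hypothetical default, matching the docs example

def build_kubectl_argv(args, namespace=DEFAULT_NAMESPACE):
    """Return a kubectl command line, appending --namespace unless
    the caller already specified one (via --namespace or -n)."""
    argv = ["kubectl", *args]
    if "--namespace" not in args and "-n" not in args:
        argv += ["--namespace", namespace]
    return argv

if __name__ == "__main__":
    # e.g. `python kc.py get pods` runs `kubectl get pods --namespace foo`
    sys.exit(subprocess.run(build_kubectl_argv(sys.argv[1:])).returncode)
```

(A real helper would also want to handle the `--namespace=foo` form; this sketch keeps only the happy path.)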
@betatim thanks for writing these thoughts down!
I really want to get these local development instructions to be very useful, I want to feel I can recommend friends that this is a project that they can contribute to and learn a lot while doing it.
My summary of what I heard you say in my words:
Perhaps another way of approaching this is the types of developer who will contribute:
A fully contained VM is very helpful for the first group, but since you're only a K8s beginner once, it penalises the greater pool of devs who've worked on other k8s-related projects, since they need yet another custom development environment, and it penalises the third group for the reasons @betatim mentioned.
I think focussing on 1 and 3 for now is reasonable:
- `kubectl` and `helm` are required, e.g. by setting `KUBECONFIG`
Then iterate and refine
Below is an outline of my current idea of the local-dev instructions summarized. Important for this idea is that:
For this to happen, I've come up with an approach where the process follows a single path with A/B options along the way that don't influence later choices. I've also moved away from using bash scripts for anything other than quick automated installation of binaries for the CI system and the VM, as well as the publish script, which is only run by an advanced user or by the CI system anyhow.
1. a) `./dev.py kind start [--recreate]` - set up a kind cluster and initialize calico etc. on any platform (whether or not you're in a VM), assuming you have the required binaries available.
   b) Manually get a Kubernetes cluster set up and ready with the required dependencies.
2. a) `./dev.py upgrade` (chartpress; automatic detection of a kind cluster and, if so, loading of locally built images; helm upgrade; and kubectl port-forward to the proxy-public service)
   b) Manually do the steps of a).
3. a) `./dev.py test` (run pytest with suitable parameters and output relevant logs on failure)
   b) Do it manually.

Note: if you run things manually, how would `./dev.py upgrade` know about your cluster? Well, it wouldn't, and it would instantly complain that `KUBECONFIG` hasn't been explicitly set within your bootstrapped `.env` file. This allows the same cluster to be used repeatedly by the `./dev.py` script, and the risk of messing with a production cluster or similar goes down a lot, because you must have explicitly asked for a certain `KUBECONFIG` to be used... Hmm, now that I think of it, a `KUBECONFIG` can also contain several contexts, so I should make sure that one also specifies the context.
UPDATE: Resolution
The contributing docs work without a dedicated VM at this point, and we opted to avoid one, as it adds quite a bit of machinery which can be hard to maintain.
In #1422 I've planned for use of a VM, and after https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1422#discussion_r331005211 I figured it would make sense to create a dedicated issue to discuss whether it makes sense to focus on a VM based development setup.
Why CONTRIBUTING.md should focus on a VM based setup
An all-encompassing virtual environment
In Python there are plenty of tools to set up a virtual environment, to avoid mixing up Python versions or dependency requirements etc. across projects. In this repo, we are developing something that makes use of a lot of tools, many of which could benefit from a virtual environment: `kubectl`, `helm`, `kind`, `python`, `kubeval`. Consider for example the folders `~/.helm/` and `~/.kube`, and the configuration that is updated with use of `kubectl` and `helm`.
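To make concrete what state the VM would be isolating (the helper below is purely illustrative, not something the repo provides): both `kubectl` and `helm` honour environment variables that relocate exactly these folders, which is the isolation a VM gives you wholesale.

```python
import os
from pathlib import Path

def project_local_tool_env(project_dir="."):
    """Environment overrides that keep kubectl/helm state inside the
    project instead of the shared ~/.kube and ~/.helm folders."""
    root = Path(project_dir).resolve()
    return {
        "KUBECONFIG": str(root / ".kube" / "config"),
        "HELM_HOME": str(root / ".helm"),
    }

# Usage: merge into the environment of subprocess calls to kubectl/helm,
# e.g. env = {**os.environ, **project_local_tool_env()}.
env = {**os.environ, **project_local_tool_env()}
```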
A common experience for Windows/MacOS/Linux
I don't think it is sustainable to have instructions for multiple OSes, but like this, not only does it become possible, it becomes easier, as we would focus entirely on Ubuntu 18.04 in the VM. The only OS-specific split of instructions would reside in setting up the VM.
This makes Windows based contributors first class contributors.
Locally reproducible CI tests
We have CI tests and they are running on Ubuntu. If we want these to be easy to re-execute locally as part of a development process, we must ensure they don't assume anything about the OS etc. With a VM running Ubuntu 18.04 locally, and a TravisCI VM running 18.04, we are in a very good spot to make them re-executable locally without much effort.
Opinionated so others don't have to be
Providing many options on how to do something is not only more complicated, but can also add cognitive load on the person facing the options. I'd like for us to provide a clearly recommended option, and not suggest other options unless we can properly maintain their functionality.
An opinionated development environment
In this comment @minrk considers this:
But on the other hand, I find it to be a bit of a feature. If we have a VM where things are separated from the local setup, we can ensure that `kubectl` autocompletion is set up by default, for example. With these excellent features enabled by default, you get their value without the initial pain you would otherwise need to experience before realizing it's worth investing time in them. I acknowledge the downside, but I think the upside is worth more.