stackabletech / stackable-cockpit

Home of stackable-cockpit, stackablectl and stackable-cockpitd
https://docs.stackable.tech/management/stable/

Research option to uninstall Stacks and Demos #187

Open sbernauer opened 2 years ago

sbernauer commented 2 years ago

As a user, I want to be able to uninstall a stack or demo by itself after I'm done looking at it, so that I have a clean cluster again and don't have to remove things manually, which is error-prone.

Removing

Related ADR: https://docs.stackable.tech/home/stable/contributor/adr/adr031-resource-labels
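
To make that concrete: with consistent resource labels in place, an uninstall could in principle be driven by label selectors. The label key, demo name and namespace below are purely illustrative assumptions, not what ADR031 actually defines:

  # Illustrative sketch only: remove everything a demo created in its namespace,
  # assuming demos tagged their resources with a label like stackable.tech/demo=<name>
  kubectl delete all,pvc,configmaps,secrets -n default -l stackable.tech/demo=trino-taxi-data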

sbernauer commented 2 years ago

Just dropping a comment I sent in Slack for reference:

Addressed all comments in my current branch, will be merged soonish. Regarding the uninstall, there are some questions left for me: When uninstalling a demo, should we uninstall the stack and release as well? Should we remove e.g. the SupersetDB resource and the Superset init Kubernetes Job? (If not, future Supersets with the same name will not start.) Should we remove the PVCs of the Kafka, the Postgres, the MinIO? IMHO this is related to https://github.com/stackabletech/issues/issues/237 and we have https://github.com/stackabletech/stackable-cockpit/issues/187 for this on the stackablectl side of things.

adwk67 commented 2 years ago

Here's a note from the same slack chat:

Yes, I agree it may be problematic if uninstalling a demo removes everything the demo had installed, as a) some pre-existing things (like operators) would be ignored on install but not on uninstall, and b) subsequent deployments may depend on demo components. I think an uninstall should at least make it possible to run the install again, resulting in a clean setup. That may then require leaving some "demo install"-ed things untouched, if we feel that we might otherwise compromise users' environments. It depends a bit on who the target audience is - I tend towards the opinion that "demo is demo" and is therefore unsuited to anything other than a test environment. I think whichever way we go, we can cover such scenarios with an explanatory note - e.g. "uninstall will do X, Y and Z: ensure your environment is appropriate for this!"

fhennig commented 1 year ago

I had some thoughts on this:

Closing my other ticket in favor of this one https://github.com/stackabletech/stackablectl/issues/186

Maleware commented 1 year ago

I went through some stacks and demos and figured out some issues:

I just realised that if we have a demo in a namespace and then throw it away with k delete namespace, there are some leftovers on the cluster in:

This stuff prevents throwing away a namespace, creating a new one and installing another demo into it.
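
A quick way to spot such leftovers is to list cluster-scoped resources by label; the selector below assumes a vendor label along the lines of the resource-label ADR linked above and may not match what is actually set today:

  # Look for cluster-scoped resources that survive a namespace deletion
  kubectl get clusterroles,clusterrolebindings,crds -l stackable.tech/vendor=Stackable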

As a solution (having OpenShift in mind):

  1. Every operator should be installed in its own namespace (demos and stacks in the target namespace, default or user-given)
  2. stackablectl should check whether or not the operators used by the demo are installed and skip the install if so

This would give the user the freedom to install a demo or stack in a certain namespace, throw it away and install the next demo. Additionally, it gives us the capability to install various demos in different namespaces on one cluster.

I'm aware this feels like a stackablectl demo uninstall light, but by doing this we can avoid cleaning up each and every resource and just reuse them. Another thing solved by this is that you currently can't kill a stack or demo by simply deleting the namespace it's in: as long as PVCs are attached to a pod, the secret operator is needed to be able to delete those pods.
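
A minimal sketch of the "check whether the operator is already installed and skip it" idea, assuming operators are installed as Helm releases (the release name, chart and namespace below are only examples):

  # Install the commons operator only if its Helm release is not already present
  helm status commons-operator -n stackable-operators >/dev/null 2>&1 \
    || helm install commons-operator stackable-stable/commons-operator -n stackable-operators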

Maleware commented 1 year ago

On another note:

Since I'm testing the capabilities of stackablectl and the usage of stacks at the moment, I realised that we have monitoring and logging available as stacks. However, once you have installed one stack, you are not able to install any other stack besides it. I think, since we now have meta features like logging and monitoring, we should be able to install them via stackablectl easily and quickly.

As far as I can see, another advantage of installing operators in their own namespace is that we get this enabled out of the box as well.

soenkeliebau commented 1 year ago

Is it new that we cannot install another stack beside an existing one?

I regularly installed the datalake demo and then afterwards did sctl s in logging and sctl s in monitoring to get those features as well.
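
Spelled out, those shorthands correspond to something like the following stackablectl invocations:

  stackablectl stack install logging
  stackablectl stack install monitoring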

Maleware commented 1 year ago

That's a good comment! Haven't found this in the documentation yet! But it looks like... it works :)

I think it does not work if you try installing it in another namespace, as far as I can see.

soenkeliebau commented 1 year ago

That might create issues, yes... the Vector aggregator service, for example, would not resolve for the pods in a different namespace, I think.

Maleware commented 1 year ago

Yes, this in addition. However, you cannot install another stack in a separate namespace, since stackablectl gives you an error like this:

[INFO ] Installing commons operator in version 23.1.0
panic: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "commons-operator-clusterrole" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "monitoring": current value is "data-lakehouse"
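
The ownership clash can be verified by looking at the Helm release annotations on the existing ClusterRole mentioned in the error:

  kubectl get clusterrole commons-operator-clusterrole -o yaml | grep meta.helm.sh
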
soenkeliebau commented 1 year ago

Ah, so the issue there is that it tries installing the release again .. might be useful to have a switch "--skip-release" to install just the stack and demo portions, for clusters that were pre-installed with a release.

Maleware commented 1 year ago

Ah! That's a way easier solution than I had in mind.

Techassi commented 1 year ago

Ah, so the issue there is that it tries installing the release again .. might be useful to have a switch "--skip-release" to install just the stack and demo portions, for clusters that were pre-installed with a release.

This will be possible when https://github.com/stackabletech/stackable-cockpit/pull/79 is merged.
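
Once that is merged, the suggested invocation would presumably look roughly like this (the flag name is not final and only assumed here):

  stackablectl stack install monitoring --skip-release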

fhennig commented 9 months ago

I just updated the ticket a bit to include some new thoughts we had and also tried to incorporate the discussion in here so far.