Qarik-Group / bootstrap-kubernetes-demos

Bootstrap Cloud Foundry, Knative, Kpack and other systems onto Kubernetes

[wip] Azure bootstrapping #21

Closed drnic closed 4 years ago

drnic commented 4 years ago

Based on #20 by @immae1

drnic commented 4 years ago

@immae1 could you give this PR/branch a try?

drnic commented 4 years ago

Rebase submodules to ensure you get helm 2.14.3 (and not broken 2.15.0)

immae1 commented 4 years ago

Wow, thanks a lot for integrating my script into your repo, what an honor! To verify that I use the right submodules, I checked out your repo into /tmp/bootstrap-kubernetes/. The AKS cluster creation works fine, but there is an error in helm-tiller-manager: it seems that the .versions file of the subrepo is ignored.

Workaround: add the helm version to .versions of this repo (bootstrap-kubernetes-demos) too. But in this file the keyword helm also appears in "scf-helm-file=scf-3.0.0-8f7a71d1.tgz", so the grep fails and creates a wrong URL for the curl call. If I change the grep in the helm-manager script to an anchored one (version=$(grep '^helm' .versions | cut -d= -f2)), so that only the correct helm version is matched, I have the correct version in the variable, but the helm installation won't start. If I comment out:

[[ "$(command -v helm)X" == "X" || "$(helm version --client | grep "$version")X" == "X" ]] &&

it seems the installation will run, but it fails with:

/tmp/bootstrap-kubernetes-demos/vendor/helm-tiller-manager/bin/helm-manager: line 100: _helm: command not found

Maybe there is an issue with differences between Linux and Darwin?!
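The grep problem described above can be sketched in isolation. This is a minimal illustration, not the actual helm-manager code: the `helm=2.14.3` line and the temporary file are assumptions made for the demo, and only the `scf-helm-file` entry is taken from the thread. It anchors on `^helm=`, slightly stricter than `^helm`, so that a hypothetical key such as `helm-foo` would not match either.

```shell
# Hypothetical .versions content; only the scf-helm-file line comes from the thread.
versions_file="$(mktemp)"
cat > "$versions_file" <<'EOF'
helm=2.14.3
scf-helm-file=scf-3.0.0-8f7a71d1.tgz
EOF

# Unanchored pattern: matches every line that contains "helm" anywhere.
loose="$(grep 'helm' "$versions_file" | cut -d= -f2)"

# Anchored on the key and the "=" separator: matches only the helm entry.
strict="$(grep '^helm=' "$versions_file" | cut -d= -f2)"

printf 'loose:  %s\n' "$loose"
printf 'strict: %s\n' "$strict"
rm -f "$versions_file"
```

With the unanchored pattern, the variable holds two values on two lines, which would be enough to produce a malformed download URL.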

And another thing: how do I use the suspend/resume options for Azure AKS? I tried:

bootstrap-kubernetes-demos up --az suspend (starts the AKS up option)
bootstrap-kubernetes-demos -az suspend (gives me only the menu output)

Bootstrap Kubernetes and/or subsystems for demonstrations:
  up
    [--gke|--google] -- bootstrap new Google GKE cluster
    [--az|--azure] -- bootstrap new Azure AKE cluster

I think the bootstrap-kubernetes-demos script cannot pass on the suspend/resume parameters?!

maeddes commented 4 years ago

I am also seeing a problem with helm-manager

helm-manager up
installing helm vscf-3.0.0-8f7a71d1.tgz cf-operator-v0.4.0+1.g3d277af0.tgz into /Users/matthiashaeussler/git/gcloud-stuff/bootstrap-kubernetes-demos/vendor/helm-tiller-manager/bin/
curl: (3) Illegal characters found in URL
tar: darwin-amd64/helm: Not found in archive
tar: Error exit delayed from previous errors.

drnic commented 4 years ago

I was able to reproduce some of the issues mentioned, and I apologize for them. They are fixed.

Please try a fresh git clone of the branch (note the --recursive flag too):

git clone https://github.com/starkandwayne/bootstrap-kubernetes-demos -b azure --recursive
cd bootstrap-kubernetes-demos
direnv allow
bootstrap-kubernetes-demos up --az --scf
maeddes commented 4 years ago

Thanks for the fix. This is working for me now. The script completes successfully.

It works in the sense that the SCF deployment behaves the same as a manual install. I am still running into the issue described here: https://github.com/SUSE/kubecf/issues/46

The droplet is getting built but can't be pushed to the container registry and the pod tries to pull an image, which is not there. Same error in the bits service:

scf-bits-v1-0 bits-service-bits-service {"level":"error","ts":1571814539.1479647,"caller":"http/server.go:3010","msg":"http: TLS handshake error from 10.240.0.4:50934: remote error: tls: bad certificate"}
immae1 commented 4 years ago

For me on Linux the AKS part works, but I have to fix one line: replace basename with dirname in bootstrap-kubernetes-demos, line 49 (for requested in $(find state/systems/* -print0 | xargs -0 dirname); do). I run bootstrap-kubernetes-demos up --az --scf, and after the nodes I get the message: No systems selected.

Running the separate scripts works: ./bin/bootstrap-system-helm up
The tiller pod is ready: kube-system pod/tiller-deploy-67b7f5bfdb-ws9cw 1/1 Running 0 48s, but helm doesn't work as expected if I run:

helm install --tls stable/wordpress
Error: failed to download "stable/wordpress" (hint: running `helm repo update` may help)
immi@immi-VirtualBox:/tmp/newtrynew/bootstrap-kubernetes-demos$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "stable" chart repository
Update Complete.
immi@immi-VirtualBox:/tmp/newtrynew/bootstrap-kubernetes-demos$ helm install --tls stable/wordpress
Error: failed to download "stable/wordpress" (hint: running `helm repo update` may help)

Installing scf also fails:

immi@immi-VirtualBox:/tmp/newtrynew/bootstrap-kubernetes-demos$ export HELM_TLS_VERIFY=${HELM_TLS_VERIFY:-true}
immi@immi-VirtualBox:/tmp/newtrynew/bootstrap-kubernetes-demos$ ./bin/bootstrap-system-scf up
Install Cloud Foundry/Eirini (scf) for scf.suse.dev
--> Using scf-3.0.0-8f7a71d1.tgz
+ helm upgrade --install --namespace scf scf https://scf-v3.s3.amazonaws.com/scf-3.0.0-8f7a71d1.tgz --set system_domain=scf.suse.dev --set features.eirini=true
Release "scf" does not exist. Installing it now.
Error: validation failed: unable to recognize "": no matches for kind "BOSHDeployment" in version "fissile.cloudfoundry.org/v1alpha1"

I think the parameters from bootstrap-kubernetes-demos are not passed into the called scripts, or something with the state fails under Linux. Also, suspending the cluster VMs doesn't work as expected (when calling it through bootstrap-kubernetes-demos up --azure suspend).

Do you need any further information?

Many thanks to you 👍

drnic commented 4 years ago
helm install --tls stable/wordpress

This doesn't work because I decided to change the default repo that was added. https://github.com/starkandwayne/bootstrap-kubernetes-demos/blob/azure/bin/bootstrap-system-helm#L8

I'll consider changing it back.

drnic commented 4 years ago
+ helm upgrade --install --namespace scf scf https://scf-v3.s3.amazonaws.com/scf-3.0.0-8f7a71d1.tgz --set system_domain=scf.suse.dev --set features.eirini=true
Release "scf" does not exist. Installing it now.
Error: validation failed: unable to recognize "": no matches for kind "BOSHDeployment" in version "fissile.cloudfoundry.org/v1alpha1"

I've never seen this. How do I reproduce it?

drnic commented 4 years ago

How i have to use the suspend / resume options for azure aks?

bootstrap-infrastructure-azure suspend
bootstrap-infrastructure-azure resume
drnic commented 4 years ago

Replace basename with dirname

They are the opposite of each other; they aren't interchangeable.

$ pwd
/Users/drnic/Projects/kubernetes/bootstrap-kubernetes-demos
$ basename `pwd`
bootstrap-kubernetes-demos
$ dirname `pwd`
/Users/drnic/Projects/kubernetes
immae1 commented 4 years ago

To reproduce, I did the following:

  1. bootstrap-kubernetes-demos up --az --scf (this failed with "No systems selected")

  2. export HELM_TLS_VERIFY=${HELM_TLS_VERIFY:-true} && ./bin/bootstrap-system-helm up

  3. ./bin/bootstrap-system-scf up

OK, if basename and dirname aren't interchangeable, maybe this is the problem. I did this because I got these errors:

bootstrap-kubernetes-demos up --azure --scf
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
basename: extra operand 'state/systems/scf'
Try 'basename --help' for more information.
Verify that the Microsoft.Network, Microsoft.Storage, Microsoft.Compute, and Microsoft.ContainerService providers are enabled:
--> Microsoft.Network: Registered
--> Microsoft.Storage: Registered
--> Microsoft.Compute: Registered
--> Microsoft.ContainerService: Registered
Creating Azure Resource Group...
Location    Name
----------  -----------
westeurope  immi-7wwjnw
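
The "extra operand" errors above are consistent with several paths being handed to a single basename call: GNU basename accepts one name plus an optional suffix unless -a is given. A possible portable variant, sketched here under assumptions, runs basename once per path. The temporary directory and the file names are made up for the demo; this is not the code in the repository.

```shell
# Hypothetical layout mirroring state/systems/* from the thread.
workdir="$(mktemp -d)"
mkdir -p "$workdir/state/systems"
touch "$workdir/state/systems/helm" \
      "$workdir/state/systems/scf" \
      "$workdir/state/systems/knative"

# -n 1 makes xargs invoke basename once per path, so each call sees one operand.
systems="$(cd "$workdir" && find state/systems/* -print0 | xargs -0 -n 1 basename | sort)"

printf '%s\n' "$systems"
rm -rf "$workdir"
```

Swapping in dirname silences the error but returns state/systems for every entry, which would explain the "No systems selected" message.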
immae1 commented 4 years ago

How i have to use the suspend / resume options for azure aks?

bootstrap-infrastructure-azure suspend
bootstrap-infrastructure-azure resume

suspend gives me: ./bin/bootstrap-infrastructure-azure: line 148: NODEPOOL_NAME: unbound variable
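
This kind of message usually means a variable was expanded while unset in a shell running with `set -u`. The sketch below assumes that is what happens in the script and shows the difference between a bare expansion and one with a default. The value `nodepool1` is a placeholder chosen for the demo, not something read from the script or from the cluster.

```shell
# Assumption: the script runs with `set -u`, which turns unset variables into errors.
unset NODEPOOL_NAME

# A bare expansion aborts the (sub)shell when the variable is unset.
if (set -u; : "$NODEPOOL_NAME") 2>/dev/null; then
  bare="ok"
else
  bare="failed"
fi

# A default keeps the expansion valid even under `set -u`.
# "nodepool1" is a placeholder value for illustration only.
guarded="$(set -u; printf '%s' "${NODEPOOL_NAME:-nodepool1}")"

printf 'bare expansion: %s\n' "$bare"
printf 'guarded expansion: %s\n' "$guarded"
```

Whether a default is appropriate here, or whether the name should be looked up from the cluster, depends on how the node pool was created.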

drnic commented 4 years ago

I'm going to remove suspend/resume from the initial PR; and add them into a new branch/PR.

My reason is that I'm not seeing any VMs so I'm not yet sure what suspend/resume should do, and if/how nodepool name is important?

$ az aks list -o table
Name          Location    ResourceGroup    KubernetesVersion    ProvisioningState    Fqdn
------------  ----------  ---------------  -------------------  -------------------  -------------------------------------------------------------
drnic-qqa0av  westus2     drnic-qqa0av     1.15.4               Succeeded            drnic-qqa0-drnic-qqa0av-e8e346-46826277.hcp.westus2.azmk8s.io
$ az vm list
[]
drnic commented 4 years ago

Ok, this PR is now merged. For any remaining issues or features let's create new issues/pull requests.

@immae1 For the basename issue, can you help me reproduce it please in a new issue? Is there a docker run command that represents the OS distro that you're using that generates the same error?