kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Support AWS eu-north-1 #6204

Closed bucht closed 5 years ago

bucht commented 5 years ago

Add support for the new AWS region eu-north-1 with 3 AZ: https://aws.amazon.com/blogs/aws/now-open-aws-europe-stockholm-region/

willnewby commented 5 years ago

I'd like to try and work on this, since it seems like a good first-time issue.

@bucht if I run kops create like so:

ZONES="eu-north-1a,eu-north-1b,eu-north-1c"
kops create cluster \
     --cloud aws \
     --zones $ZONES \
     --master-zones $ZONES

Then creating the cluster works for me, though the --cloud aws flag is required since kops doesn't currently support inferring AWS from the eu-north-1 zones. It also looks like AMIs still need to be published for that region, since I'm getting the following:

$ kops update cluster <cluster> --yes
. . .
W1212 13:44:11.186966    3301 executor.go:130] error running task "LaunchConfiguration/master-eu-north-1a.masters.<cluster>" (9m52s remaining to succeed): could not find Image for "kope.io/k8s-1.10-debian-jessie-amd64-hvm-ebs-2018-08-17"
. . .
bucht commented 5 years ago

https://github.com/kubernetes/kops/blob/master/upup/pkg/fi/cloud.go Seems to be the magic file for mapping it to AWS :)

willnewby commented 5 years ago

@bucht If you specify the create command the way I did above, subscribe to Debian 9 in the AWS Marketplace (https://aws.amazon.com/marketplace/server/procurement?productId=572488bb-fc09-4638-8628-e1e1d26436f4), and then use ami-e133bc9f (the AMI ID for Debian 9 in eu-north-1), your cluster should work. I'd suggest using this image until there's an official image published for kops in eu-north-1.
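Putting the workaround together, the full create command might look roughly like this. This is a sketch, not a verified recipe: the cluster name is a placeholder, and --image is assumed to accept a raw AMI ID here (it is a documented kops flag).

```shell
# Sketch of the workaround above: pin the cloud provider (kops cannot yet
# infer AWS from eu-north-1 zones) and the Debian 9 AMI for eu-north-1.
# "example.k8s.local" is a placeholder cluster name.
ZONES="eu-north-1a,eu-north-1b,eu-north-1c"
kops create cluster \
    --cloud aws \
    --zones "$ZONES" \
    --master-zones "$ZONES" \
    --image ami-e133bc9f \
    example.k8s.local
kops update cluster example.k8s.local --yes
```

Note that this requires the Marketplace subscription above to be in place first, or the instances will fail to launch.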

I'm also working on a PR to default the cloud provider with the new AZs.

nikoul commented 5 years ago

@willnewby Using your tip with the Debian 9 image does not work. The cluster got deployed, but in the logs of all nodes I see:

Dec 19 09:08:51 ip-1xx-xx-xx-xx nodeup[602]: W1219 09:08:51.174197 602 main.go:142] got error running nodeup (will retry in 30s): error loading Cluster "s3://KOPS-BUCKET/cluster.spec": eu-north-1 is not a valid region

willnewby commented 5 years ago

@nikoul thanks for the heads up, I'll investigate further.

svanlund commented 5 years ago

I'm quite new to Kubernetes and also interested in setting it up in this new AWS region. The AMI hint got me one step further. Thanks @willnewby! Not quite there yet, but I did some additional troubleshooting that might at least get this issue closer to a resolution. I know that the issue has been taken so to speak, but I'm hoping that my comment is still welcome :)

The "not a valid region" error from nodeup seems to come from the "validateRegion" function at https://github.com/kubernetes/kops/blob/master/util/pkg/vfs/s3context.go#L301. This code relies on the AWS SDK for Go (aws-sdk-go), indicating that Kops itself doesn't hold this information about what is valid and what is not. The new "eu-north-1" region definition was added in the aws-sdk-go v1.16.3 release, see https://github.com/aws/aws-sdk-go/commit/818fa5c274d4dee665096411e2bad037f47c31f7. This particular dependency was bumped to v1.16.9 (past the aws-sdk-go version where "eu-north-1" was introduced) in kops master branch on Dec 20, see https://github.com/kubernetes/kops/commit/ed3101ed1e1695c1b60325ef9f0bc3c2796e61d4#diff-836546cc53507f6b2d581088903b1785.
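As an illustration of why this fails (this is not the actual kops source), the shape of the problem is roughly the following: validateRegion consults a region table compiled into the binary via the SDK, so a binary built against an old SDK rejects eu-north-1 no matter what kops itself is configured with. The knownRegions map here is a stand-in for the SDK's endpoint data.

```go
package main

import "fmt"

// knownRegions stands in for the region table that aws-sdk-go ships with.
// Before aws-sdk-go v1.16.3 that table did not include eu-north-1, so any
// binary built against an older SDK rejects the region regardless of any
// kops-side configuration.
var knownRegions = map[string]bool{
	"eu-west-1":  true,
	"eu-north-1": true, // present only in aws-sdk-go >= v1.16.3
}

// validateRegion sketches the check in util/pkg/vfs/s3context.go: it asks
// the compiled-in SDK data, which is why the fix is a dependency bump
// followed by rebuilding the binaries that embed it (nodeup, etc.).
func validateRegion(region string) error {
	if !knownRegions[region] {
		return fmt.Errorf("%s is not a valid region", region)
	}
	return nil
}

func main() {
	fmt.Println(validateRegion("eu-north-1"))
}
```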

There has been a kops release since then (1.10.1 on Dec 24), but I understand that the nodeup binary is downloaded by the user-data script executed on the instance after launch; the template is in https://github.com/kubernetes/kops/blob/master/pkg/model/resources/nodeup.go. I didn't have any luck with this recent kops version, so I built nodeup myself locally. I uploaded the nodeup binary and a nodeup.sha1 file (created manually from the output of the sha1sum command) to a personal S3 bucket, then ran "export NODEUP_URL=https://example-bucket.s3.amazonaws.com/nodeup" in the terminal to get kops to use my nodeup URL in the user-data script.

After creating a new cluster, the "not a valid region" error no longer showed up on the new worker node. However, there was now a lot of output about name-lookup failures for "api.internal.example-cluster.k8s.local" (yes, using gossip). That's when I took a break from troubleshooting...
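The build-and-upload steps described above could look roughly like this. The make target, build output path, and bucket name are assumptions for illustration, not verified against the kops Makefile; only the sha1 sidecar file and the NODEUP_URL variable come from the comment itself.

```shell
# Sketch, assuming a kops source checkout and an existing S3 bucket.
# Build nodeup locally (target/path may differ by kops version).
make nodeup

# Publish the binary plus a .sha1 checksum file next to it, as the
# user-data script verifies the download against that sidecar file.
NODEUP_BIN=.build/dist/nodeup   # assumed output location
sha1sum "$NODEUP_BIN" | awk '{print $1}' > nodeup.sha1
aws s3 cp "$NODEUP_BIN" s3://example-bucket/nodeup --acl public-read
aws s3 cp nodeup.sha1 s3://example-bucket/nodeup.sha1 --acl public-read

# Make kops embed this URL in the generated user-data.
export NODEUP_URL=https://example-bucket.s3.amazonaws.com/nodeup
```

The key detail is that NODEUP_URL must be exported before running kops create/update, since it only affects newly rendered user-data.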

oekarlsson commented 5 years ago

I tried this on the new kops 1.11, but no luck. I could get past the "image not found" error using the stock Debian image, but when the instances come up they fail to run nodeup with "not a valid region". Can we expect this to work in 1.12, or will there be a workaround that we perhaps can try sooner?

peter-svensson commented 5 years ago

@svanlund I built nodeup (from the master branch) and, like you, got past the nodeup step. Now stuck at: error reading addons from "s3://example-bucket/test-sthlm-cluster/addons/bootstrap-channel.yaml": eu-north-1 is not a valid region

This comes from the protokube docker container, which runs the channels command internally. Building channels from the master branch and inserting that into the container seems to work, and my cluster is up and running. So it should be possible to build a local protokube image and use the PROTOKUBE_IMAGE env variable when creating the cluster... Will continue to investigate.

Update: it works just fine to build a local protokube image, export it, and upload it to an S3 bucket.
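A rough sketch of that protokube workaround; the make target and tarball path are guesses (check the kops Makefile for the real names), and only the PROTOKUBE_IMAGE variable itself comes from the comment above.

```shell
# Sketch, assuming a kops source checkout with a fixed aws-sdk-go.
# Build and export the protokube image as a tarball (target name assumed).
make protokube-export

# Upload the image tarball somewhere the instances can fetch it.
aws s3 cp .build/dist/images/protokube.tar.gz \
    s3://example-bucket/protokube.tar.gz --acl public-read

# Point kops at the custom image before creating/updating the cluster.
export PROTOKUBE_IMAGE=https://example-bucket.s3.amazonaws.com/protokube.tar.gz
```

As with NODEUP_URL, this only takes effect for user-data rendered after the variable is exported.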

nikoul commented 5 years ago

Any news on this? Do you know when it will be possible to run automated kops in this region? Thanks!

yazzzir commented 5 years ago

I'm running kops 1.11 in eu-north-1. I have tested a coreos ami as well as a debian ami and both have the same behavior.

Everything seems to work OK, but nodeup doesn't seem to run automatically, or it hangs. For a 3-master, 5-node deployment (gossip DNS), the api ELB reports 0 of 3 healthy masters after cluster creation (I only waited approximately 10 min). However, when I create a bastion instance group, jump into the masters, and manually run nodeup on them, the ELB starts to report healthy instances.

At that point, kubectl get nodes only returns the masters, so I suspect a manual run of nodeup on the workers will also bring the workers online.

Curious whether I'm missing something. Previously my experience in eu-west-1 was end to end without intervention.

yazzzir commented 5 years ago

I built kops from source (the entire stack), Version 1.12.0-alpha.1 (git-743b319fc), and it runs just fine with ami-e133bc9f in eu-north-1 (end to end without intervention), so the next release will hopefully fix this.

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 5 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

fejta-bot commented 5 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

k8s-ci-robot commented 5 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/kops/issues/6204#issuecomment-515778345):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.