teamhephy / distribution

Apache License 2.0
0 stars 3 forks source link

Add support for new AWS regions #3

Open jfuechsl opened 4 years ago

jfuechsl commented 4 years ago

The current release of hephy is not able to boot clusters on newer AWS regions (e.g. eu-west-2). To fix this we have to update a few components and dependencies. For this project we need to 1) update aws-sdk-go package and 2) dynamically retrieve S3 regions from the AWS SDK (this is done in upstream as well). I can prepare a PR for this, but I'd like to know against which branch/commit to do this best. I can see in https://github.com/teamhephy/builder/commit/e880c749ee5d9b1cf84522838c43e108f05075ec that reverted the version of this dependency (to a quite old one). @Cryptophobia could you help me with that please?

kingdonb commented 4 years ago

Do you remember what broke here that caused us to revert?

@jfuechsl if you have fixed this in your environment, please go ahead and submit the PR, against the master branch should be fine. We will help you validate it.

I have to apologize that our PR builder CI infrastructure is not quite up to snuff, and so these changes have to be manually validated; this was a topic of discussion at our most recent Open Roadmap. This is something we aim to rectify soon, but for now the PR will have to be manually validated, which it will certainly help if you can say that you are using this in your environment.

Thanks for your contribution!

jfuechsl commented 4 years ago

@kingdonb Thanks for the response. I upgraded against the 0afef00d5764404d70f86076f364551657d51de6 commit, which is the one used by builder et. al. I don't think it's a problem to create it for the main branch, but I haven't tested it. What I can say is, we've been running the upgraded version (against the older commit above) for a while now and seen no issues.

Cryptophobia commented 4 years ago

@jfuechsl thank you for taking a look at this!

If I remember correctly, we reverted to https://github.com/teamhephy/builder/commit/e880c749ee5d9b1cf84522838c43e108f05075ec for the docker distribution package because when we tried to upgrade from upstream repo (https://github.com/docker/distribution) there were a lot of things that changed in the upstream that broke the builder builds. We tried it here: https://github.com/teamhephy/distribution/pull/2

For example, I have some notes that one of the changes was related to the docker context object which is how the docker client from the distribution package finds about the docker daemon and where the containers are running. Not sure at what version, but they change all of how the docker context object works and in what golang interface you can find it. At first it appeared to be a small change, but it turned out we would have to change a lot of the code if we continue with latest https://github.com/teamhephy/distribution from the upstream https://github.com/docker/distribution so we decided to revert because we were in a hurry for a new release.

Basically they changed the way one of the golang interfaces works so we have to migrate in our code to using the new version of that interface. However, when I was working on this I had limited time and was not able to find how the new docker context object was supposed to be used.

Cryptophobia commented 4 years ago

For this project we need to 1) update aws-sdk-go package and 2) dynamically retrieve S3 regions from the AWS SDK (this is done in upstream as well).

I'm all for your change if it has been tested and it works in production environments. :+1: Sounds like you will be changing mostly the golang aws sdk version and the builder custom code to dynamically retrieve the region so we won't be touching any of the docker/distribution code that could break. Ideally we want to tackle the distribution package and fix the builder code to work with the latest from the https://github.com/docker/distribution package at some point.

jfuechsl commented 4 years ago

@Cryptophobia thanks for clarifying. I ran into issues building the head of this repo so for now I'll submit the PR against the currently used version.

jfuechsl commented 4 years ago

I have the update branch ready at https://github.com/jfuechsl/distribution/tree/feat/update-aws-regions. We need a branch with HEAD at 0afef00d5764404d70f86076f364551657d51de6 in this repo to create the PR. Can anyone please create that?

Cryptophobia commented 4 years ago

@jfuechsl, your branch is here: https://github.com/teamhephy/distribution/tree/feature/jfuechsl-update-aws-regions

jfuechsl commented 4 years ago

@Cryptophobia thanks! The PR is submitted.

Cryptophobia commented 4 years ago

@jfuechsl , pushed a commit to fix gofmt of the distribution repo.

Should we make the PRs to update all the required components and add tests for this feature?

Also, how are you running the tests, I get various go errors like this one:

blobs.go:11:2: cannot find package "github.com/docker/distribution/digest" in any of:

This is a bit of a circular dependency here and I remember having to deal with this when I was looking at this repository before. @jfuechsl, Do you think we should do something like this in the Godep dependency management tool?

- package: github.com/docker/distribution
  repo: https://github.com/teamhephy/distribution
  version: 0afef00d5764404d70f86076f364551657d51de6
  vcs: git
jfuechsl commented 4 years ago

@Cryptophobia thanks! Can you run the tests successfully if you put the source into $GOROOT/src/github.com/docker/distribution? Yeah, My list of components to be updated is:

Cryptophobia commented 4 years ago

In regards to Godep, yes I think that's how it should be done.

How is the above configuration done with Godep? I think I took the above from how glide.yaml is set up for another project.

We need to figure out the distribution of the object-storage-cli binary I suppose?

Looks like it was hosted on Google Storage previously. I can create a teamhephy bucket for it and host it again there to make things easier. Let me know when you make the object-storage-cli PR and I will build and host it in our own bucket. Once that PR is merged in, all other PRs can follow after that?

I can start preparing PRs for those now.

Sure, let's start making PRs. What would be the best way to test this change since it requires quite a lot of changes to components and we also need to test in different regions?

kingdonb commented 4 years ago

I agree we should host our own object-storage-cli, I can definitely perform tests on AWS in a couple of different regions, especially the new regions, so we can verify all components of the release have the needed widgets required to run in the AWS regions we've been missing on account of updates.

@Cryptophobia let us know when there's a workflow-beta release please, and we can all have a crack at it. My lab account is pretty restrictive as far as regions go, but there's always personal account, and I think I could talk InfoSec into adding one or two more regions to lab permission boundary at least temporarily, we have a lab environment that exists pretty much just for this purpose.

Cryptophobia commented 4 years ago

@jfuechsl , are you working on making a PR to https://github.com/teamhephy/object-storage-cli first so we can build the object-storage-cli to host ourselves?

Cryptophobia commented 4 years ago

@jfuechsl , are you still working on this? When can we expect the patches to the other components?

jfuechsl commented 4 years ago

@Cryptophobia, sorry for the big delay, I wasn't able to devote any time to this in the past months. I'll continue here where I left off, but it could take a until end of May until I have the PRs ready.

Cryptophobia commented 4 years ago

Okay @jfuechsl , we are looking forward to it whenever it is ready. End of month of May would be ideal.

Just keep in mind we are doing some structural changes to https://github.com/teamhephy/builder/ and to https://github.com/teamhephy/distribution . You may need to rebase and use the new go mod structure for both.

Cryptophobia commented 4 years ago

Also the download link for objstorage-cli is something that broke recently and needs to be looked at: https://github.com/teamhephy/slugbuilder/commit/0042b02baa4af437cff918dbf012239ba0483bb4

Temporarily I have just moved objstorage inside the slugbuilder repo but ideally we should be building and hosting it on S3 somewhere.

jfuechsl commented 4 years ago

Looking at this again, it seems that https://github.com/teamhephy/distribution is fine in regards to what this issue is about. This means we can disregard #4. For https://github.com/teamhephy/object-storage-cli I submitted https://github.com/teamhephy/object-storage-cli/pull/2. We need this for most of the other projects that need updating.

jfuechsl commented 4 years ago

Also submitted https://github.com/teamhephy/registry/pull/6.