We are ready to start working on the implementation.
It feels reasonable to try the blob implementation from https://github.com/google/go-cloud as an abstraction layer. To configure the storage, we can deprecate GCSConfiguration (while keeping compatibility, of course) and replace it with the gs://my-bucket notation used in go-cloud. Alternatively, we can keep GCSConfiguration and add something like S3Configuration and AzureConfiguration alongside it, with a structure similar to GCSConfiguration. As a free benefit, go-cloud also has implementations for file:// and mem://, which could be useful for testing purposes. Thoughts?
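For a sense of what that abstraction buys us, here is a minimal sketch using gocloud.dev/blob (the bucket and key names are made up): the same read/write code works for any registered scheme, and only the URL changes. The mem:// driver is what makes it handy for tests; for real storage you would pass gs://my-bucket or s3://my-bucket and bring the corresponding credentials.

package main

import (
    "context"
    "fmt"
    "log"

    "gocloud.dev/blob"
    // Each driver registers its URL scheme (gs://, s3://, file://, mem://) on import.
    _ "gocloud.dev/blob/fileblob"
    _ "gocloud.dev/blob/gcsblob"
    _ "gocloud.dev/blob/memblob"
    _ "gocloud.dev/blob/s3blob"
)

func main() {
    ctx := context.Background()

    // Only the URL changes per provider; "mem://" keeps everything in memory,
    // which is useful for tests.
    bucket, err := blob.OpenBucket(ctx, "mem://")
    if err != nil {
        log.Fatalf("open bucket: %v", err)
    }
    defer bucket.Close()

    // Write an object, then read it back through the same abstraction.
    if err := bucket.WriteAll(ctx, "pr-logs/build-log.txt", []byte("hello"), nil); err != nil {
        log.Fatalf("write: %v", err)
    }
    data, err := bucket.ReadAll(ctx, "pr-logs/build-log.txt")
    if err != nil {
        log.Fatalf("read: %v", err)
    }
    fmt.Println(string(data))
}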
I would prefer to see a design doc and maybe propose it in the weekly sig-testing meeting - this sounds like it's gonna be a big change.
cc @kubernetes/test-infra-maintainers
I would like to see us move towards more natural structures like gs://bucket/prefix URIs, with or without adding support for other clouds.
I would also like us to standardize on using prefix/org/repo rather than the current convoluted strategy (which, depending on whether it's the main repo or in the main org, often doesn't include the org).
A more detailed proposal we can review and comment on would be great.
Are we accepting pull requests for S3? We could implement a blob_store_configuration interface to replace gcs_configuration, with AWS as one of the implementations.
@rverma-nikiai sure thing! Let's start with a design document that would outline the approach you want to take, focusing on the interface that would unite the storage systems as well as the migration path for current deployments.
@stevekuznetsov maybe the suggestion from @apesternikov would be a better approach. I think the target is ultimately to reduce the direct dependency on the package "cloud.google.com/go/storage". We can also implement a generic storage interface to achieve the same. I have created a small doc at https://docs.google.com/document/d/1lAEfYW7xkn1IrW3GKh1WsWb1jxMB4RMrLVsZvXkKBHA/edit.
I would be able to evolve it as per your guidelines :)
Thanks!
The k8s.io/test-infra/pkg/io package already provides methods to read/write to a path, transparently choosing GCS or the local filesystem as appropriate.
I would like to see this design evolve in that direction and extend this package to support cloud storage systems beyond GCS.
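As a rough sketch of what such a storage-agnostic interface could look like (the method names and signatures below are illustrative, not copied from the actual pkg/io package):

// Illustrative sketch only; the method names and signatures are hypothetical
// and not copied from k8s.io/test-infra/pkg/io.
package storage

import (
    "context"
    "io"
)

// Opener abstracts "open this path for reading/writing" so callers never
// depend on a concrete storage client (GCS, S3, local filesystem, ...).
type Opener interface {
    // Reader opens a path such as "gs://bucket/key", "s3://bucket/key" or
    // "/tmp/local/file" for reading.
    Reader(ctx context.Context, path string) (io.ReadCloser, error)
    // Writer opens the path for writing, creating the object if needed.
    Writer(ctx context.Context, path string) (io.WriteCloser, error)
}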
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
@fejta @apesternikov @krzyzacy @rverma-nikiai
I would like to take this over if nobody is currently working on it. If I got it correctly, the general direction would be to try to extend the pkg/io Opener to handle S3 as well as GCS. I would play around with it a bit, try to get a working prototype and open a new proposal for this.
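As a first rough sketch of that direction (the helper name and path parsing below are made up, not the real test-infra code), the Opener could dispatch on the path prefix and let gocloud choose the provider:

// Hypothetical sketch of extending an Opener to S3.
package storage

import (
    "context"
    "fmt"
    "io"
    "strings"

    "gocloud.dev/blob"
    _ "gocloud.dev/blob/gcsblob" // registers the gs:// scheme
    _ "gocloud.dev/blob/s3blob"  // registers the s3:// scheme
)

// reader splits "gs://bucket/key" or "s3://bucket/key" into a bucket URL and
// an object key, then lets gocloud pick the matching provider.
func reader(ctx context.Context, path string) (io.ReadCloser, error) {
    if !strings.HasPrefix(path, "gs://") && !strings.HasPrefix(path, "s3://") {
        return nil, fmt.Errorf("unsupported scheme in %q", path)
    }
    trimmed := strings.TrimPrefix(strings.TrimPrefix(path, "gs://"), "s3://")
    parts := strings.SplitN(trimmed, "/", 2)
    if len(parts) != 2 {
        return nil, fmt.Errorf("path %q has no object key", path)
    }
    bucketURL, key := path[:5]+parts[0], parts[1] // both prefixes are 5 chars long

    bucket, err := blob.OpenBucket(ctx, bucketURL)
    if err != nil {
        return nil, err
    }
    // In real code the bucket itself would also need to be closed; this sketch
    // only returns the object reader.
    return bucket.NewReader(ctx, key, nil)
}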
Sure thing! I think you've got the right direction there.
/remove-lifecycle rotten
/assign @sbueringer
/woof
I have a working prototype for s3 :).
I wrote up a proposal here: https://docs.google.com/document/d/13O1zRX5rpmPCqsPzukavRVqyh4pLp95IeIBVWAZx6Oo/edit?usp=sharing
Someone did btw start working on this for S3 but does not seem to have the time to finish up that work: https://github.com/kubernetes/test-infra/pull/13400 https://kubernetes.slack.com/archives/CDECRSC5U/p1564678637187100?thread_ts=1562678138.497900&cid=CDECRSC5U
@alvaroaleman Thx, good to know. Not sure what's easier: starting a series of PRs from my working state or going through this huge PR :/
@stevekuznetsov @fejta What would be a useful next step? One idea would be to open a first PR to implement S3 support for Tide, which would be a relatively small step, and the design could be further discussed in the PR. What do you think?
Great work on this @sbueringer and all, I was taking a look at your proposal and would be more than happy to help out. I was wondering what the state of your prototype (https://github.com/sbueringer/test-infra/tree/c445) is, as I get a number of errors when attempting to build with bazel? I'm doing some more testing to make sure it's not just my build and will post details later. Let me know if there's anything I can help with!
@mikesplain the current state is that we've been running it internally with S3 for a few weeks. Not sure about the bazel errors. I'm currently on vacation, but when there's enough support to get this reviewed, I would start opening the first PRs early in January (afaik I proposed a few PRs in the doc). If you want to start opening PRs, be my guest 😀
Thanks @sbueringer! For those wishing to give this a try, I've gotten it running and plan to start more in-depth testing as well.
To fix a few bazel issues, you will likely need to run hack/update-bazel.sh.
To save others some time, here are some of the steps I took; I'll write up something more detailed later. (I apologize if this is repetitive or off-base to those who are more familiar with the test-infra codebase. This assumes you've followed the basic getting started guide and have a base understanding of the components and how to deploy them):
# I was unable to compile gerrit and issue-creator on this branch
# so I commented them out in prow/BUILD.bazel
hack/update-bazel.sh
# Point bazel to your docker registry
export PROW_REPO_OVERRIDE=gcr.io/<your repo>/k8s-prow
export EDGE_PROW_REPO=gcr.io/<your repo>/k8s-prow-edge
# Do the below or use prow/push.sh
bazel run //prow:release-push --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64
# Bump your yaml manifests or use `bump.sh` to update your prow.
prow/bump.sh
# Apply manifests or run the bazel command:
bazel run //prow/cluster:production.apply --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64
You'll also need to add an AWS credentials secret and mount it into deck, tide and a few other components. Right now it still uses the GCS flags. Create a secret:
# gcs-credentials secret in your prow namespace and test-pods namespace
{
  "region": "us-east-2",
  "access_key": "keyhere",
  "secret_key": "secrethere",
  "endpoint": "s3.amazonaws.com",
  "insecure": false,
  "s3_force_path_style": true,
  "bucket": "s3://s3bucketname"
}
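For illustration only, a component consuming that secret might unmarshal it into a struct along these lines; the type name and file name are made up, but the JSON tags mirror the fields above:

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "os"
)

// Illustrative only: the real type in prow may use different names, but the
// JSON tags mirror the secret shown above.
type s3Credentials struct {
    Region           string `json:"region"`
    AccessKey        string `json:"access_key"`
    SecretKey        string `json:"secret_key"`
    Endpoint         string `json:"endpoint"`
    Insecure         bool   `json:"insecure"`
    S3ForcePathStyle bool   `json:"s3_force_path_style"`
    Bucket           string `json:"bucket"`
}

func main() {
    // The file name is illustrative; it would be whatever the secret is mounted as.
    raw, err := os.ReadFile("s3-credentials.json")
    if err != nil {
        log.Fatal(err)
    }
    var creds s3Credentials
    if err := json.Unmarshal(raw, &creds); err != nil {
        log.Fatal(err)
    }
    fmt.Printf("uploading to %s in region %s\n", creds.Bucket, creds.Region)
}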
Excited to see a PR to merge that tree in!
Any progress on creating a PR for this issue? We are trying to deploy Prow on an AWS cluster and would like to use the decorator for our prowjobs, but it's failing because the initupload container requires a GCS bucket.
Unfortunately, I didn't have much time the last few weeks. The goal is to refactor step by step. The first PR "only" refactors the pkg/io package and enables tide and status-reconciler to use S3: https://github.com/kubernetes/test-infra/pull/16317
I created a PR in my fork to show the differences to the full-s3-support version I'm using in our environment: https://github.com/sbueringer/test-infra/pull/2 (the full-s3-support version currently completely ignores migration aspects, e.g. configuration, but the main components which I'm using work)
First PR is merged and tide, gerrit & status-reconciler can now be used with S3. The second PR for gcsupload, initupload and sidecar can be found here: https://github.com/kubernetes/test-infra/pull/16722
Great work! Looking forward to getting the rest of this upstream too :D
The next PR is merged. S3 is now also supported by initupload and sidecar. The next PR will be focused on deck/spyglass. This makes it possible to use all core components with S3 (I guess "core" depends on your point of view).
@fejta @alvaroaleman What other components are currently using GCS? I only know about crier.
yeah, Deck/Spyglass and crier seem to be the remaining ones:
$ ack -Q cloud.google.com/go/storage prow/
prow/cmd/deck/main.go
38: "cloud.google.com/go/storage"
prow/cmd/deck/job_history.go
32: "cloud.google.com/go/storage"
prow/cmd/deck/pr_history.go
30: "cloud.google.com/go/storage"
prow/cmd/crier/main.go
26: "cloud.google.com/go/storage"
prow/cmd/gerrit/main_test.go
31: "cloud.google.com/go/storage"
prow/crier/reporters/gcs/internal/util/util.go
26: "cloud.google.com/go/storage"
prow/crier/reporters/gcs/kubernetes/reporter.go
34: "cloud.google.com/go/storage"
prow/crier/reporters/gcs/reporter.go
27: "cloud.google.com/go/storage"
prow/pod-utils/gcs/upload.go
28: "cloud.google.com/go/storage"
prow/spyglass/gcsartifact_test.go
27: "cloud.google.com/go/storage"
prow/spyglass/testgrid.go
25: "cloud.google.com/go/storage"
prow/spyglass/spyglass.go
31: "cloud.google.com/go/storage"
prow/spyglass/gcsartifact.go
25: "cloud.google.com/go/storage"
prow/spyglass/gcsartifact_fetcher.go
32: "cloud.google.com/go/storage"
prow/tide/history/history_test.go
29: "cloud.google.com/go/storage"
Hi @sbueringer,
First of all, amazing job on implementing https://github.com/kubernetes/test-infra/pull/16722. If you don't mind, would you please support the default credential provider chain (see diff below)? Static credentials are not possible in our case since we only have a federated account with MFA enabled. We heavily rely on the built-in providers.
----------------------- prow/io/providers/providers.go ------------------------
index eac46a8c0..3911918a9 100644
@@ -94,11 +94,13 @@ func getS3Bucket(ctx context.Context, creds []byte, bucketName string) (*blob.Bu
     if err := json.Unmarshal(creds, s3Credentials); err != nil {
         return nil, fmt.Errorf("error getting S3 credentials from JSON: %v", err)
     }
-
-    staticCredentials := credentials.NewStaticCredentials(s3Credentials.AccessKey, s3Credentials.SecretKey, "")
+    var configCreds *credentials.Credentials
+    if s3Credentials.AccessKey != "" && s3Credentials.SecretKey != "" {
+        configCreds = credentials.NewStaticCredentials(s3Credentials.AccessKey, s3Credentials.SecretKey, "")
+    }
     sess, err := session.NewSession(&aws.Config{
-        Credentials:      staticCredentials,
+        Credentials:      configCreds,
         Endpoint:         aws.String(s3Credentials.Endpoint),
         DisableSSL:       aws.Bool(s3Credentials.Insecure),
         S3ForcePathStyle: aws.Bool(s3Credentials.S3ForcePathStyle),
@ramirezag I assume you're using Prow in AWS. Did you try to use it without specifying an s3-credentials-file? I couldn't try it, but it should auto-discover everything (that should be the default in gocloud: https://github.com/kubernetes/test-infra/blob/master/prow/io/providers/providers.go#L71). The io package falls back to the gocloud default discovery if no s3-credentials-file is specified.
Of course this assumes that you don't need anything else from the configuration file.
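As a rough sketch of what that fallback means in practice (the bucket name and region below are illustrative, and this is not the actual prow code): opening an s3:// URL through gocloud without any static credentials builds a default AWS session, so the usual credential chain (env vars, shared config files, instance/IRSA role) applies.

package main

import (
    "context"
    "log"

    "gocloud.dev/blob"
    _ "gocloud.dev/blob/s3blob" // registers the s3:// scheme
)

func main() {
    ctx := context.Background()

    // No static credentials are passed anywhere: the s3blob URL opener builds
    // a default AWS session, so credentials come from the usual chain
    // (env vars, shared config/credentials files, instance or IRSA role).
    bucket, err := blob.OpenBucket(ctx, "s3://my-prow-artifacts?region=eu-central-1")
    if err != nil {
        log.Fatalf("open bucket: %v", err)
    }
    defer bucket.Close()

    ok, err := bucket.Exists(ctx, "pr-logs/started.json")
    if err != nil {
        log.Fatalf("exists check: %v", err)
    }
    log.Printf("object present: %v", ok)
}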
@sbueringer yes, we're using Prow in AWS. Ok, I will test without specifying s3-credentials-file.
@sbueringer GCS or S3 credentials are required when using pod-utilities.
@ramirezag I think that check should be removed. Can you test against a version without this check?
@sbueringer I honestly haven't figured out how to build or generate the prow docker images - I'm still waiting for a reply in the #prow channel. So I cannot test what you suggested.
Removed if o.GcsCredentialsFile == "" && o.S3CredentialsFile == "" in prow/gcsupload/options.go and prow/apis/prowjobs/v1/types.go, and now I get:
{"component":"initupload","error":"failed to upload to blob storage: failed to upload to blob storage: encountered errors during upload: [writer close error: blob (code=Unknown): MissingRegion: could not find region configuration writer close error: blob (code=Unknown): MissingRegion: could not find region configuration upload error: writer close error: blob (code=Unknown): MissingRegion: could not find region configuration]","file":"prow/cmd/initupload/main.go:45","func":"main.main","level":"fatal","msg":"Failed to initialize job","time":"2020-04-13T20:35:30Z"}
This is my config:
plank:
  pod_pending_timeout: 15m
  pod_unscheduled_timeout: 1m
  default_decoration_configs:
    '*':
      timeout: 2h
      grace_period: 15s
      utility_images:
        clonerefs: "{{ .Values.prow_component_image_repo }}/clonerefs:{{ .Values.prow_component_image_tag }}"
        initupload: "{{ .Values.prow_component_image_repo }}/initupload:{{ .Values.prow_component_image_tag }}"
        entrypoint: "{{ .Values.prow_component_image_repo }}/entrypoint:{{ .Values.prow_component_image_tag }}"
        sidecar: "{{ .Values.prow_component_image_repo }}/sidecar:{{ .Values.prow_component_image_tag }}"
      gcs_configuration:
        bucket: s3://somebucket?region=eu-central-1
        path_strategy: "single"
        default_org: org
        default_repo: repo
@ramirezag can you please open a dedicated issue for the S3 default credential provider chain and keep this one focused on "Add S3/Azure support to Prow where we currently support GCS only"?
prow: deck: add s3 support #17183 is almost merged.
Some followups for the deck PR: finished.json PR
New PR for crier: prow: crier: add s3 support #17618
Hello! Ridiculously excited at the prospect of trying out Prow in AWS; I've had my eye on this effort for a while.
Looks like everything pending has been merged - curious if it's pretty much good to go now?
@jhohertz yes, S3 support is finished, check out config/prow/cluster/starter-s3.yaml.
/retitle S3 support in Prow
@apesternikov I will close this issue as the S3 support is finished. If you (or anyone else, for that matter) are still interested in Azure support, please open a dedicated issue for that. It should be much easier now that we have more than one provider supported already.
/close
@alvaroaleman: Closing this issue.
@jhohertz Yup, should be good to go. I, and I guess a few other people, have been running this with S3 for months now (without any custom patches). I'm only using it with Minio, but we had 1-2 PRs for AWS, so I guess it also works there.
We've been using it with native aws/s3 for a few months without issue, thanks to everyone for the hard work!
Awesome! Thanks for the pointers on getting started, look forward to diving in.
Currently GCS is the only option to use for logs in Prow.
What would you like to be added: It would be very useful to be able to use AWS/Azure blob services in addition to GCS.
Why is this needed: Some companies are using non-GCS blob storage extensively. It would be convenient if they could extend its use to Prow. Also, certain corporate IT policies/restrictions may prevent corporate users from adopting Prow as a CI/CD solution.