kubernetes / k8s.io

Code and configuration to manage Kubernetes project infrastructure, including various *.k8s.io sites
https://git.k8s.io/community/sig-k8s-infra
Apache License 2.0

Cleanup staging buckets in boskos #4691

Open ameukam opened 1 year ago

ameukam commented 1 year ago

Each GCP project in the boskos pool has a staging bucket used to upload binaries built during test execution.

Those buckets were never cleaned up and now impact our budget with the GCS pricing change.

We should clean up those buckets, make them regional, and introduce a retention policy (7 days?).
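
For the retention piece, an age-based object lifecycle rule is probably a better fit than a GCS retention policy (a retention policy also blocks overwrites and early deletes, which CI jobs need). A rough sketch only; the JSON file name is arbitrary and the bucket name is the example from the listing further down in this issue:

cat > lifecycle-7d.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 7}
    }
  ]
}
EOF
# attach the 7-day delete rule to one staging bucket
gsutil lifecycle set lifecycle-7d.json gs://kubernetes-staging-485128143e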

/sig testing
/milestone v1.27

ameukam commented 1 year ago

cc @BenTheElder @thockin

ameukam commented 1 year ago

For a boskos project:

gcloud alpha storage buckets list --project k8s-infra-e2e-boskos-001 --format='table(name,locationType,location)'
NAME                                        LOCATION_TYPE  LOCATION
kubernetes-staging-485128143e               multi-region   US
kubernetes-staging-485128143e-asia          multi-region   ASIA
kubernetes-staging-485128143e-eu            multi-region   EU
kubernetes-staging-485128143e-europe-west6  region         EUROPE-WEST6

gsutil du -sh gs://kubernetes-staging-485128143e
288.45 GiB   gs://kubernetes-staging-485128143e
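
To get the same picture across the whole boskos pool, a small loop over the projects works; this is a sketch built from the same commands shown above, not something that was run as part of this issue:

for prj in $(gcloud projects list --filter='projectId~^k8s-infra-e2e-boskos-' --format='value(projectId)'); do
  echo "== ${prj} =="
  for b in $(gcloud storage buckets list --project "${prj}" --format='value(name)'); do
    # total size of each staging bucket in the project
    gsutil du -sh "gs://${b}"
  done
done
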
BenTheElder commented 1 year ago

I think we should set a small TTL of no more than 1 day and rotate these to regional.

We could safely delete-and-recreate on every run in boskos-leased projects, but in other fixed projects where CI jobs are still sharing them (5k-node scale testing, maybe?) we can't do that as easily.

There might be an argument for simply updating the scripts that ensure these buckets exist so they create them with the new settings, and then doing a mass deletion of the existing buckets one evening ...
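
If we go the "update the creation scripts" route, the new settings could look roughly like the sketch below. PROJECT and BUCKET are placeholders, us-central1 is only an example region, and the 1-day rule mirrors the TTL suggested above; none of this is taken from the existing scripts:

# create the staging bucket as regional instead of multi-regional
gcloud storage buckets create "gs://${BUCKET}" --project "${PROJECT}" --location=us-central1

# attach a 1-day object TTL so test artifacts are garbage-collected automatically
echo '{"rule":[{"action":{"type":"Delete"},"condition":{"age":1}}]}' > ttl-1d.json
gsutil lifecycle set ttl-1d.json "gs://${BUCKET}"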

ameukam commented 1 year ago

115634 is breaking CI by preventing object creation during job execution.

https://github.com/kubernetes/kubernetes/pull/116222 should help fix this.

At the same time, I ran a quick script to clear the retention policy on all the buckets. This should be enough to fix the issue.

# find every boskos e2e project and clear the retention policy on its us-central1 staging bucket
projects=$(gcloud projects list --filter='projectId~^k8s-infra-e2e-boskos-' --format="value(projectId)" --sort-by=projectId)

for prj in ${projects}; do
    # each boskos project has at most one regional (us-central1) staging bucket
    bucket=$(gcloud storage buckets list --project "${prj}" --filter='location=us-central1' --format="value(name)")
    if [[ -n "${bucket}" ]]; then
        # only touch buckets that exist and are listable
        if gsutil ls "gs://${bucket}" >/dev/null 2>&1; then
            echo "clearing retention policy for ${bucket} in ${prj}"
            gsutil retention clear "gs://${bucket}"
        fi
    fi
done
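
To spot-check a bucket afterwards, the following should report that no retention policy remains (a verification sketch, not part of the script above):

gsutil retention get "gs://${bucket}"   # prints "... has no Retention Policy." once cleared
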
dims commented 1 year ago

thanks @ameukam

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

ameukam commented 1 year ago

/remove-lifecycle stale

Will take a look in 1.29 to remove the multi-regional buckets.

/milestone v1.29
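
A starting point for that sweep could be listing the multi-region buckets first. Sketch only; the locationType filter field is taken from the earlier listing in this issue, and the deletion command is left commented out until the output has been verified:

for prj in $(gcloud projects list --filter='projectId~^k8s-infra-e2e-boskos-' --format='value(projectId)'); do
  for b in $(gcloud storage buckets list --project "${prj}" --filter='locationType=multi-region' --format='value(name)'); do
    echo "would delete gs://${b} in ${prj}"
    # once verified: gcloud storage rm --recursive "gs://${b}"
  done
done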

ameukam commented 8 months ago

/kind
/priority backlog
/area infra/gcp
/lifecycle frozen

k8s-ci-robot commented 8 months ago

@ameukam: The label(s) kind/backlog cannot be applied, because the repository doesn't have them.

In response to [this](https://github.com/kubernetes/k8s.io/issues/4691#issuecomment-1924051887):

> /kind
> /priority backlog
> /area infra/gcp
> /lifecycle frozen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

ameukam commented 7 months ago

/milestone v1.32