kubernetes / test-infra

Test infrastructure for the Kubernetes project.

turn down greenhouse #24247

Closed · BenTheElder closed 1 day ago

BenTheElder commented 2 years ago

Part of https://github.com/kubernetes/enhancements/issues/2420

test-infra has used RBE instead for some time now. Greenhouse was used for Kubernetes builds, but no supported branches use it anymore. Greenhouse is a pretty large deployment for test-infra (a dedicated 32 core VM and a 3TB pd-ssd per build cluster), which is hard to justify now that Kubernetes no longer uses it.

Additionally, it doesn't appear to be properly auto-deployed anymore, and it isn't actively developed or needed by the project at large. We should just turn it down; its time has passed.

Any remaining projects that happen to be using it will automatically fall back to bazel without caching. If they find they still need a cache, they can either spin up a more reasonably sized deployment with SIG k8s-infra (greenhouse is well documented, and we'll leave the sources in place for now) or use some alternate mechanism (e.g. RBE). A rough sketch of the fallback is below.
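For reference, a hedged sketch of what this looks like from a job's perspective; the service URL and cache-key path segment are illustrative (the per-repo key format follows the on-disk layout shown later in this thread), not the exact flags any particular job uses:

```sh
# With greenhouse: point Bazel's HTTP remote cache at the in-cluster
# Service (illustrative URL; <cache-key> is a per-repo cache key).
bazel build //... --remote_cache=http://bazel-cache.default:8080/kubernetes/kubernetes,<cache-key>

# Without it: drop the flag and build uncached.
bazel build //...
```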

/assign

BenTheElder commented 2 years ago

For k8s-prow-builds (the google.com build cluster) I can't find any current mechanism auto-managing anything other than the greenhouse application image, so to test turning down the instance I'm going to manually delete the Service object, leaving the instance soft-turned-down. If all is well we can fully turn it down later.

```
kubectl get -oyaml svc bazel-cache
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"run":"bazel-cache"},"name":"bazel-cache","namespace":"default"},"spec":{"ports":[{"port":8080,"protocol":"TCP"}],"selector":{"app":"greenhouse"}}}
  creationTimestamp: "2018-02-06T01:25:58Z"
  labels:
    run: bazel-cache
  name: bazel-cache
  namespace: default
  resourceVersion: "322348273"
  selfLink: /api/v1/namespaces/default/services/bazel-cache
  uid: af6fe93d-0adc-11e8-accd-42010a80009c
spec:
  clusterIP: 10.63.246.72
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: greenhouse
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
```
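A hedged sketch of the soft turn-down and its rollback, assuming kubectl is pointed at the build cluster and the default namespace:

```sh
# Save the Service manifest first so the change is reversible.
kubectl get svc bazel-cache -o yaml > bazel-cache-svc.yaml

# Soft turn-down: deleting only the Service cuts job traffic to the
# cache without touching the greenhouse deployment or its data.
kubectl delete svc bazel-cache

# Rollback if jobs turn out to still depend on the cache.
kubectl apply -f bazel-cache-svc.yaml
```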

For k8s.io the configs live here: https://github.com/kubernetes/k8s.io/tree/e18c7c6377b78a5b2935bea598e4b75497b2f89c/infra/gcp/terraform/k8s-infra-prow-build/prow-build/resources/default

cc @spiffxp

ameukam commented 2 years ago

/cc

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

BenTheElder commented 2 years ago

/remove-lifecycle rotten
/assign @ameukam

ameukam commented 2 years ago

I still see some cache.

```
/data/kubernetes/kubernetes,b7fa00de3a7c5dec1bb841e81f144ae9 # du -d 1 -h
2.6T    ./cas
555.1M  ./ac
2.6T    .
```

It's possible it's stale cache. I'll manually clear the cache and see what happens.
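A hedged sketch of what a manual clear could look like, assuming the on-disk layout from the `du` output above; greenhouse should simply repopulate on subsequent cache misses:

```sh
# Wipe both cache directories under the per-repo cache key.
# cas/ = content-addressed blobs, ac/ = action cache entries.
cd /data/kubernetes/kubernetes,b7fa00de3a7c5dec1bb841e81f144ae9
rm -rf ./cas/* ./ac/*
```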

BenTheElder commented 2 years ago

Greenhouse is an LRU cache, so it will always be ~full unless it has had zero usage since setup.

With a full cache we could check how recently entries were accessed.
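One hedged way to do that check, assuming the cache volume tracks access times (paths follow the layout above):

```sh
# Count entries not read in the last 30 days vs. the total; a high
# ratio suggests the cache is mostly dead weight.
find /data -type f -atime +30 | wc -l
find /data -type f | wc -l
```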

ameukam commented 2 years ago

> Greenhouse is an LRU cache, so it will always be ~full unless it has had zero usage since setup.
>
> With a full cache we could check how recently entries were accessed.

Good idea!

```
/data/kubernetes/kubernetes,b7fa00de3a7c5dec1bb841e81f144ae9/cas # ls -lhur | tail -20
-rw-------    1 root     root         129 Mar  3 15:08 af72d6d45af49bc12c8f24988cda5bbcab607eb1acd0ee0a16017de8329de1c9
-rw-------    1 root     root      163.6M Mar  3 15:08 91f1cd8a5ec9355c6131100df2613216874926fe016331cc275e63e9d8badc47
-rw-------    1 root     root          41 Mar  3 15:08 6f9f6f97af118c28ca02aee095cfb9f5c0642990b8bc0a5a66207043f80436bb
-rw-------    1 root     root      154.5M Mar  3 15:08 54296156a8a8f5489f82dc67e220bd1414e81aeb24c51b2fbb557d0490568a73
-rw-------    1 root     root          33 Mar  3 15:08 2d5f8cdd41a08fb0227bef2feb8afdbf5c3221d2162740deddac689099411c6a
-rw-------    1 root     root      953.1M Mar  3 15:08 ff8c981b8dea03e4619cee094b5a515c949f166d5583c28dfc450330aaa242d5
-rw-------    1 root     root          33 Mar  3 15:08 ad6e165a5b018bb213d3e40d5059797922ffc119398e6eb0c8412d55dc9c0da9
-rw-------    1 root     root         129 Mar  3 15:08 a5b42259f179a25a0588cf0f157e8b69f911d746d0fbdc1e2b0419a0c5c67533
-rw-------    1 root     root          41 Mar  3 15:08 70238e0aaa38087e5f3ff632e6c3a761acd50227a0a745296d1e3282f0975860
-rw-------    1 root     root      318.9M Mar  3 15:08 59f24f968b5d6e141ff700f97ec99a4958e89a25d3ac96a66fd2d16f8138eab2
-rw-------    1 root     root          41 Mar  3 15:08 c2befdb860da789db1e8c73c0e209a70d3c9e3861e9e037dd11cc5c9dfe7c782
-rw-------    1 root     root          33 Mar  3 15:08 15c45ff20039d7024b9a7bec6f6fa0cf14981fc1643c2ff0e995f7b8271ac4a7
-rw-------    1 root     root         129 Mar  3 15:08 0b22393597be5fed56bdf53584bb177cc7eb91f9ca397ce47727c5d7b287b4b3
-rw-------    1 root     root        1.1G Mar  3 15:08 827675094d35b9b5d06bda7e8e75c1aa3a5f83b84f4b12a5bfdf149d1cba9606
-rw-------    1 root     root      453.9M Mar  3 15:08 b09b0db9eec6067069704f8606ee16488f142c2fa7cb8ceba7b0edd8054dad56
-rw-------    1 root     root           0 Mar  3 15:09 e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
-rw-------    1 root     root          33 Mar  3 15:09 e2dc208d675b1645afef7fcae438debb5d12cefb5152e01efa0c118eb1d201e2
-rw-------    1 root     root         129 Mar  3 15:09 be629e40e1a2fe0c8a22d060cb0177f53523a6958c7c366935a52658b4b2a923
-rw-------    1 root     root          41 Mar  3 15:09 483ab076e9a4a36a4184a1c8a6ec82e0bc9453b9a17e6b4dc96af9589320c9ba
-rw-------    1 root     root       60.1K Mar 23 20:58 7fffc697b4402e1f9c9cbdcc8c0cfae028c0fc2fdf7cc63eedbaaed57b075e6e
```

ameukam commented 2 years ago

We can use GCS as a bazel cache: https://docs.bazel.build/versions/main/remote-caching.html#google-cloud-storage
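For reference, a hedged sketch per those docs; the bucket name is hypothetical:

```sh
# Use a GCS bucket as Bazel's HTTP remote cache, authenticating with
# application-default credentials (bucket name is illustrative).
bazel build //... \
  --remote_cache=https://storage.googleapis.com/my-bazel-cache-bucket \
  --google_default_credentials
```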

BenTheElder commented 2 years ago

> We can use GCS as a bazel cache: https://docs.bazel.build/versions/main/remote-caching.html#google-cloud-storage

We don't want to do that, because the garbage collection story there is an open question; Greenhouse basically exists so we can do LRU eviction within a bounded size.
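For completeness, a hedged sketch of the closest native GCS option: object lifecycle rules expire by age, not by recency of use, and don't bound total bucket size (bucket name is hypothetical):

```sh
# Age-based expiry via an object lifecycle rule: delete objects older
# than 30 days. This is not LRU and does not cap the bucket's size.
cat > lifecycle.json <<'EOF'
{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 30}}]}
EOF
gsutil lifecycle set lifecycle.json gs://my-bazel-cache-bucket
```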

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

BenTheElder commented 2 years ago

I think we're ready to do this?

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

ameukam commented 1 year ago

/remove-lifecycle rotten
/lifecycle frozen

ameukam commented 1 day ago

Completely forgot about this. The infrastructure and code are gone.

/close

k8s-ci-robot commented 1 day ago

@ameukam: Closing this issue.

In response to [this](https://github.com/kubernetes/test-infra/issues/24247#issuecomment-2253387861):

> Completely forgot about this. The infrastructure and code are gone.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.