cockroachdb / cockroach-operator

k8s operator for CRDB
Apache License 2.0
281 stars 94 forks

Parameterize VersionCheck Job resources #968

Open 86label opened 1 year ago

86label commented 1 year ago

The versioncheck job sometimes runs out of memory. Its resource limits are hardcoded and should probably be parameterized, with the current values kept as the defaults.

https://github.com/cockroachdb/cockroach-operator/blob/master/pkg/resource/job.go#L151
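A minimal sketch of what "parameterized with a default" could look like, not the operator's actual code. `JobResources`, `defaultVersionCheckResources`, and `versionCheckResources` are hypothetical names, and the struct is a simplified stand-in for the Kubernetes resource requirements type the real job presumably uses. The default values are an assumption taken from the cpu=300m / memory=256Mi figures reported in this thread; they have not been checked against job.go.

```go
package main

import "fmt"

// JobResources is a hypothetical, simplified stand-in for the CPU and memory
// settings of the version check job.
type JobResources struct {
	CPU    string
	Memory string
}

// Assumed defaults: the values reported in this thread (cpu=300m, memory=256Mi).
// They are not verified against pkg/resource/job.go.
var defaultVersionCheckResources = JobResources{CPU: "300m", Memory: "256Mi"}

// versionCheckResources starts from the defaults and applies any non-empty
// field of the override, so clusters that set nothing keep today's behavior.
func versionCheckResources(override *JobResources) JobResources {
	res := defaultVersionCheckResources
	if override == nil {
		return res
	}
	if override.CPU != "" {
		res.CPU = override.CPU
	}
	if override.Memory != "" {
		res.Memory = override.Memory
	}
	return res
}

func main() {
	// No override: the defaults are used.
	fmt.Println(versionCheckResources(nil))
	// Partial override: only memory is raised, CPU stays at the default.
	fmt.Println(versionCheckResources(&JobResources{Memory: "512Mi"}))
}
```

Merging field by field means a user who only hits the memory limit can raise memory alone without having to restate the CPU value.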


maitredede commented 1 year ago

Hello, I agree. I can't start a cluster because of this OOMKilled error. Here is the describe output for the resource:

Status:
  Cluster Status:  Failed
  Conditions:
    Last Transition Time:  2023-03-29T23:40:15Z
    Status:                True
    Type:                  Initialized
    Last Transition Time:  2023-03-29T23:38:11Z
    Status:                True
    Type:                  CrdbVersionChecked
    Last Transition Time:  2023-03-29T23:38:13Z
    Status:                True
    Type:                  CertificateGenerated
  Crdbcontainerimage:      cockroachdb/cockroach:v22.2.7
  Operator Actions:
    Last Transition Time:  2023-03-29T23:38:11Z
    Message:               failed to check the version of the cluster
    Status:                Failed
    Type:                  VersionCheckerAction
    Last Transition Time:  2023-03-29T23:38:30Z
    Message:               pod is not running
    Status:                Failed
    Type:                  Initialize
  Version:                 v22.2.7

Shm013 commented 1 year ago

I had the same problem. Completely deleting the CrdbCluster resource and reinstalling fixed it for me.

Ravi-Tripathi21 commented 1 year ago

I am also facing the same issue.

Operator version: v2.11.0 [https://raw.githubusercontent.com/cockroachdb/cockroach-operator/v2.11.0/install/operator.yaml]

Cockroachdb version: v23.1.8

The cockroach-vcheck pods get OOMKilled with resources set to cpu=300m and memory=256Mi.

I am willing to implement this and contribute it, so please let me know if I can.