wandb / terraform-aws-wandb

A terraform module for deploying Weights & Biases on AWS.
Apache License 2.0

Wrong Operator CRD spec in Terraform GCP module #270

Closed: vijay-wandb closed this issue 6 days ago

vijay-wandb commented 1 week ago

The CRD configuration passed is invalid, so the W&B controller fails to spin up the W&B application.

Workaround for this situation:

Extract the Helm values

Create a CRD YAML file

Apply it manually
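The workaround steps above could be sketched roughly as follows. This is only an illustration, not the module's own code: the group `apps.wandb.com`, the kind `WeightsAndBiases`, and the name and namespace come from the controller log in this issue, while the API version `v1`, the `spec.values` field, and the helper names `build_wandb_manifest` and `to_json` are assumptions made for this example.

```python
import json


def build_wandb_manifest(values, name="wandb", namespace="default"):
    """Wrap already-extracted Helm values in a WeightsAndBiases custom resource.

    Assumptions (not confirmed by this issue): the API version is "v1" and
    the operator reads the Helm values from spec.values.
    """
    return {
        "apiVersion": "apps.wandb.com/v1",  # group taken from the log; version assumed
        "kind": "WeightsAndBiases",         # kind taken from the log
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"values": values},         # assumed field name
    }


def to_json(manifest):
    """Serialize the manifest as JSON, which kubectl accepts as well as YAML."""
    return json.dumps(manifest, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical values; in practice these would be the extracted Helm values.
    example_values = {"global": {"host": "https://wandb.example.com"}}
    print(to_json(build_wandb_manifest(example_values)))
```

The printed manifest can be saved to a file, reviewed, and then applied manually with `kubectl apply -f <file>`.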

{"level":"error","ts":"2024-07-10T09:35:42Z","msg":"Failed to apply config changes.","controller":"weightsandbiases","controllerGroup":"apps.wandb.com","controllerKind":"WeightsAndBiases","WeightsAndBiases":{"name":"wandb","namespace":"default"},"namespace":"default","name":"wandb","reconcileID":"32c6383c-9980-4da3-ad0b-e7793b7a4837","error":"cannot patch \"wandb-app\" with kind Deployment: Deployment.apps \"wandb-app\" is invalid: spec.template.spec.containers[0].env[16].valueFrom: Invalid value: \"\": may not be specified when `value` is not empty","errorVerbose":"cannot patch \"wandb-app\" with kind Deployment: Deployment.apps \"wandb-app\" is invalid: spec.template.spec.containers[0].env[16].valueFrom: Invalid value: \"\": may not be specified when `value` is not empty\nhelm.sh/helm/v3/pkg/kube.(*Client).Update\n\t/go/pkg/mod/helm.sh/helm/v3@v3.12.0-dev.1/pkg/kube/client.go:441\nhelm.sh/helm/v3/pkg/action.(*Upgrade).releasingUpgrade\n\t/go/pkg/mod/helm.sh/helm/v3@v3.12.0-dev.1/pkg/action/upgrade.go:378\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598","stacktrace":"github.com/wandb/operator/controllers.(*WeightsAndBiasesReconciler).Reconcile\n\t/workspace/controllers/weightsandbiases_controller.go:189\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:235"}
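The error in this log says that the environment variable at index 16 of the first container in the `wandb-app` Deployment sets `valueFrom` while `value` is also non-empty, which Kubernetes rejects. A minimal sketch of how a rendered Deployment (loaded as a plain dictionary) could be scanned for such entries is shown below. The function name `find_conflicting_env` and the sample data are hypothetical and not part of the operator or the Terraform module.

```python
def find_conflicting_env(deployment):
    """Return (container_index, env_index, env_name) for every env entry
    that sets both a non-empty `value` and a `valueFrom`."""
    conflicts = []
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    for c_idx, container in enumerate(pod_spec.get("containers", [])):
        for e_idx, env in enumerate(container.get("env", [])):
            has_value = bool(env.get("value"))            # non-empty literal value
            has_value_from = env.get("valueFrom") is not None
            if has_value and has_value_from:
                conflicts.append((c_idx, e_idx, env.get("name")))
    return conflicts


if __name__ == "__main__":
    # Hypothetical Deployment fragment reproducing the kind of conflict in the log.
    sample = {
        "spec": {"template": {"spec": {"containers": [{
            "name": "app",
            "env": [
                {"name": "OK_LITERAL", "value": "x"},
                {"name": "OK_SECRET", "valueFrom": {
                    "secretKeyRef": {"name": "s", "key": "k"}}},
                {"name": "BROKEN", "value": "x", "valueFrom": {
                    "secretKeyRef": {"name": "s", "key": "k"}}},
            ],
        }]}}}
    }
    for c_idx, e_idx, name in find_conflicting_env(sample):
        print(f"containers[{c_idx}].env[{e_idx}] ({name}) sets both value and valueFrom")
```

Running a check like this against the manifest rendered from the extracted Helm values would point to the entry that needs either its `value` or its `valueFrom` removed before the resource is applied.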

vijay-wandb commented 1 week ago

From Daniel Panzella

Can we get more information on the configuration that was used for Terraform? This clearly doesn’t always produce incorrect output, and the output has changed since the ticket was created, so knowing exactly how the Terraform module was instantiated would help with debugging.

abhinavg6 commented 6 days ago

Not an issue anymore, as per Flam.