SigNoz / charts

Helm Charts for SigNoz

Can't install cold storage on 0.32.2 #370

Open luna215 opened 9 months ago

luna215 commented 9 months ago

When I run this command to install SigNoz 0.32.2 via Helm:

helm --namespace platform install signoz-new signoz/signoz -f install.yaml --debug

with the following YAML file to override some values:

global:
  storageClass: gp2-resizable
  cloud: aws

clickhouse:
  installCustomStorageClass: true
  coldStorage:
    enabled: true
    defaultKeepFreeSpaceBytes: "10485760"
    endpoint: https://*******.s3.us-west-1.amazonaws.com/data/
    accessKey: ********
    secretAccess: **********
    role:
      enabled: true
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::***:role/***

I get the following error:

install.go:214: [debug] Original chart version: ""
install.go:231: [debug] CHART PATH: /Users/paulluna/Library/Caches/helm/repository/signoz-0.32.2.tgz

coalesce.go:237: warning: skipped value for zookeeper.initContainers: Not a table.
client.go:142: [debug] creating 1 resource(s)
install.go:168: [debug] CRD clickhouseinstallations.clickhouse.altinity.com is already present. Skipping.
client.go:142: [debug] creating 1 resource(s)
install.go:168: [debug] CRD clickhouseinstallationtemplates.clickhouse.altinity.com is already present. Skipping.
client.go:142: [debug] creating 1 resource(s)
install.go:168: [debug] CRD clickhouseoperatorconfigurations.clickhouse.altinity.com is already present. Skipping.
client.go:393: [debug] checking 55 resources for changes
client.go:414: [debug] Created a new ServiceAccount called "signoz-new-clickhouse" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-clickhouse-operator" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-k8s-infra-otel-agent" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-k8s-infra-otel-deployment" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-alertmanager" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-frontend" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-otel-collector-metrics" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-otel-collector" in platform

client.go:414: [debug] Created a new ServiceAccount called "signoz-new-query-service" in platform

client.go:414: [debug] Created a new Secret called "signoz-new-clickhouse-operator" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-zookeeper-scripts" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-clickhouse-custom-functions" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-clickhouse-operator-etc-confd-files" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-clickhouse-operator-etc-configd-files" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-clickhouse-operator-etc-files" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-clickhouse-operator-etc-templatesd-files" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-clickhouse-operator-etc-usersd-files" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-k8s-infra-otel-agent" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-k8s-infra-otel-deployment" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-frontend" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-otel-collector-metrics" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-otel-collector" in platform

client.go:414: [debug] Created a new ConfigMap called "signoz-new-query-service" in platform

client.go:684: [debug] Looks like there are no changes for StorageClass "gp2-resizable"
client.go:684: [debug] Looks like there are no changes for ClusterRole "signoz-new-k8s-infra-otel-agent-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRole "signoz-new-k8s-infra-otel-deployment-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRole "signoz-new-otel-collector-metrics-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRole "signoz-new-otel-collector-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRoleBinding "signoz-new-k8s-infra-otel-agent-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRoleBinding "signoz-new-k8s-infra-otel-deployment-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRoleBinding "signoz-new-otel-collector-metrics-platform"
client.go:684: [debug] Looks like there are no changes for ClusterRoleBinding "signoz-new-otel-collector-platform"
client.go:414: [debug] Created a new Role called "signoz-new-clickhouse-operator" in platform

client.go:414: [debug] Created a new RoleBinding called "signoz-new-clickhouse-operator" in platform

client.go:414: [debug] Created a new Service called "signoz-new-zookeeper-headless" in platform

client.go:414: [debug] Created a new Service called "signoz-new-zookeeper" in platform

client.go:414: [debug] Created a new Service called "signoz-new-clickhouse-operator-metrics" in platform

client.go:414: [debug] Created a new Service called "signoz-new-k8s-infra-otel-agent" in platform

client.go:414: [debug] Created a new Service called "signoz-new-k8s-infra-otel-deployment" in platform

client.go:414: [debug] Created a new Service called "signoz-new-alertmanager" in platform

client.go:414: [debug] Created a new Service called "signoz-new-alertmanager-headless" in platform

client.go:414: [debug] Created a new Service called "signoz-new-frontend" in platform

client.go:414: [debug] Created a new Service called "signoz-new-otel-collector-metrics" in platform

client.go:414: [debug] Created a new Service called "signoz-new-otel-collector" in platform

client.go:414: [debug] Created a new Service called "signoz-new-query-service" in platform

client.go:414: [debug] Created a new DaemonSet called "signoz-new-k8s-infra-otel-agent" in platform

client.go:414: [debug] Created a new Deployment called "signoz-new-clickhouse-operator" in platform

client.go:414: [debug] Created a new Deployment called "signoz-new-k8s-infra-otel-deployment" in platform

client.go:414: [debug] Created a new Deployment called "signoz-new-frontend" in platform

client.go:414: [debug] Created a new Deployment called "signoz-new-otel-collector-metrics" in platform

client.go:414: [debug] Created a new Deployment called "signoz-new-otel-collector" in platform

client.go:414: [debug] Created a new StatefulSet called "signoz-new-zookeeper" in platform

client.go:414: [debug] Created a new StatefulSet called "signoz-new-alertmanager" in platform

client.go:414: [debug] Created a new StatefulSet called "signoz-new-query-service" in platform

client.go:414: [debug] Created a new ClickHouseInstallation called "signoz-new-clickhouse" in platform

client.go:486: [debug] Starting delete for "signoz-new-schema-migrator" Job
client.go:490: [debug] Ignoring delete failure for "signoz-new-schema-migrator" batch/v1, Kind=Job: jobs.batch "signoz-new-schema-migrator" not found
client.go:142: [debug] creating 1 resource(s)
client.go:712: [debug] Watching for changes to Job signoz-new-schema-migrator with timeout of 5m0s
client.go:740: [debug] Add/Modify event for signoz-new-schema-migrator: ADDED
client.go:779: [debug] signoz-new-schema-migrator: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed post-install: 1 error occurred:
        * timed out waiting for the condition

helm.go:84: [debug] failed post-install: 1 error occurred:
        * timed out waiting for the condition

INSTALLATION FAILED
main.newInstallCmd.func2
        helm.sh/helm/v3/cmd/helm/install.go:154
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra@v1.7.0/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra@v1.7.0/command.go:1068
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra@v1.7.0/command.go:992
main.main
        helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
        runtime/proc.go:267
runtime.goexit
        runtime/asm_arm64.s:1197

Any ideas of what can be causing the installation to error out?

srikanthccv commented 9 months ago

@luna215 Does it consistently fail?

luna215 commented 9 months ago

@srikanthccv Yes, it happens consistently. I can actually install SigNoz 0.32.2 if I only have these settings in my YAML file:

global:
  storageClass: gp2-resizable
  cloud: aws

clickhouse:
  installCustomStorageClass: true

However, when I try to upgrade to add coldStorage with a separate yaml file that contains these configurations:

# cold-storage.yaml
clickhouse:
  installCustomStorageClass: true
  coldStorage:
    enabled: true
    defaultKeepFreeSpaceBytes: "10485760"
    endpoint: https://*******.s3.us-west-1.amazonaws.com/data/
    accessKey: ********
    secretAccess: **********
    role:
      enabled: true
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::***:role/***

using this command:

helm --namespace platform upgrade signoz signoz/signoz -f cold-storage.yaml

it gives me the same error as above and breaks my SigNoz instance: I can no longer access it and just get a blank page with the text "Something went wrong".

prashant-shahi commented 9 months ago

In most cases, a migrator job failure means that either ClickHouse or ZooKeeper was not healthy.

It would be best to watch the pods in another terminal with kubectl get pods -n platform, and please share the logs of the failed migrator pod.
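
A minimal sketch of collecting that information, assuming the release name signoz-new and the Job name signoz-new-schema-migrator shown in the debug output above. The debug_migrator function name is only illustrative, not part of any tool:

```shell
# Assumptions: namespace and release name are taken from the install command in this thread.
NAMESPACE=platform
RELEASE=signoz-new
MIGRATOR_JOB="${RELEASE}-schema-migrator"

debug_migrator() {
  # Current state of every pod in the namespace (ClickHouse, ZooKeeper, migrator, ...).
  kubectl get pods -n "$NAMESPACE"
  # Events and status of the migrator Job that the post-install hook waits on.
  kubectl describe job -n "$NAMESPACE" "$MIGRATOR_JOB"
  # Logs from a pod created by the Job, including all of its containers.
  kubectl logs -n "$NAMESPACE" "job/${MIGRATOR_JOB}" --all-containers=true
}
```

Running debug_migrator in a second terminal while helm is still waiting on the post-install hook should show whether the ClickHouse and ZooKeeper pods ever became ready.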

> However, when I try to upgrade to add coldStorage with a separate yaml file that contains these configurations:

Also, when you add new override values, you need to merge the YAML with the previously applied one. Was that done?

global:
  storageClass: gp2-resizable
  cloud: aws

clickhouse:
  installCustomStorageClass: true
  coldStorage:
    enabled: true
    defaultKeepFreeSpaceBytes: "10485760"
    endpoint: https://*******.s3.us-west-1.amazonaws.com/data/
    accessKey: ********
    secretAccess: **********
    role:
      enabled: true
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::***:role/***
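
One way to get the same merged result without hand-editing is to pass both override files to Helm: when several -f files are given, the rightmost file takes precedence for any keys they share. A sketch, assuming the file names install.yaml and cold-storage.yaml and the release name signoz used earlier in this thread (the function name is hypothetical):

```shell
# Assumptions: release "signoz" in namespace "platform", as in the upgrade command above.
NAMESPACE=platform
RELEASE=signoz

upgrade_with_merged_values() {
  # Show the user-supplied values currently stored for the release.
  helm get values "$RELEASE" --namespace "$NAMESPACE"
  # Listing both files keeps global.storageClass and global.cloud from
  # install.yaml and adds clickhouse.coldStorage from cold-storage.yaml.
  helm upgrade "$RELEASE" signoz/signoz --namespace "$NAMESPACE" \
    -f install.yaml -f cold-storage.yaml
}
```

Checking the output of helm get values after the upgrade is a quick way to confirm that the global section was not dropped.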

And lastly, I see you have both the access/secret key and the role set up. You only need one of them.
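
As a rough illustration of the two alternatives, the snippet below writes one override file per credential method, using only the keys that already appear in this thread. The file names are made up for the example, the masked values are kept as placeholders, and it is an assumption that the chart ignores accessKey/secretAccess once role.enabled is true:

```shell
# Option A: IAM role attached through the service account annotation, no static keys.
cat > cold-storage-role.yaml <<'EOF'
clickhouse:
  coldStorage:
    enabled: true
    defaultKeepFreeSpaceBytes: "10485760"
    endpoint: https://*******.s3.us-west-1.amazonaws.com/data/
    role:
      enabled: true
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::***:role/***
EOF

# Option B: static access keys, with the role left disabled.
cat > cold-storage-keys.yaml <<'EOF'
clickhouse:
  coldStorage:
    enabled: true
    defaultKeepFreeSpaceBytes: "10485760"
    endpoint: https://*******.s3.us-west-1.amazonaws.com/data/
    accessKey: "********"
    secretAccess: "**********"
    role:
      enabled: false
EOF
```

Either file would still need to be combined with the global and installCustomStorageClass settings from the original override before upgrading.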