PostHog / posthog

🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
https://posthog.com

Second, fresh install of PostHog via Helm #12328

Open · voarsh2 opened this issue 1 year ago

voarsh2 commented 1 year ago

Bug description

The first time I installed PostHog via the Kubernetes Helm chart I could change the RECORDINGS_TTL_WEEKS value, and it worked. I ended up reinstalling, exactly the same way, and for some reason I can no longer change this value.

How to reproduce

  1. Open Instance Settings
  2. Change the RECORDINGS_TTL_WEEKS value
  3. Get a 500 error in the console and an error in the browser's network request tab (see the request sketch below)
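
For reproducing the failure outside the UI, here is a minimal sketch of the request the Instance Settings page issues. The endpoint path comes from the request logs below; the host, the API key, and the {"value": ...} payload shape are assumptions for illustration, not confirmed details of this setup.

import requests

# Placeholders - substitute your instance URL and a staff user's personal API key.
POSTHOG_HOST = "https://posthog.example.com"
PERSONAL_API_KEY = "phx_XXXXXXXX"

# PATCH /api/instance_settings/RECORDINGS_TTL_WEEKS/ is the call that the
# logs below show returning a 500 once ClickHouse rejects the TTL ALTER.
resp = requests.patch(
    f"{POSTHOG_HOST}/api/instance_settings/RECORDINGS_TTL_WEEKS/",
    headers={"Authorization": f"Bearer {PERSONAL_API_KEY}"},
    json={"value": 4},  # assumption: weeks of session recordings to retain
)
print(resp.status_code, resp.text)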

Environment

Additional context

Version: 1.37.0

Error from posthog_events deployment:

2022-10-18T22:05:58.909644Z [info ] request_started [django_structlog.middlewares.request] ip=142.93.40.177 request=<WSGIRequest: PATCH '/api/instance_settings/RECORDINGS_TTL_WEEKS/'> request_id=aa4ccc9f-0430-41e1-baab-7c93d627fe4c user_agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36
[posthog.exceptions] ip=142.93.40.177 path=/api/instance_settings/RECORDINGS_TTL_WEEKS/ request_id=aa4ccc9f-0430-41e1-baab-7c93d627fe4c
2022-10-18T22:06:07.221997Z [info ] request_finished [django_structlog.middlewares.request] code=500 ip=142.93.40.177 request=<WSGIRequest: PATCH '/api/instance_settings/RECORDINGS_TTL_WEEKS/'> request_id=aa4ccc9f-0430-41e1-baab-7c93d627fe4c user_id=1
Internal Server Error: /api/instance_settings/RECORDINGS_TTL_WEEKS/

chi-posthog statefulset output:

2022.10.18 22:41:19.210416 [ 465 ] {a0ac4432-2c30-40d5-88d6-a6a3ff26e188} <Error> executeQuery: Code: 517. DB::Exception: Metadata on replica is not up to date with common metadata in Zookeeper. It means that this replica still not applied some of previous alters. Probably too many alters executing concurrently (highly not recommended). You can retry this error. (CANNOT_ASSIGN_ALTER) (version 22.3.6.5 (official build)) (from 10.42.235.193:47018) (in query: /* request:api_instance_settings_(?P<key>[^_.]+)_?$ (InstanceSettingsViewset) */ ALTER TABLE sharded_session_recording_events MODIFY TTL toDate(created_at) + toIntervalWeek(4)), Stack trace (when copying this message, always include the lines below):

Stack trace (continued):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xb37173a in /usr/bin/clickhouse
1. DB::StorageReplicatedMergeTree::alter(DB::AlterCommands const&, std::__1::shared_ptr<DB::Context const>, std::__1::unique_lock<std::__1::timed_mutex>&) @ 0x160650f1 in /usr/bin/clickhouse
2. DB::InterpreterAlterQuery::executeToTable(DB::ASTAlterQuery const&) @ 0x159c5177 in /usr/bin/clickhouse
3. DB::InterpreterAlterQuery::execute() @ 0x159c2e12 in /usr/bin/clickhouse
4. ? @ 0x15d27b4f in /usr/bin/clickhouse
5. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum) @ 0x15d255f5 in /usr/bin/clickhouse
6. DB::TCPHandler::runImpl() @ 0x168b0a3a in /usr/bin/clickhouse
7. DB::TCPHandler::run() @ 0x168c0399 in /usr/bin/clickhouse
8. Poco::Net::TCPServerConnection::start() @ 0x19b801ef in /usr/bin/clickhouse
9. Poco::Net::TCPServerDispatcher::run() @ 0x19b82641 in /usr/bin/clickhouse
10. Poco::PooledThread::run() @ 0x19d3f609 in /usr/bin/clickhouse
11. Poco::ThreadImpl::runnableEntry(void*) @ 0x19d3c960 in /usr/bin/clickhouse
12. ? @ 0x7fac3c01d609 in ?
13. __clone @ 0x7fac3bf42163 in ?
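
The CANNOT_ASSIGN_ALTER message above means earlier metadata ALTERs are still queued for this replica. Here is a diagnostic sketch for inspecting that queue, assuming you can reach the chi-posthog ClickHouse pod (for example via kubectl port-forward) with the admin credentials from values.yaml; the host, user, and password are placeholders:

from clickhouse_driver import Client  # pip install clickhouse-driver

# Placeholder connection details - adjust for your cluster.
client = Client(host="localhost", user="admin", password="<clickhouse-password>")

# Pending entries in the replication queue are what make further ALTERs fail
# with CANNOT_ASSIGN_ALTER; last_exception shows why an entry keeps stalling.
for row in client.execute(
    "SELECT type, create_time, num_tries, last_exception "
    "FROM system.replication_queue "
    "WHERE table = 'sharded_session_recording_events'"
):
    print(row)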
voarsh2 commented 1 year ago

PostHog keeps outputting this:

2022.10.19 16:58:23.704260 [ 370 ] {b1fb02c6-0067-4d89-a0ae-f0b9bf9aeb35} <Error> executeQuery: Code: 242. DB::Exception: Table is in readonly mode (zookeeper path: /clickhouse/tables/0/posthog.session_recording_events). (TABLE_IS_READ_ONLY) (version 22.3.6.5 (official build)) (from 10.42.189.111:47050) (in query: /* request:api_instance_settings_(?P<key>[^_.]+)_?$ (InstanceSettingsViewset) */ ALTER TABLE sharded_session_recording_events MODIFY TTL toDate(created_at) + toIntervalWeek(4)), Stack trace (when copying this message, always include the lines below):
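
A sketch for confirming the readonly state from the ClickHouse side, under the same connection assumptions as the snippet above. A replicated table drops to readonly when its ZooKeeper session is lost, so the zookeeper_exception column usually points at the root cause; the SYSTEM RESTART REPLICA line is the usual recovery step once ZooKeeper is healthy again, not a confirmed fix for this issue:

from clickhouse_driver import Client  # pip install clickhouse-driver

# Placeholder connection details - adjust for your cluster.
client = Client(host="localhost", user="admin", password="<clickhouse-password>")

# List replicas currently stuck in readonly mode and the ZooKeeper error
# that put them there.
for row in client.execute(
    "SELECT database, table, is_readonly, zookeeper_exception "
    "FROM system.replicas "
    "WHERE is_readonly"
):
    print(row)

# Once ZooKeeper is reachable again, re-initializing the replica's session
# usually clears the readonly flag (verify the root cause first):
# client.execute("SYSTEM RESTART REPLICA posthog.session_recording_events")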

voarsh2 commented 1 year ago

Probably linked to #12312

guidoiaquinti commented 1 year ago

👋 Hi! Can you please share which version of the Helm chart you are using and your values.yaml, redacting any private information it might contain? Do you have a repro of the issue? Thank you! 🙇

voarsh2 commented 1 year ago

> 👋 Hi! Can you please share which version of the Helm chart you are using and your values.yaml, redacting any private information it might contain? Do you have a repro of the issue? Thank you! 🙇

I've included more information about the issue in #12312. It happened shortly after installing the Helm chart; I have reinstalled twice, and both times the issue appeared about a day after installing self-hosted PostHog.

Helm chart: https://posthog.github.io/charts-clickhouse/
Version: 23.5.1

Values:

busybox:
  image: busybox:1.34
cert-manager:
  email: null
  enabled: false
  installCRDs: true
  podDnsConfig:
    nameservers:
      - 8.8.8.8
      - 1.1.1.1
      - 208.67.222.222
  podDnsPolicy: None
clickhouse:
  affinity: {}
  allowedNetworkIps:
    - 10.0.0.0/8
    - 172.16.0.0/12
    - 192.168.0.0/16
  cluster: posthog
  database: posthog
  defaultProfiles:
    default/allow_experimental_window_functions: '1'
    default/allow_nondeterministic_mutations: '1'
  defaultSettings:
    format_schema_path: /etc/clickhouse-server/config.d/
  enabled: true
  image:
    repository: clickhouse/clickhouse-server
    tag: 22.3.6.5
  layout:
    replicasCount: 1
    shardsCount: 1
  namespace: null
  password: a1f31e03-c88e-4ca6-a2df-ad49183d15d9
  persistence:
    enabled: true
    existingClaim: ''
    size: 20Gi
    storageClass: null
  podAnnotations: null
  profiles: {}
  resources: {}
  secure: false
  securityContext:
    enabled: true
    fsGroup: 101
    runAsGroup: 101
    runAsUser: 101
  serviceType: ClusterIP
  settings: {}
  tolerations: []
  user: admin
  verify: false
cloud: private
cloudwatch:
  clusterName: null
  enabled: false
  fluentBit:
    port: 2020
    readHead: 'On'
    readTail: 'Off'
    server: 'On'
  region: null
email:
  existingSecret: ''
  existingSecretKey: ''
  from_email: null
  host: null
  password: null
  port: null
  use_ssl: null
  use_tls: true
  user: null
env: []
events:
  enabled: true
  hpa:
    behavior: null
    cputhreshold: 60
    enabled: false
    maxpods: 10
    minpods: 1
  podSecurityContext:
    enabled: false
  replicacount: 1
  securityContext:
    enabled: false
externalClickhouse:
  cluster: null
  database: posthog
  existingSecret: null
  existingSecretPasswordKey: null
  host: null
  password: null
  secure: false
  user: null
  verify: false
externalKafka:
  brokers: []
externalObjectStorage:
  bucket: null
  endpoint: null
  existingSecret: null
  host: null
  port: null
externalPostgresql:
  existingSecret: null
  existingSecretPasswordKey: postgresql-password
  postgresqlDatabase: null
  postgresqlHost: null
  postgresqlPassword: null
  postgresqlPort: 5432
  postgresqlUsername: null
externalRedis:
  existingSecret: ''
  existingSecretPasswordKey: ''
  host: ''
  password: ''
  port: 6379
externalStatsd:
  host: null
  port: null
grafana:
  datasources:
    datasources.yaml:
      apiVersion: 1
      datasources:
        - access: proxy
          isDefault: true
          name: Prometheus
          type: prometheus
          url: http://posthog-prometheus-server
        - access: proxy
          isDefault: false
          name: Loki
          type: loki
          url: http://posthog-loki:3100
  enabled: false
  sidecar:
    dashboards:
      enabled: true
      folderAnnotation: grafana_folder
      label: grafana_dashboard
      provider:
        foldersFromFilesStructure: true
hooks:
  affinity: {}
  migrate:
    env: []
    resources: {}
image:
  default: ':release-1.37.0'
  pullPolicy: IfNotPresent
  repository: posthog/posthog
  sha: null
  tag: null
ingress:
  annotations: {}
  enabled: false
  gcp:
    forceHttps: true
    ip_name: posthog
    secretName: ''
  hostname: null
  letsencrypt: null
  nginx:
    enabled: false
    redirectToTLS: true
  secretName: null
  type: null
ingress-nginx:
  controller:
    config:
      use-forwarded-headers: 'true'
installCustomStorageClass: false
kafka:
  enabled: true
  externalZookeeper:
    servers:
      - posthog-posthog-zookeeper:2181
  fullnameOverride: ''
  logRetentionBytes: _15_000_000_000
  logRetentionHours: 24
  nameOverride: posthog-kafka
  numPartitions: 1
  persistence:
    enabled: true
    size: 20Gi
  zookeeper:
    enabled: false
  livenessProbe:
    failureThreshold: 60
    initialDelaySeconds: 30
    periodSeconds: 30
    successThreshold: 1
    timeoutSeconds: 2
  readinessProbe:
    failureThreshold: 60
    initialDelaySeconds: 50
    periodSeconds: 30
    successThreshold: 1
    timeoutSeconds: 5
loki:
  enabled: false
migrate:
  enabled: true
minio:
  auth:
    existingSecret: null
    rootPassword: root-password-change-me-please
    rootUser: root-user
  defaultBuckets: posthog
  disableWebUI: true
  enabled: false
  persistence:
    enabled: true
  podAnnotations: null
  service:
    ports:
      api: '19000'
      console: '19001'
notificationEmail: null
pgbouncer:
  enabled: true
  env: []
  extraVolumeMounts: []
  extraVolumes: []
  hpa:
    cputhreshold: 60
    enabled: false
    maxpods: 10
    minpods: 1
  podSecurityContext:
    enabled: false
  replicacount: 1
  securityContext:
    enabled: false
plugins:
  affinity: {}
  enabled: true
  env: []
  hpa:
    behavior: null
    cputhreshold: 60
    enabled: false
    maxpods: 10
    minpods: 1
  livenessProbe:
    failureThreshold: 3
    initialDelaySeconds: 10
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 2
  nodeSelector: {}
  podSecurityContext:
    enabled: false
  readinessProbe:
    failureThreshold: 3
    initialDelaySeconds: 50
    periodSeconds: 30
    successThreshold: 1
    timeoutSeconds: 5
  replicacount: 1
  resources: {}
  securityContext:
    enabled: false
  tolerations: []
pluginsAsync:
  affinity: {}
  enabled: false
  env: []
  hpa:
    behavior: null
    cputhreshold: 60
    enabled: false
    maxpods: 10
    minpods: 1
  livenessProbe:
    failureThreshold: 3
    initialDelaySeconds: 10
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 2
  nodeSelector: {}
  podSecurityContext:
    enabled: false
  readinessProbe:
    failureThreshold: 3
    initialDelaySeconds: 50
    periodSeconds: 30
    successThreshold: 1
    timeoutSeconds: 5
  replicacount: 1
  resources: {}
  securityContext:
    enabled: false
  tolerations: []
postgresql:
  enabled: true
  nameOverride: posthog-postgresql
  persistence:
    enabled: true
    size: 10Gi
  postgresqlDatabase: posthog
  postgresqlPassword: postgres
posthogSecretKey:
  existingSecret: null
  existingSecretKey: posthog-secret
prometheus:
  alertmanager:
    enabled: true
    resources:
      limits:
        cpu: 100m
      requests:
        cpu: 50m
  alertmanagerFiles:
    alertmanager.yml:
      global: {}
      receivers:
        - name: default-receiver
      route:
        group_by:
          - alertname
        receiver: default-receiver
  enabled: false
  kubeStateMetrics:
    enabled: true
  nodeExporter:
    enabled: true
    resources:
      limits:
        cpu: 100m
        memory: 50Mi
      requests:
        cpu: 50m
        memory: 30Mi
  pushgateway:
    enabled: false
  serverFiles:
    alerting_rules.yml:
      groups:
        - name: PostHog alerts
          rules:
            - alert: PodDown
              annotations:
                description: >-
                  Pod {{ $labels.kubernetes_pod_name }} in namespace {{
                  $labels.kubernetes_namespace }} down for more than 5 minutes.
                summary: Pod {{ $labels.kubernetes_pod_name }} down.
              expr: up{job="kubernetes-pods"} == 0
              for: 1m
              labels:
                severity: alert
            - alert: PodFrequentlyRestarting
              annotations:
                description: >-
                  Pod {{$labels.namespace}}/{{$labels.pod}} was restarted
                  {{$value}} times within the last hour
                summary: Pod is restarting frequently
              expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
              for: 10m
              labels:
                severity: warning
            - alert: VolumeRemainingCapacityLowTest
              annotations:
                description: >-
                  Persistent volume claim {{ $labels.persistentvolumeclaim }}
                  disk usage is above 85% for past 5 minutes
                summary: >-
                  Kubernetes {{ $labels.persistentvolumeclaim }} is full (host
                  {{ $labels.kubernetes_io_hostname }})
              expr: >-
                kubelet_volume_stats_used_bytes/kubelet_volume_stats_capacity_bytes
                >= 0.85
              for: 5m
              labels:
                severity: page
prometheus-kafka-exporter:
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/port: '9308'
    prometheus.io/scrape: 'true'
  enabled: false
  image:
    tag: v1.4.2
  kafkaServer:
    - posthog-posthog-kafka:9092
prometheus-postgres-exporter:
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/port: '9187'
    prometheus.io/scrape: 'true'
  config:
    datasource:
      host: posthog-posthog-postgresql
      passwordSecret:
        key: postgresql-password
        name: posthog-posthog-postgresql
      user: postgres
  enabled: false
prometheus-redis-exporter:
  annotations:
    prometheus.io/path: /metrics
    prometheus.io/port: '9121'
    prometheus.io/scrape: 'true'
  enabled: false
  redisAddress: redis://posthog-posthog-redis-master:6379
prometheus-statsd-exporter:
  enabled: false
  podAnnotations:
    prometheus.io/path: /metrics
    prometheus.io/port: '9102'
    prometheus.io/scrape: 'true'
promtail:
  config:
    lokiAddress: http://posthog-loki:3100/loki/api/v1/push
  enabled: false
redis:
  architecture: standalone
  auth:
    enabled: false
    existingSecret: ''
    existingSecretPasswordKey: ''
    password: ''
  enabled: true
  fullnameOverride: ''
  master:
    extraFlags:
      - '--maxmemory 400mb'
      - '--maxmemory-policy allkeys-lru'
    persistence:
      enabled: true
      size: 5Gi
  nameOverride: posthog-redis
saml:
  acs_url: null
  attr_email: null
  attr_first_name: null
  attr_last_name: null
  attr_permanent_id: null
  disabled: false
  enforced: false
  entity_id: null
  x509_cert: null
sentryDSN: null
service:
  annotations: {}
  externalPort: 8000
  internalPort: 8000
  name: posthog
  type: NodePort
serviceAccount:
  annotations: {}
  create: true
  name: null
web:
  affinity: {}
  enabled: true
  env:
    - name: SOCIAL_AUTH_GOOGLE_OAUTH2_KEY
      value: null
    - name: SOCIAL_AUTH_GOOGLE_OAUTH2_SECRET
      value: null
    - name: SOCIAL_AUTH_GOOGLE_OAUTH2_WHITELISTED_DOMAINS
      value: posthog.com
  hpa:
    behavior: null
    cputhreshold: 60
    enabled: false
    maxpods: 10
    minpods: 1
  internalMetrics:
    capture: true
  livenessProbe:
    failureThreshold: 5
    initialDelaySeconds: 50
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 2
  nodeSelector: {}
  podSecurityContext:
    enabled: false
  readinessProbe:
    failureThreshold: 10
    initialDelaySeconds: 50
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 2
  replicacount: 1
  resources: {}
  secureCookies: true
  securityContext:
    enabled: false
  tolerations: []
worker:
  affinity: {}
  enabled: true
  env: []
  hpa:
    behavior: null
    cputhreshold: 60
    enabled: false
    maxpods: 10
    minpods: 1
  nodeSelector: {}
  podSecurityContext:
    enabled: false
  replicacount: 1
  resources: {}
  securityContext:
    enabled: false
  tolerations: []
zookeeper:
  autopurge:
    purgeInterval: 1
  enabled: true
  metrics:
    enabled: false
    service:
      annotations:
        prometheus.io/scrape: 'false'
  nameOverride: posthog-zookeeper
  podAnnotations: null
  replicaCount: 1