acryldata / datahub-helm

Repository of helm charts for deploying DataHub on a Kubernetes cluster
Apache License 2.0

EBEAN env vars duplicated on datahub-restore-indices-job-template #261

Closed matthijsvanderloos closed 1 year ago

matthijsvanderloos commented 1 year ago

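Describe the bug: The following EBEAN_* environment variables are duplicated on the datahub-restore-indices-job-template CronJob: EBEAN_DATASOURCE_USERNAME, EBEAN_DATASOURCE_PASSWORD, EBEAN_DATASOURCE_HOST, EBEAN_DATASOURCE_URL, and EBEAN_DATASOURCE_DRIVER.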

To Reproduce: Run helm template on the DataHub chart with default values. The rendered CronJob is shown below (note the duplicated env vars):

---
# Source: datahub/templates/datahub-upgrade/datahub-restore-indices-job-template.yml
# Job template for restoring indices by sending MAE corresponding to all entities in the local db
# Creates a suspended cronJob that you can use to create an adhoc job when ready to run clean up.
# Run the following command to do so
# kubectl create job --from=cronjob/<<release-name>>-datahub-restore-indices-job-template datahub-restore-indices-job
apiVersion: batch/v1
kind: CronJob
metadata:
  name: release-name-datahub-restore-indices-job-template
  labels:
    app.kubernetes.io/managed-by: "Helm"
    app.kubernetes.io/instance: "release-name"
    app.kubernetes.io/version: 0.10.0
    helm.sh/chart: "datahub-0.2.150"
spec:
  schedule: "* * * * *"
  suspend: true
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
          restartPolicy: Never
          securityContext:
            {}
          initContainers:
          containers:
            - name: datahub-upgrade-job
              image: "acryldata/datahub-upgrade:v0.10.0"
              imagePullPolicy: IfNotPresent
              args:
                - "-u"
                - "RestoreIndices"
                - "-a"
                - "batchSize=1000"
                - "-a"
                - "batchDelayMs=100"
              env:
                - name: EBEAN_DATASOURCE_USERNAME
                  value: "root"
                - name: EBEAN_DATASOURCE_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: "mysql-secrets"
                      key: "mysql-root-password"
                - name: EBEAN_DATASOURCE_HOST
                  value: "prerequisites-mysql:3306"
                - name: EBEAN_DATASOURCE_URL
                  value: "jdbc:mysql://prerequisites-mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2"
                - name: EBEAN_DATASOURCE_DRIVER
                  value: "com.mysql.cj.jdbc.Driver"
                - name: ENTITY_REGISTRY_CONFIG_PATH
                  value: /datahub/datahub-gms/resources/entity-registry.yml
                - name: DATAHUB_GMS_HOST
                  value: release-name-datahub-gms
                - name: DATAHUB_GMS_PORT
                  value: "8080"
                - name: DATAHUB_MAE_CONSUMER_HOST
                  value: release-name-datahub-mae-consumer
                - name: DATAHUB_MAE_CONSUMER_PORT
                  value: "9091"
                - name: EBEAN_DATASOURCE_USERNAME
                  value: "root"
                - name: EBEAN_DATASOURCE_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: "mysql-secrets"
                      key: "mysql-root-password"
                - name: EBEAN_DATASOURCE_HOST
                  value: "prerequisites-mysql:3306"
                - name: EBEAN_DATASOURCE_URL
                  value: "jdbc:mysql://prerequisites-mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2"
                - name: EBEAN_DATASOURCE_DRIVER
                  value: "com.mysql.cj.jdbc.Driver"
                - name: KAFKA_BOOTSTRAP_SERVER
                  value: "prerequisites-kafka:9092"
                - name: KAFKA_SCHEMAREGISTRY_URL
                  value: "http://prerequisites-cp-schema-registry:8081"
                - name: ELASTICSEARCH_HOST
                  value: "elasticsearch-master"
                - name: ELASTICSEARCH_PORT
                  value: "9200"
                - name: SKIP_ELASTICSEARCH_CHECK
                  value: "false"
                - name: ELASTICSEARCH_INSECURE
                  value: "false"
                - name: ELASTICSEARCH_USE_SSL
                  value: "false"
                - name: GRAPH_SERVICE_IMPL
                  value: elasticsearch
                - name: METADATA_CHANGE_EVENT_NAME
                  value: MetadataChangeEvent_v4
                - name: FAILED_METADATA_CHANGE_EVENT_NAME
                  value: FailedMetadataChangeEvent_v4
                - name: METADATA_AUDIT_EVENT_NAME
                  value: MetadataAuditEvent_v4
                - name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
                  value: MetadataChangeProposal_v1
                - name: FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME
                  value: FailedMetadataChangeProposal_v1
                - name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
                  value: MetadataChangeLog_Versioned_v1
                - name: METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME
                  value: MetadataChangeLog_Timeseries_v1
                - name: DATAHUB_UPGRADE_HISTORY_TOPIC_NAME
                  value: DataHubUpgradeHistory_v1
              securityContext:
                {}
              volumeMounts:
              resources:
                limits:
                  cpu: 500m
                  memory: 512Mi
                requests:
                  cpu: 300m
                  memory: 256Mi

Expected behavior: Each EBEAN_* environment variable should appear only once.
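A quick way to check rendered output for this kind of duplication is a small script. The following is only a minimal sketch, not part of the chart: it assumes the indentation style shown in the output above (the "- name:" items are indented deeper than their "env:" key), uses plain text scanning rather than a YAML parser, and the function name duplicated_env_names is made up for illustration.

```python
import re
from collections import Counter

# Assumption: "env:" sits alone on its line and its "- name:" items are
# indented deeper than the key, as in the rendered CronJob in this issue.
ENV_KEY = re.compile(r"^( *)env:\s*$")
ENV_ITEM = re.compile(r"^( *)- name:\s*(\S+)\s*$")


def duplicated_env_names(rendered):
    """Map each env var name listed more than once in one env: block to its count."""
    duplicates = {}
    counts = None       # Counter for the env: block currently being scanned
    env_indent = 0      # indentation of the current "env:" key
    item_indent = None  # indentation of that block's "- name:" items

    def close_block():
        # Record every name that occurred more than once in the finished block.
        if counts is not None:
            duplicates.update({n: c for n, c in counts.items() if c > 1})

    for line in rendered.splitlines():
        if not line.strip():
            continue
        start = ENV_KEY.match(line)
        if start:
            close_block()
            counts, env_indent, item_indent = Counter(), len(start.group(1)), None
            continue
        if counts is None:
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent <= env_indent:
            # Back at (or above) the level of "env:", so the block has ended.
            close_block()
            counts = None
            continue
        item = ENV_ITEM.match(line)
        # Only count items at the block's own list level, not nested lists.
        if item and item_indent in (None, indent):
            item_indent = indent
            counts[item.group(2).strip("\"'")] += 1
    close_block()
    return duplicates
```

To use it, save the helm template output to a file (for example rendered.yaml, a hypothetical name) and call duplicated_env_names(open("rendered.yaml").read()). An empty dict means no env block lists the same name twice.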

Additional context: Relevant code: https://github.com/acryldata/datahub-helm/blob/d68bf2c87c263e0d97446432b812df5cc561d21e/charts/datahub/templates/datahub-upgrade/datahub-restore-indices-job-template.yml#L75-L93

MioOgbeni commented 1 year ago

The same bug affects the datahub-system-update-job CronJob.

matthijsvanderloos commented 1 year ago

@MioOgbeni indeed!


upendra-vedullapalli commented 1 year ago

All the duplicates come from the template include {{- include "datahub.upgrade.env" . | nindent nn }}. They show up in every Job and CronJob that includes it:

  1. xxx-nocode-migration-job
  2. xxx-datahub-cleanup-job-template
  3. xxx-datahub-restore-indices-job-template
  4. xxx-datahub-system-update-job

upendra-vedullapalli commented 1 year ago

There are other non-EBEAN_* duplicates to remove as well, as reported in this comment. Is there an existing issue that addresses them?