dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Config schema validation failure: unexpected field postgres_db.params #4649

Closed. pauleikis closed 3 years ago

pauleikis commented 3 years ago

Summary

When running the example from https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm, I successfully launched Dagit, but when I try to execute a run, it just hangs in the STARTING state.

The problem is threefold:

  1. The worker crashes.
  2. There is no indication of the crash in the UI, and no output or logs from the worker are shown.
  3. This issue was not caught before release, even though it appears to be a simple config schema validation failure on a bare Dagster installation.

Reproduction

  1. Run the example from https://docs.dagster.io/deployment/guides/kubernetes/deploying-with-helm in a clean environment.
  2. The only diff in values.yaml relative to the docs:
    245c245,247
    <       envSecrets: []
    ---
    >       envSecrets:
    >         - name: dagster-aws-access-key-id
    >         - name: dagster-aws-secret-access-key
    376c378,380
    <       envSecrets: []
    ---
    >       envSecrets:
    >         - name: dagster-aws-access-key-id
    >         - name: dagster-aws-secret-access-key
  3. In Dagit, go to example_pipe -> Playground -> Preset: default, Mode: test -> Launch Execution.
  4. Observe that the run is stuck in the STARTING state in the Runs section.
  5. kubectl logs <worker-pod-id> shows the following error:
    Traceback (most recent call last):
      File "/usr/local/bin/dagster", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.7/site-packages/dagster/cli/__init__.py", line 45, in main
        cli(auto_envvar_prefix=ENV_PREFIX)  # pylint:disable=E1123
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/usr/local/lib/python3.7/site-packages/dagster/cli/api.py", line 72, in execute_run_command
        else DagsterInstance.get()
      File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 353, in get
        return DagsterInstance.from_config(_dagster_home())
      File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 383, in from_config
        return DagsterInstance.from_ref(instance_ref)
      File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/__init__.py", line 400, in from_ref
        run_storage=instance_ref.run_storage,
      File "/usr/local/lib/python3.7/site-packages/dagster/core/instance/ref.py", line 235, in run_storage
        return self.run_storage_data.rehydrate()
      File "/usr/local/lib/python3.7/site-packages/dagster/serdes/config_class.py", line 83, in rehydrate
        config_dict,
    dagster.core.errors.DagsterInvalidConfigError: Errors whilst loading configuration for <dagster.config.field_utils.Selector object at 0x7f7fedbada10>.
        Error 1: Received unexpected config entry "params" at path root:postgres_db. Expected: "{ db_name: (String | { env: String }) hostname: (String | { env: String }) password: (String | { env: String }) port?: (Int | { env: String }) username: (String | { env: String }) }".
  6. In Dagit, Status -> Configuration does in fact show the params field:
    run_storage:
      module: dagster_postgres.run_storage
      class: PostgresRunStorage
      config:
        postgres_db:
          db_name: test
          hostname: dagster-postgresql
          params: {}
          password:
            env: DAGSTER_PG_PASSWORD
          port: 5432
          username: test
  7. I've traced it to this commit, which introduces the field: https://github.com/dagster-io/dagster/commit/6df86f5c240375ff7a4af4bc92eeafeff657c333 I assume a schema change is needed too, but I could not find where the schema is defined.
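
To illustrate the failure mode in step 5, here is a minimal sketch of a strict validator that rejects any key it does not know about. This is not Dagster's actual implementation: `validate`, `OLD_FIELDS`, and `OPTIONAL` are hypothetical names, the field list is copied from the "Expected" part of the error message, and treating `params` as an optional field in a newer schema is an assumption.

```python
# Hypothetical sketch of strict config validation; NOT Dagster's real code.
# Field names are taken from the "Expected" part of the error message above.
OLD_FIELDS = {"db_name", "hostname", "password", "port", "username"}
OPTIONAL = {"port"}  # shown as "port?" in the error message


def validate(config, allowed, optional=frozenset()):
    """Return a list of error strings for unknown or missing keys."""
    errors = []
    for key in config:
        if key not in allowed:
            errors.append(
                f'Received unexpected config entry "{key}" at path root:postgres_db.'
            )
    for key in sorted(allowed - optional):
        if key not in config:
            errors.append(
                f'Missing required config entry "{key}" at path root:postgres_db.'
            )
    return errors


# The postgres_db block as displayed under Status -> Configuration.
postgres_db = {
    "db_name": "test",
    "hostname": "dagster-postgresql",
    "params": {},
    "password": {"env": "DAGSTER_PG_PASSWORD"},
    "port": 5432,
    "username": "test",
}

# A process that only knows the old field set rejects "params".
old_errors = validate(postgres_db, OLD_FIELDS, OPTIONAL)

# Assumption: a schema that knows about "params" accepts the same config.
new_errors = validate(postgres_db, OLD_FIELDS | {"params"}, OPTIONAL | {"params"})

print(old_errors)
print(new_errors)
```

The same config passes or fails depending only on which field set the validating process knows about, which matches the symptom: the config shown in Dagit contains `params`, while the worker reports it as unexpected.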

I'm willing to make a PR, but would appreciate some pointers.

Dagit UI/UX Issue Screenshots

Screenshot 2021-08-27 at 10 28 43

Additional Info about Your Environment

Mac OS Big Sur 11.5.2
Dagster 0.12.7


Message from the maintainers:

Impacted by this bug? Give it a 👍. We factor engagement into prioritization.

rexledesma commented 3 years ago

Looking at your screenshot, it looks like the user code container is running on 0.11.3 instead of the expected 0.12.7 version that you have installed. As a workaround, you should specify the tag as 0.12.7 for the user code deployment so that it pulls the right image.

pauleikis commented 3 years ago

Looking at your screenshot, it looks like the user code container is running on 0.11.3 instead of the expected 0.12.7 version that you have installed. As a workaround, you should specify the tag as 0.12.7 for the user code deployment so that it pulls the right image.

That was indeed it. I don't know much about Helm, but it seems to pick the wrong user-code version by default:

➜ helm show chart dagster/dagster

apiVersion: v2
appVersion: 0.12.7
dependencies:
- condition: dagster-user-deployments.enableSubchart
  name: dagster-user-deployments
  repository: ""
  version: 0.11.3
- condition: postgresql.enabled
  name: postgresql
  repository: https://charts.bitnami.com/bitnami
  version: 8.1.0
- condition: rabbitmq.enabled
  name: rabbitmq
  repository: https://charts.bitnami.com/bitnami
  version: 6.16.3
- condition: redis.internal
  name: redis
  repository: https://charts.bitnami.com/bitnami
  version: 12.7.4
description: Dagster is a data orchestrator for machine learning, analytics, and ETL.
icon: https://dagster.io/images/logo.png
keywords:
- analytics
- data-orchestrator
- data-pipelines
- etl
- workflow
kubeVersion: '>= 1.15.0-0'
maintainers:
- email: rex@elementl.com
  name: Rex Ledesma
  url: https://dagster.io
name: dagster
sources:
- https://github.com/dagster-io/dagster/tree/master/helm/dagster
type: application
version: 0.12.7
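
A quick way to spot this kind of mismatch is to compare each bundled subchart's version against the parent chart's version. The sketch below hard-codes the relevant fields from the output above instead of parsing YAML; `find_version_skew` is a hypothetical helper, and treating an empty `repository` as "ships with the parent chart and should share its version" is an assumption, not a Helm rule.

```python
# Relevant fields copied from the `helm show chart dagster/dagster` output above.
chart = {
    "appVersion": "0.12.7",
    "version": "0.12.7",
    "dependencies": [
        {"name": "dagster-user-deployments", "repository": "", "version": "0.11.3"},
        {"name": "postgresql", "repository": "https://charts.bitnami.com/bitnami", "version": "8.1.0"},
        {"name": "rabbitmq", "repository": "https://charts.bitnami.com/bitnami", "version": "6.16.3"},
        {"name": "redis", "repository": "https://charts.bitnami.com/bitnami", "version": "12.7.4"},
    ],
}


def find_version_skew(chart):
    """Return (name, version) for bundled subcharts that differ from the parent version.

    Assumption: a dependency with an empty repository is bundled with the
    parent chart and is expected to carry the same version.
    """
    parent_version = chart["version"]
    return [
        (dep["name"], dep["version"])
        for dep in chart["dependencies"]
        if dep.get("repository", "") == "" and dep["version"] != parent_version
    ]


skewed = find_version_skew(chart)
for name, version in skewed:
    print(f"{name} is at {version}, parent chart is at {chart['version']}")
```

Third-party dependencies such as postgresql are skipped because they are versioned independently of Dagster.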