kestra-io / plugin-dbt

Apache License 2.0

Cannot overwrite dbt profiles with KubernetesTaskRunner as DbtCLI looks for them in /root/.dbt instead of the current wdir #114

Closed anna-geller closed 2 months ago

anna-geller commented 4 months ago

The task fails with an Error: Invalid value for '--profiles-dir': Path '/root/.dbt' does not exist.

The solution might be adding --profiles-dir {{workingDir}} by default.
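
Until such a default exists, one possible workaround is to pass the flag explicitly in the command. The sketch below reuses the reproducer's task and only adds --profiles-dir, which is a standard dbt CLI flag. It assumes the plugin writes the profiles content to a profiles.yml file in the task's working directory, which this issue implies but does not confirm. The flow id is hypothetical.

```yaml
id: dbt_kubernetes_profiles_dir  # hypothetical flow id, for illustration only
namespace: dwh
tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    commands:
      # Assumption: profiles.yml is written to the working directory,
      # so dbt is told to look there instead of /root/.dbt.
      - dbt build --project-dir {{workingDir}}/dbt/ --profiles-dir {{workingDir}}
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev
```

If the assumption about where profiles.yml lands does not hold for a given runner, the flag would need to point at wherever the file is actually written.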

Reproducer

To reproduce, first download this repo as a zip file: https://github.com/kestra-io/dbt-example

Then upload it as namespace files to the dwh namespace and rename the directory to dbt:

id: dbt_kubernetes
namespace: dwh
description: Download https://github.com/kestra-io/dbt-example as a zipfile and import to the Namespace Files editor.
tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    commands:
      - dbt build --project-dir {{workingDir}}/dbt/
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev

taskDefaults:
  - type: io.kestra.plugin.dbt.cli
    values:
      taskRunner:
        type: io.kestra.plugin.kubernetes.runner.KubernetesTaskRunner
        config:
          # masterUrl: https://docker-for-desktop:6443 # when running kestra in docker
          masterUrl: https://127.0.0.1:6443
          clientKey: xxx==
          clientCert: xxx==
          caCert: xxx
          username: docker-desktop
          namespace: default

loicmathieu commented 2 months ago

Hi, I'm not sure I understood your example. There is no dbt dir here, so it's normal that it complains about it.

anna-geller commented 2 months ago

@loicmathieu plugin defaults are ignored even with forced set to true. Kestra still runs dbt in Docker rather than in a K8s pod:

id: dbt_kubernetes
namespace: dwh
description: Download https://github.com/kestra-io/dbt-example as a zipfile and import to the Namespace Files editor.
tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    projectDir: dbt
    commands:
      - dbt build --project-dir {{workingDir}}/dbt/
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev

pluginDefaults:
  - type: io.kestra.plugin.dbt.cli
    forced: true
    values:
      taskRunner:
        type: io.kestra.plugin.kubernetes.runner.Kubernetes
        namespace: default
        pullPolicy: ALWAYS
        config:
          username: docker-desktop
          masterUrl: https://docker-for-desktop:6443
          caCertData: |-
            xxx
          clientCertData: |-
            xxx==
          clientKeyData: |-
            xxx=

Even when I put the K8s config directly into the dbt task, it still runs in Docker, not in a pod:

id: dbt_kubernetes
namespace: dwh
description: Download https://github.com/kestra-io/dbt-example as a zipfile and import to the Namespace Files editor. Then rename that folder to dbt.
tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    projectDir: dbt
    commands:
      - dbt build --project-dir {{workingDir}}/dbt/
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev
    taskRunner:
      type: io.kestra.plugin.kubernetes.runner.Kubernetes
      namespace: default
      pullPolicy: ALWAYS
      config:
        username: docker-desktop
        masterUrl: https://docker-for-desktop:6443
        caCertData: |-
          xxx
        clientCertData: |-
          xxx==
        clientKeyData: |-
          xxx=

loicmathieu commented 2 months ago

OK, so this is a different issue than the one I suspected. It has nothing to do with dbt or the Kubernetes task runner; it is a plugin defaults issue. The ongoing work to apply plugin defaults on the flow source should fix it.

By the way, the flow below works, so I'm closing this issue. The problem is already reported twice in the Kestra repository, and we don't need a third report.

id: dbt_kubernetes
namespace: dwh
description: Download https://github.com/kestra-io/dbt-example as a zipfile and import to the Namespace Files editor.
tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    taskRunner:
      type: io.kestra.plugin.kubernetes.runner.Kubernetes
    commands:
      - dbt build --project-dir {{workingDir}}/dbt/
    projectDir: dbt
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev

loicmathieu commented 2 months ago

This is a plugin defaults issue, a duplicate of https://github.com/kestra-io/kestra/issues/2797 and https://github.com/kestra-io/kestra/issues/2260

anna-geller commented 2 months ago

I agree about plugin defaults, if there is an issue there, but the example below really doesn't work for me. I'm not using plugin defaults here, and dbt still runs in Docker rather than K8s. If you cannot reproduce it, let's jump into a Slack huddle to troubleshoot?

id: dbt_kubernetes
namespace: dwh
description: Download https://github.com/kestra-io/dbt-example as a zipfile and import to the Namespace Files editor. Then rename that folder to dbt.
tasks:
  - id: dbt_build
    type: io.kestra.plugin.dbt.cli.DbtCLI
    namespaceFiles:
      enabled: true
    containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
    projectDir: dbt
    commands:
      - dbt build --project-dir {{workingDir}}/dbt/
    profiles: |
      my_dbt_project:
        outputs:
          dev:
            type: duckdb
            path: ":memory:"
        target: dev
    taskRunner:
      type: io.kestra.plugin.kubernetes.runner.Kubernetes
      namespace: default
      pullPolicy: ALWAYS
      config:
        username: docker-desktop
        masterUrl: https://docker-for-desktop:6443
        caCertData: |-
          xxx
        clientCertData: |-
          xxx==
        clientKeyData: |-
          xxx=