kestra-io / plugin-dbt

Fix DbtCLI task unable to work with `workingDir` within Docker Task runner. #145

Open noyb34 opened 1 month ago

noyb34 commented 1 month ago

Expected Behavior

The copy (`cp`) command within the DbtCLI task should be able to copy a file inside the working directory (`{{ workingDir }}`) so that a downstream task can consume it. See the example flow below.

Actual Behaviour

When I run the flow, I get the following error: `Unable to find workingDir used in the expression cp {{workingDir}}/my_dbt_project/target/manifest.json {{workingDir}}}/manifest.json at line 1`

2024-10-03 16:20:06.793 io.kestra.core.exceptions.IllegalVariableEvaluationException: Unable to find workingDir used in the expression cp {{workingDir}}/my_dbt_project/target/manifest.json {{workingDir}}}/manifest.json at line 1
    at io.kestra.core.runners.VariableRenderer.properPebbleException(VariableRenderer.java:59)
    at io.kestra.core.runners.VariableRenderer.renderOnce(VariableRenderer.java:120)
    at io.kestra.core.runners.VariableRenderer.render(VariableRenderer.java:91)
    at io.kestra.core.runners.VariableRenderer.render(VariableRenderer.java:76)
    at io.kestra.core.runners.VariableRenderer.render(VariableRenderer.java:241)
    at io.kestra.core.runners.VariableRenderer.render(VariableRenderer.java:235)
    at io.kestra.core.runners.DefaultRunContext.render(DefaultRunContext.java:190)
    at io.kestra.plugin.dbt.cli.DbtCLI.run(DbtCLI.java:270)
    at io.kestra.plugin.dbt.cli.DbtCLI.run(DbtCLI.java:39)
    at io.kestra.core.runners.WorkerTaskThread.doRun(WorkerTaskThread.java:76)
    at io.kestra.core.runners.AbstractWorkerThread.run(AbstractWorkerThread.java:57)
Caused by: io.pebbletemplates.pebble.error.RootAttributeNotFoundException: Root attribute [workingDir] does not exist or can not be accessed and strict variables is set to true. (cp {{workingDir}}/my_dbt_project/target/manifest.json {{workingDir}}}/manifest.json:1)
    at io.pebbletemplates.pebble.node.expression.ContextVariableExpression.evaluate(ContextVariableExpression.java:44)
    at io.pebbletemplates.pebble.node.PrintNode.render(PrintNode.java:37)
    at io.pebbletemplates.pebble.node.BodyNode.render(BodyNode.java:44)
    at io.pebbletemplates.pebble.node.RootNode.render(RootNode.java:31)
    at io.pebbletemplates.pebble.template.PebbleTemplateImpl.evaluate(PebbleTemplateImpl.java:157)
    at io.pebbletemplates.pebble.template.PebbleTemplateImpl.evaluate(PebbleTemplateImpl.java:96)
    at io.kestra.core.runners.VariableRenderer.renderOnce(VariableRenderer.java:114)
    ... 9 more

Steps To Reproduce

1. Run the flow.
2. See the error.

Environment Information

Example flow


```yaml
  - id: working-directory
    type: io.kestra.plugin.core.flow.WorkingDirectory
    outputFiles:
      - manifest.json
    tasks:
      - id: cloneRepository
        type: io.kestra.plugin.git.Clone
        username: my_username
        password: "{{ secret('GITHUB_TOKEN') }}"
        url: https://github.com/org/my-dbt-models
        branch: main

      - id: download_manifest_from_s3
        type: "io.kestra.plugin.aws.cli.AwsCLI"
        accessKeyId: "XXXXXXX"
        secretKeyId: "XXXXXXX"
        region: "us-east-1"
        commands:
          - aws s3 cp s3://my_bucket/manifest/manifest.json .
        outputFiles:
          - "manifest.json"

      - id: dbt_project_build
        type: io.kestra.plugin.dbt.cli.DbtCLI
        parseRunResults: false
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
          memory:
            memory: 1GB
          pullPolicy: IF_NOT_PRESENT
        containerImage: python:3.9-slim
        projectDir: /my_dbt_project
        beforeCommands:
          - pip install uv --disable-pip-version-check
          - uv venv --quiet
          - . .venv/bin/activate --quiet
          - uv pip install --quiet dbt-core==1.8.2 dbt-postgres==1.8.2 --disable-pip-version-check
          - mkdir ./my_dbt_project/target
          - cp {{ outputs.download_manifest_from_s3["outputFiles"]["manifest.json"] }} ./my_dbt_project/target # this works
        profiles: |
          my_dbt_project:
            outputs:
              dev:
                type: postgres
                user: my_username
                dbname: datawarehouse_db
                host: my_datawarehouse
                pass: 'changeme'
                port: 50432
                schema: my_preprod_schema
                threads: 4
            target: dev
        commands:
          - dbt deps --project-dir my_dbt_project
          - dbt build --project-dir my_dbt_project
          - cp {{ workingDir }}/my_dbt_project/target/manifest.json {{ workingDir }}/manifest.json  # This line does not work
        outputFiles:
          - manifest.json

      - id: upload-manifest-to-s3
        type: io.kestra.plugin.aws.s3.Upload
        accessKeyId: "XXXXXXX"
        secretKeyId: "XXXXXXX"
        region: "us-east-1"
        from: "{{outputs.dbt_project_build['outputFiles']['manifest.json']}}"
        bucket: "my_dbt_project"
        key: target/manifest.json
```

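
A possible interim workaround, sketched here as an untested assumption rather than a confirmed fix: the relative paths in `beforeCommands` already work, which suggests the commands run with the task's working directory as the current directory. If that holds for `commands` as well, the copy could use relative paths and skip the `workingDir` expression entirely. Only the `commands` and `outputFiles` properties of the `dbt_project_build` task are shown; everything else stays as in the flow above.

```yaml
commands:
  - dbt deps --project-dir my_dbt_project
  - dbt build --project-dir my_dbt_project
  # Assumption (not verified): the shell's current directory is the task's
  # working directory, so these relative paths resolve inside it.
  - cp ./my_dbt_project/target/manifest.json ./manifest.json
outputFiles:
  - manifest.json
```

Because the command no longer contains a Pebble expression, the renderer has nothing to look up, so the `Unable to find workingDir` error should not be raised for this line.
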
anna-geller commented 1 month ago

Hey, we plan to tackle the manifest issue so that hopefully you don't have to use the extra copy command. Feel free to give feedback on the suggested design: https://github.com/kestra-io/plugin-dbt/issues/45

Ben8t commented 2 weeks ago

@noyb34 #45 has been merged and can be tested on the kestra:develop image. Would you be open to testing it and letting us know if it improves the workflow here? :)

noyb34 commented 1 week ago

@Ben8t Awesome. I'll do some tests and report back shortly. Thanks!