kestra-io / plugin-jdbc

https://kestra.io/plugins/
Apache License 2.0
13 stars 12 forks source link

The {{ workingDir }} is not rendered correctly in DuckDB Query task when the query is passed from a SQL file stored as a namespace file #300

Open anna-geller opened 6 months ago

anna-geller commented 6 months ago

Expected Behavior

Ideally, the {{ workingDir }} should not be needed. Given that the file is passed as part of inputFiles, you shouldn't need that workingDir part here:

INSTALL json;
LOAD json;
SELECT brand, round(avg(price), 2) as avg_price
FROM read_json_auto('{{ workingDir }}/products.json')
GROUP BY brand
ORDER BY avg_price DESC;

image

reproducer:

id: myflow
namespace: myteam

inputs:
  - id: columns_to_keep
    type: STRING
    defaults: |-
      ["brand", "price"]

tasks:
  - id: extract
    type: io.kestra.plugin.fs.http.Download
    uri: https://dummyjson.com/products

  - id: transform
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: python:3.11-alpine
    inputFiles:
      data.json: "{{ outputs.extract.uri }}"
    outputFiles:
      - "*.json"
    env:
      COLUMNS_TO_KEEP: "{{ inputs.columns_to_keep }}"
    namespaceFiles:
      enabled: true
    commands:
      - python scripts/main.py

  - id: query
    type: io.kestra.plugin.jdbc.duckdb.Query
    inputFiles:
      products.json: "{{ outputs.transform.outputFiles['products.json'] }}"
    sql: |
      INSTALL json;
      LOAD json;
      SELECT brand, round(avg(price), 2) as avg_price
      FROM read_json_auto('{{ workingDir }}/products.json')
      GROUP BY brand
      ORDER BY avg_price DESC;
    store: true

  - id: query_fail
    type: io.kestra.plugin.jdbc.duckdb.Query
    inputFiles:
      products.json: "{{ outputs.transform.outputFiles['products.json'] }}"
    sql: "{{ read('query.sql')}}"
    store: true

Actual Behaviour

No response

Steps To Reproduce

No response

Environment Information

Example flow

No response