kestra-io / plugin-jdbc

https://kestra.io/plugins/
Apache License 2.0
10 stars 7 forks source link

Snowflake query task cannot use files from internal storage: `File not found.` #138

Open anna-geller opened 1 year ago

anna-geller commented 1 year ago

Issue description

Reproducible example:

id: snowflakeTutorial
namespace: dev

tasks:
  - id: createDB
    type: io.kestra.plugin.jdbc.snowflake.Query
    sql: CREATE OR REPLACE DATABASE KESTRA;

  - id: createTable
    type: io.kestra.plugin.jdbc.snowflake.Query
    sql: |
      CREATE OR REPLACE TABLE KESTRA.PUBLIC.EMPLOYEES (
        first_name STRING ,
        last_name STRING ,
        email STRING ,
        streetaddress STRING ,
        city STRING ,
        start_date DATE
        );

  - id: extract
    type: io.kestra.plugin.fs.http.Download
    uri: https://raw.githubusercontent.com/kestra-io/examples/main/datasets/employees01.csv

  - id: load
    type: io.kestra.plugin.jdbc.snowflake.Query
    sql: put file://{{outputs.extract.uri}} @kestra.public.%employees

taskDefaults:
  - type: io.kestra.plugin.jdbc.snowflake.Query
    values:
      url: jdbc:snowflake://accountid.snowflakecomputing.com?warehouse=DEMO
      username: ANNA
      password: 'password'
loicmathieu commented 1 year ago

@anna-geller the Query task uses plain SQL so what you try to achieve cannot be don. Keep in mind that task must handle specifically outputs from internal storage, {{outputs.extract.uri}} will resolve to something like kestra:///namespace/flow/exec/.... For this to work you must use a local file so the WorkingDirectory and the LocalFiles task may allow to do such thing.

Or we may provide a Put task.

loicmathieu commented 1 year ago

By the way, there is an Upload task, does it provide this functionality?

anna-geller commented 1 year ago

it does, I added blueprint for it https://demo.kestra.io/ui/blueprints/community/81

but Ludo agreed that we should support plain PUT command as well because it's more flexible

loicmathieu commented 1 year ago

We should create a dedicated task for that, the Query task doesn't seem to fit the PUT command

anna-geller commented 1 year ago

sure why not

The default Snowflake tutorial runs PUT queries from the SnowSQL CLI so query would work but if you have a specific design in mind for the PUT task, can you share how would it look like? https://docs.snowflake.com/en/user-guide/getting-started-tutorial-stage-data-files

the common use is to run PUT and then COPY INTO which are just SQL statements albeit specific ones I agree