kestra-io / plugin-transform

Transformation plugins for Kestra
https://kestra.io/
Apache License 2.0

Add AlaSQL Transform task #9

Open anna-geller opened 8 months ago

anna-geller commented 8 months ago

Feature description

This could be part of the core transform tasks. AlaSQL seems useful and lightweight.

For example, you can download this CSV file and run:

npm install alasql -g 
alasql 'select * into xlsx("new_fruits.xlsx") from CSV("fruit.csv") where id > 98'

This reads a CSV file, applies a SQL transformation, and exports the result to an Excel file.
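For readers unfamiliar with AlaSQL, here is a rough sketch in plain Node.js of what the filtering part of that query (`where id > 98`) does. It is not AlaSQL and does not write an Excel file. The helper names `parseCsv` and `filterById` and the inline sample rows are hypothetical; the only thing taken from the query above is that the file has an `id` column.

```javascript
// Plain Node.js sketch of: select * from CSV("fruit.csv") where id > 98
// Not AlaSQL; the sample rows below are made up for illustration.

// Parse simple comma-separated text (no quoted fields) into row objects.
function parseCsv(text) {
  const [headerLine, ...lines] = text.trim().split(/\r?\n/);
  const headers = headerLine.split(",");
  return lines.map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
  });
}

// Keep rows whose numeric id is strictly greater than the threshold.
function filterById(rows, minExclusive) {
  return rows.filter((row) => Number(row.id) > minExclusive);
}

// Hypothetical sample; the real fruit.csv columns may differ.
const sample = "id,fruit\n97,apple\n98,banana\n99,cherry\n100,mango\n";
const result = filterById(parseCsv(sample), 98);
console.log(JSON.stringify(result));
```

A dedicated AlaSQL task would let the same filter be written as a single SQL statement instead of hand-written parsing code.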

Some links:

Using it in a Node task didn't work for me:

id: alasql
namespace: myteam

tasks:
  - id: extract
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/resolve/main/csv/fruit.csv

  - id: alasql
    type: io.kestra.plugin.scripts.node.Commands
    inputFiles:
      data.csv: "{{ outputs.extract.uri }}"
    beforeCommands:
      - npm install -g alasql
    commands:
      - alasql 'select * from CSV("data.csv") where id > 98'

anna-geller commented 3 months ago

Traceback when using it in a Node task:

2024-08-19T11:27:40.811Z DEBUG Using task runner 'io.kestra.plugin.scripts.runner.docker.Docker'
2024-08-19T11:27:40.970Z INFO Provided 1 input(s).
2024-08-19T11:29:44.587Z DEBUG Image pulled [node:latest]
2024-08-19T11:29:45.956Z DEBUG Container created: 907ae99c409ead8a1b107d9d0b2789323b4cab44e982d8a68076c64c416fb710
2024-08-19T11:29:46.114Z DEBUG Volume created: 5a69eb01ff9dc1c4e40d9e34e0c7a61a88e2aa7709a1af4e762088733f5dae1d
2024-08-19T11:29:47.723Z DEBUG Starting command with container id 907ae99c409ead8a1b107d9d0b2789323b4cab44e982d8a68076c64c416fb710 [/bin/sh -c set -e
npm install -g alasql
alasql 'select * from CSV("data.csv") where id > 98']
2024-08-19T11:29:53.595Z INFO 
2024-08-19T11:29:53.606Z INFO added 22 packages in 5s
2024-08-19T11:29:53.620Z INFO 
2024-08-19T11:29:53.635Z INFO 2 packages are looking for funding
2024-08-19T11:29:53.647Z INFO   run `npm fund` for details
2024-08-19T11:29:54.053Z WARN 
2024-08-19T11:29:54.079Z WARN No SQL to process
2024-08-19T11:29:54.091Z WARN 
2024-08-19T11:29:54.105Z WARN AlaSQL command-line utility (version 4.5.0)
2024-08-19T11:29:54.111Z WARN 
2024-08-19T11:29:54.117Z WARN Usage: alasql [options] [sql] [params]
2024-08-19T11:29:54.124Z WARN 
2024-08-19T11:29:54.129Z WARN Options:
2024-08-19T11:29:54.142Z WARN   -v, --version  Echo AlaSQL version                                   [boolean]
2024-08-19T11:29:54.152Z WARN   -m, --minify   Minify json output                                    [boolean]
2024-08-19T11:29:54.165Z WARN   -f, --file     Load SQL from file                                     [string]
2024-08-19T11:29:54.173Z WARN       --ast      Print AST instead of result                            [string]
2024-08-19T11:29:54.180Z WARN   -h, --help     Show help                                             [boolean]
2024-08-19T11:29:54.187Z WARN 
2024-08-19T11:29:54.192Z WARN Examples:
2024-08-19T11:29:54.198Z WARN   alasql "sql-statement"                    Run SQL statement and output result
2024-08-19T11:29:54.205Z WARN                                             as JSON
2024-08-19T11:29:54.212Z WARN 
2024-08-19T11:29:54.218Z WARN   alasql 'value of select 2+?' 40           Outputs 42
2024-08-19T11:29:54.227Z WARN 
2024-08-19T11:29:54.235Z WARN   alasql 'select count(*) from txt()' <     Count lines in city.txt
2024-08-19T11:29:54.244Z WARN   city.txt
2024-08-19T11:29:54.251Z WARN 
2024-08-19T11:29:54.257Z WARN   alasql 'select * into xlsx("city.xlsx")   Convert from txt to xlsx
2024-08-19T11:29:54.263Z WARN   from txt("city.txt")'
2024-08-19T11:29:54.267Z WARN 
2024-08-19T11:29:54.293Z WARN   alasql --file file.sql France 1960        Run SQL from file with 2 parameters
2024-08-19T11:29:54.306Z WARN 
2024-08-19T11:29:54.312Z WARN 
2024-08-19T11:29:54.318Z WARN More information about the library: www.alasql.org
2024-08-19T11:29:54.712Z DEBUG Container deleted: 907ae99c409ead8a1b107d9d0b2789323b4cab44e982d8a68076c64c416fb710
2024-08-19T11:29:54.725Z DEBUG Volume deleted: 5a69eb01ff9dc1c4e40d9e34e0c7a61a88e2aa7709a1af4e762088733f5dae1d
2024-08-19T11:29:54.732Z ERROR Command failed with code 1
2024-08-19T11:29:54.732Z TRACE io.kestra.core.models.tasks.runners.TaskException: Command failed with code 1
    at io.kestra.plugin.scripts.runner.docker.Docker.run(Docker.java:475)
    at io.kestra.plugin.scripts.exec.scripts.runners.CommandsWrapper.run(CommandsWrapper.java:157)
    at io.kestra.plugin.scripts.node.Commands.run(Commands.java:82)
    at io.kestra.plugin.scripts.node.Commands.run(Commands.java:18)
    at io.kestra.core.runners.WorkerTaskThread.doRun(WorkerTaskThread.java:76)
    at io.kestra.core.runners.AbstractWorkerThread.run(AbstractWorkerThread.java:57)