Closed MarthaScheffler closed 3 weeks ago
Interesting issue. We run our preview environment on GKE with GCS storage (almost the same API as S3), and I couldn't reproduce the issue there.
We'll investigate more and try to find a fix once we can reproduce it.
Flow for folks who want to try to reproduce:

```yaml
id: dbt_duckdb_repro
namespace: company.team

tasks:
  - id: dbt
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/dbt-example
        branch: main

      - id: dbt_build
        type: io.kestra.plugin.dbt.cli.DbtCLI
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
        commands:
          - dbt deps
          - dbt build
        profiles: |
          my_dbt_project:
            outputs:
              dev:
                type: duckdb
                path: ":memory:"
                fixed_retries: 1
                threads: 16
                timeout_seconds: 300
            target: dev

  - id: upload
    type: io.kestra.plugin.core.namespace.UploadFiles
    filesMap: "{{ outputs.dbt_build.outputFiles }}"
    namespace: "{{ flow.namespace }}"
```
Hi, which version exactly are you using? We fixed an S3 storage issue in 0.18.3, so if you weren't on that version, can you try it?
I tried with 0.18.2 and 0.18.3, then downgraded to 0.17 (not sure which patch version): same empty files.
Hello! I just tried on 0.18.4 with S3 storage (0.18.3 should behave the same, as there was no change to storage) and everything works. Did you change the namespace property in the UploadFiles task? I forgot to at first, so I thought I had reproduced the issue, but switching to the proper namespace (or {{ flow.namespace }}) made it work.
Screencast from 2024-08-30 09-49-21.webm
Taking this on my end to reproduce 👍 will update here.
I'm able to reproduce on 0.19.1 with S3 internal storage: everything seems to work, but in the end the NamespaceFiles are empty (0 bytes on S3, hence empty in Kestra too).
```yaml
id: dbt_duckdb_repro
namespace: company.team

tasks:
  - id: dbt
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/dbt-example
        branch: main

      - id: dbt_build
        type: io.kestra.plugin.dbt.cli.DbtCLI
        taskRunner:
          type: io.kestra.plugin.scripts.runner.docker.Docker
        containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
        commands:
          - dbt deps
          - dbt build
        profiles: |
          my_dbt_project:
            outputs:
              dev:
                type: duckdb
                path: ":memory:"
                fixed_retries: 1
                threads: 16
                timeout_seconds: 300
            target: dev

  - id: upload
    type: io.kestra.plugin.core.namespace.UploadFiles
    filesMap:
      manifest.json: '{{ outputs.dbt_build["outputFiles"]["manifest.json"] }}'
      run_result.json: '{{ outputs.dbt_build["outputFiles"]["run_results.json"] }}'
    namespace: "{{ flow.namespace }}"
```
@brian-mulier-p here is an even simpler reproducer (so it's not specific to dbt or WorkingDirectory):
```yaml
id: dbt_duckdb_repro
namespace: company.team

tasks:
  - id: dbt_build
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - echo "Test" > test.txt
    outputFiles:
      - test.txt

  - id: upload
    type: io.kestra.plugin.core.namespace.UploadFiles
    filesMap:
      test.txt: '{{ outputs.dbt_build["outputFiles"]["test.txt"] }}'
    namespace: "{{ flow.namespace }}"
```
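To sanity-check the reproducer, it helps to know what the namespace file should contain. Mirroring the `dbt_build` task's command locally shows the expected size, against which the 0-byte uploads stand out:

```shell
# Mirror what the shell Commands task writes inside its container.
echo "Test" > test.txt

# Expected size of the uploaded namespace file: 5 bytes
# ("Test" plus the trailing newline added by echo), not 0.
wc -c < test.txt
```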
I know it's been a long time, but I've finally found the fix @MarthaScheffler :partying_face: It will be part of the Kestra v0.19.3 bugfix release this Tuesday (or optionally here, if you can't wait and can add a plugin to your instance manually :P).
FYI, I struggled to reproduce it because it was an edge case: uploading data sourced from another, already existing storage file produced an empty file, and since all my tests used statically filled inputs I never hit the issue. Luckily @Ben8t ran into this case :1st_place_medal:
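The "copy from an already existing storage file yields zero bytes" symptom described above is the classic shape of a consumed-stream bug. A minimal Python illustration of that class of bug (not Kestra's actual code, which is Java):

```python
import io

# A storage "file" whose stream has already been read once, e.g. to
# compute metadata, is left positioned at EOF.
source = io.BytesIO(b"Test\n")
source.read()  # first consumer exhausts the stream

# A naive copy that reuses the same handle without rewinding writes
# zero bytes -- an "empty" uploaded file.
destination = io.BytesIO()
destination.write(source.read())
print(len(destination.getvalue()))  # 0: nothing left to copy

# Rewinding (or reopening) the source before copying fixes it.
source.seek(0)
destination.write(source.read())
print(len(destination.getvalue()))  # 5: full content copied
```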
Thank you! Will try this out soon!
Describe the issue
When using the io.kestra.plugin.core.namespace.UploadFiles plugin to upload the output of a task as a namespace file, the uploaded files are empty: empty in the UI as well as on S3 itself.
What else got tested? Creating a file in the UI (with content) and saving it works. Using a different storage backend (e.g. local Docker or MinIO) works. See the Slack thread: https://kestra-io.slack.com/archives/C03FQKXRK3K/p1724405859008609
Reproducing this issue is non-trivial, because the Kestra UI apparently loads the file content from the browser history, so even a hard refresh of the page doesn't display the current file content.
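One way to sidestep the UI caching entirely is to check the object size on the backing S3 bucket. A hedged sketch with the AWS CLI, where the bucket name and object key are placeholders to adapt to your internal-storage configuration:

```shell
# Placeholder bucket and key: substitute your configured internal-storage
# bucket and the actual object key of the uploaded namespace file.
# head-object returns metadata without downloading; ContentLength of 0
# confirms the file is truly empty, independent of any browser caching.
aws s3api head-object \
  --bucket my-kestra-storage \
  --key "company/team/_files/test.txt" \
  --query ContentLength
```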
Setup: Kestra OSS v0.17 & v0.18 on Kubernetes with external Postgres and external S3 (as described here: https://kestra.io/docs/installation/aws-ec2#step-5-use-aws-s3-for-storage)
Flow: https://kestra.io/plugins/core/tasks/namespace/io.kestra.plugin.core.namespace.uploadfiles#examples (Upload files generated by a previous task)
Environment