kestra-io / kestra

:zap: Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
https://kestra.io
Apache License 2.0

secret() function seems inconsistent with sourcing secrets #5661

Open pypeaday opened 1 week ago

pypeaday commented 1 week ago

Describe the issue

I can successfully run one flow, but another very similar one fails. It looks to me like an issue with passing a secret down, because if I hardcode the values in my flows/scripts then they work fine.

Successful flow

I can create a bucket just fine on my minio instance

I have a .env_encoded file like the docs say; it looks like this:

SECRET_AWS_ACCESS_KEY_ID=base64encoded_key
SECRET_AWS_SECRET_ACCESS_KEY=base64encoded_access_key
SECRET_AWS_ENDPOINT_URL_S3=base64encoded_minio_url
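
For reference, the base64 values can be produced with a small script like this sketch, following Kestra's SECRET_ prefix convention (the values here are placeholders):

import base64

# Kestra's docker-compose secrets convention: each SECRET_<NAME>=<base64(value)>
# env variable becomes available in flows as {{ secret('<NAME>') }}.
# The values below are placeholders.
secrets = {
    "AWS_ACCESS_KEY_ID": "minio-access-key",
    "AWS_SECRET_ACCESS_KEY": "minio-secret-key",
    "AWS_ENDPOINT_URL_S3": "http://minio:9000",
}
with open(".env_encoded", "w") as f:
    for name, value in secrets.items():
        f.write(f"SECRET_{name}={base64.b64encode(value.encode()).decode()}\n")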

I run docker compose up -d and create a flow:

---
id: s3_compatible_bucket
namespace: homelab.dev
tasks:
- id: create_bucket
  type: io.kestra.plugin.minio.CreateBucket
  accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
  secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
  endpoint: "{{ secret('AWS_ENDPOINT_URL_S3') }}"
  bucket: "kestra-test-bucket"

this runs great

Failure

In the same encoded env file I have

SECRET_OPENAI_API_KEY=base64encoded_apikey

Then the flow that fails:

id: openai
namespace: homelab.dev
tasks:
  - id: prompt
    type: io.kestra.plugin.openai.ChatCompletion
    # this works if I hardcode... idk what's wrong with the secrets... they work in the create bucket flow
    apiKey: "{{ secret('OPENAI_API_KEY') }}"

    model: gpt-4
    prompt: Explain in one sentence why data engineers build data pipelines
  - id: use_output
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.prompt.choices | jq('.[].message.content') | first }}"

When I run this, the error says I didn't supply an apiKey, but if I hardcode it in the flow then it works.

Environment

anna-geller commented 1 week ago

It must be an issue in the OpenAI plugin, we'll investigate, thx for the report! 👍 If you see it with any other plugin, keep us posted. We're working on making all properties dynamic (without losing strong typing) in the next release, so this should get better soon.

pypeaday commented 1 week ago

Well, I actually do have a couple more very similar issues that I'll outline here for documentation's sake.

Nextcloud

Testing a simple Nextcloud integration to download (and eventually also upload) a file, I was not able to pass my credentials to my Python script:

id: nextcloudTest
namespace: homelab.dev
tasks:
- id: debugging
  type: io.kestra.plugin.core.log.Log
  message: My url "{{ secret('NC_URL') }}"
- id: python
  type: io.kestra.plugin.scripts.shell.Commands
  containerImage: python:3.11-slim
  namespaceFiles:
    enabled: true
    include:
      - nextcloud_test.py
  beforeCommands:
  - python -m pip install webdavclient3
  commands:
  - env
  - python nextcloud_test.py
  env:
    NC_URL: "{{ secret('NC_URL') }}"
    NC_PASSWORD: "{{ secret('NC_PASSWORD') }}"
    NC_USER: "{{ secret('NC_USER') }}"

And in my .env_encoded

SECRET_NC_PASSWORD=base64encoded_password
SECRET_NC_URL=base64encoded_url
SECRET_NC_USER=base64encoded_user

I can see that these are set but masked in the logs, and I also tried to print them from the python script, but they were ***masked***, so I'm not sure how to confirm whether it's working or not. But the same thing holds true: if I just hardcode those credentials in my python script then it works fine. The script is trivial and I doubt it's related to the problem, but it looks like this:

import os
from pathlib import Path

from webdav3.client import Client

# requirements
# ---
# webdavclient3

PASSWORD = os.environ.get("NC_PASSWORD")
URL = os.environ.get("NC_URL")
USER = os.environ.get("NC_USER")

options = {
    "webdav_hostname": URL,
    "webdav_login": USER,
    "webdav_password": PASSWORD,
}
print(options)
client = Client(options)
client.verify = True
file: Path = Path("Readme.md")
client.download_file(remote_path=f"/{file}", local_path=str(file))
# client.upload_file(remote_path=f"/updloaded-{file.name}", local_path=str(file))

In this case the python error is webdav3.exceptions.ConnectionException: No connection adapters were found for '**masked***************************************************/Readme.md'. In the past, that "No connection adapters" error has meant the protocol in the URL was missing, but again, if I hardcode the value then it works fine, so it makes me think there's an issue with the secret() function in kestra's templating.
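
For what it's worth, that error comes from the requests library whenever a URL has no http:// or https:// scheme, which fits the missing-protocol theory. A minimal sketch (the URL is a placeholder):

import requests

# requests raises InvalidSchema when a URL doesn't start with a scheme
# it has a connection adapter for (http:// or https://):
try:
    requests.Session().get_adapter("files.example.com/Readme.md")
except requests.exceptions.InvalidSchema as e:
    print(e)  # No connection adapters were found for 'files.example.com/Readme.md'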

pypeaday commented 1 week ago

I also had issues with a terraform workflow, but I kind of stupidly removed the docker volume and didn't back that workflow up, so I don't have it or the log immediately handy... the issue was with setting up an s3 provider for remote state storage. I wanted to use my minio instance, which is compatible with the aws cli/boto sdk when you pass the AWS secrets, just like in the first post, into the environment - then aws <commands> can work with local minio, it's great.
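
For reference, here is a minimal boto3 sketch of that MinIO-as-S3 setup (the endpoint and credentials are placeholders):

import boto3

# MinIO speaks the S3 API, so the AWS SDK works once it's pointed
# at the MinIO endpoint (values here are placeholders):
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio:9000",
    aws_access_key_id="minio-access-key",
    aws_secret_access_key="minio-secret-key",
)
print([b["Name"] for b in s3.list_buckets()["Buckets"]])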

But I could never get the terraform flow through terraform init because it would fail to set up the s3 provider. However, when I run the same terraform code from my desktop it works fine - so again it feels to me like an issue with the secret() handling of the variables.

If a flow and logs would be valuable, I can try to recreate this in the next few days.

smunteankestra commented 2 days ago

Hi @pypeaday, I replicated a similar flow on my end, and the secrets setup works as expected. Based on @anna-geller's suggestion, the issue likely stems from specific plugin configurations or compatibility within the environment.

anna-geller commented 2 days ago

@pypeaday given that we can't reproduce, can you try the same on the develop Kestra docker image? For local development, you can also try putting the secrets in the KV Store and see if it works then: https://kestra.io/docs/concepts/kv-store

btw why use io.kestra.plugin.scripts.shell.Commands instead of io.kestra.plugin.scripts.python.Commands intended for Python? 😄

pypeaday commented 2 days ago

Ok, so this morning I pulled the :develop tag and tested with kv() instead of secret(): the nextcloud one worked fine, but it's still broken with secret().

As for the shell vs. python Commands plugin - that's just ignorance on my part.... I was getting conflicting messages between VS Code and the web-based editor, but I also can't replicate that immediately this morning... VS Code did recently update, and the Kestra extension's most recent release was the day after I installed it, so maybe that update also fixed the linting discrepancy I was facing.

pypeaday commented 2 days ago

Oh, the openai flow worked with secret() and the :develop tag this morning though... I cannot find any real difference in the secret-encoded env variables between the AWS or openAI vars and my nextcloud vars...

anna-geller commented 2 days ago

gotcha. yeah, this makes sense as Stefan also couldn't reproduce on develop atm. It might have been a transient issue, keep us posted

pypeaday commented 2 days ago

Is there any way I can print out secret values in plain text so I can validate they're encoded and set correctly?

anna-geller commented 2 days ago

try in the Debug Outputs console:

(screenshot: Debug Outputs console)

we mask logged secrets for security reasons, as someone could unintentionally log secrets, e.g. when passing them from env variables

anna-geller commented 2 days ago

btw, did you set this in your configuration? If not, this might be why the secret() function misbehaves:

kestra:
  encryption:
    secret-key: BASE64_ENCODED_STRING_OF_32_CHARACTERS

more https://kestra.io/docs/configuration#encryption
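
For example, one way to generate such a key is a quick Python sketch like this (any source of 32 random bytes works):

import base64
import secrets

# Generate 32 random bytes and base64-encode them for use as
# kestra.encryption.secret-key:
print(base64.b64encode(secrets.token_bytes(32)).decode())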

pypeaday commented 2 days ago

No, I have a .env_encoded that gets brought into the image with compose

(screenshot: Docker Compose configuration referencing .env_encoded)

But I made this from the .env that I use to test my scripts and stuff outside the container, and used the function from Kestra's docs here: https://kestra.io/docs/how-to-guides/secrets
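
To double-check the encoding locally, a round-trip sketch like this can compare the two files (the file names are assumptions):

import base64

# Compare each SECRET_* entry in .env_encoded against the plaintext .env
# it was generated from (file paths are assumptions):
plain = dict(
    line.strip().split("=", 1)
    for line in open(".env")
    if "=" in line and not line.startswith("#")
)
for line in open(".env_encoded"):
    line = line.strip()
    if not line.startswith("SECRET_"):
        continue
    key, encoded = line.split("=", 1)
    name = key[len("SECRET_"):]
    decoded = base64.b64decode(encoded).decode()
    print(f"{name}: {'OK' if plain.get(name) == decoded else 'MISMATCH'}")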

anna-geller commented 2 days ago

Gotcha, keep us posted:

  1. If the issue is now resolved on develop, you can close the issue
  2. If you still see the issue, we will need logs