kestra-io / kestra

Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
https://kestra.io
Apache License 2.0

[Tenant-related issue] Executions don't show logs anymore (logs, gantt view) on multi-tenant EE instance #3623

Closed Ben8t closed 4 months ago

Ben8t commented 4 months ago

Expected Behavior

No response

Actual Behaviour

When looking back at an execution, logs don't show up 🤔

They are displayed only during the execution though


Steps To Reproduce

No response

Environment Information

Example flow

No response

Ben8t commented 4 months ago

The flow can be found on the demo: shiny_rocks.dbt_run

tasks:
  - id: workingdir
    type: io.kestra.core.tasks.flows.WorkingDirectory
    tasks:
      - id: cloneRepository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/shiny_rocks.git
        branch: main

      - id: dbt
        type: io.kestra.plugin.dbt.cli.DbtCLI
        parseRunResults: true
        docker:
          image: ghcr.io/kestra-io/dbt-bigquery:latest
        inputFiles:
          profiles.yml: |
            shiny_rocks_dbt:
              outputs:
                dev:
                  type: bigquery
                  dataset: shiny_rocks
                  fixed_retries: 1
                  keyfile: service_account.json
                  location: EU
                  method: service-account
                  priority: interactive
                  project: kestra-dev
                  threads: 8
                  timeout_seconds: 300
              target: dev
          service_account.json: "{{ secret('GCP_CREDS') }}"
        commands:
          - dbt run --profiles-dir=. --project-dir=shiny_rocks_dbt

loicmathieu commented 4 months ago

Logs are present and displayed on the Logs page but not in the Execution Logs tab. It cannot be reproduced locally.

anna-geller commented 4 months ago

I also couldn't reproduce for logs


@Ben8t please reopen if you see the issue repeating

Ben8t commented 4 months ago

@anna-geller it still occurs on the cloud version (see screen recording attached)

For the logs tab, it seems to be an issue only on flows with dbt 🤔 For the gantt view, I can't get any logs from any past execution

Maybe a role-access issue?

https://github.com/kestra-io/plugin-dbt/assets/46634684/c1e98ab7-ddf4-48b8-bb99-e443cff0dbf2

anna-geller commented 4 months ago

yup, this seems urgent

anna-geller commented 4 months ago

the same happens on 0.17.0 (develop): logs show up during execution, but as soon as the Execution is done, they no longer show up https://share.descript.com/view/uV95yxLNBGf

tested with this flow:

id: dwh_and_analytics
namespace: shiny_rocks

tasks:
  - id: dbt
    type: io.kestra.core.tasks.flows.WorkingDirectory
    tasks:
    - id: clone_repository
      type: io.kestra.plugin.git.Clone
      url: https://github.com/kestra-io/dbt-demo
      branch: main

    - id: dbt_build
      type: io.kestra.plugin.dbt.cli.DbtCLI
      runner: DOCKER
      docker:
        image: ghcr.io/kestra-io/dbt-duckdb:latest
      commands:
        - dbt deps
        - dbt build
      profiles: |
        jaffle_shop:
          outputs:
            dev:
              type: duckdb
              path: dbt.duckdb
              extensions: 
                - parquet
              fixed_retries: 1
              threads: 16
              timeout_seconds: 300
          target: dev      

    - id: python
      type: io.kestra.plugin.scripts.python.Script
      outputFiles:
        - "*.csv"
      docker:
        image: ghcr.io/kestra-io/duckdb:latest
      script: |
        import duckdb
        import pandas as pd

        conn = duckdb.connect(database='dbt.duckdb', read_only=False)

        tables_query = "SELECT table_name FROM information_schema.tables WHERE table_schema = 'main';"
        tables = conn.execute(tables_query).fetchall()

        # Export each table to CSV, excluding tables that start with 'raw' or 'stg'
        for table_name in tables:
            table_name = table_name[0]
            # Skip tables with names starting with 'raw' or 'stg'
            if not table_name.startswith('raw') and not table_name.startswith('stg'):
                query = f"SELECT * FROM {table_name}"
                df = conn.execute(query).fetchdf()
                df.to_csv(f"{table_name}.csv", index=False)

        conn.close()

tchiotludo commented 4 months ago

After investigation: every task inside a WorkingDirectory loses the tenantId, which prevents its logs from being displayed
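
To illustrate why a lost tenantId could hide logs, here is a minimal sketch in Python. It is not Kestra's actual code: the `LogEntry` record, the `logs_for_execution` helper, and the sample values are all hypothetical, and it assumes the execution view looks up stored logs with a filter on both tenant and execution.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    # Hypothetical, simplified log record; not Kestra's real model.
    tenant_id: Optional[str]
    execution_id: str
    task_id: str
    message: str


def logs_for_execution(
    entries: List[LogEntry], tenant_id: Optional[str], execution_id: str
) -> List[LogEntry]:
    """Return the entries matched by a tenant-scoped lookup (assumed behavior)."""
    return [
        e
        for e in entries
        if e.tenant_id == tenant_id and e.execution_id == execution_id
    ]


entries = [
    # Task outside a WorkingDirectory: the tenant id is preserved.
    LogEntry("demo", "exec-1", "hello", "started"),
    # Task inside a WorkingDirectory: the tenant id was lost (None).
    LogEntry(None, "exec-1", "dbt_build", "dbt build finished"),
]

# Only the entry that kept its tenant id matches the tenant-scoped lookup.
visible = logs_for_execution(entries, "demo", "exec-1")
print([e.task_id for e in visible])  # ['hello']
```

Under these assumptions, the `dbt_build` entry still exists in storage but no longer matches the tenant's filter, which would be consistent with the logs being stored yet missing from the execution's view.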