windmill-labs / windmill

Open-source developer platform to power your entire infra and turn scripts into webhooks, workflows and UIs. Fastest workflow engine (13x vs Airflow). Open-source alternative to Retool and Temporal.
https://windmill.dev

bug: module 'wmill' has no attribute 'set_state' #3484

Open Girgitt opened 8 months ago

Girgitt commented 8 months ago

Describe the bug

Hi, a fresh Windmill user here. When I run the example Python script, roughly 3 out of 5 runs fail with an AttributeError on the calls to wmill.get_state() and wmill.set_state().

In my environment I use the docker compose file from the Windmill project (as described in the Docker section here: https://www.windmill.dev/docs/advanced/self_host), started with podman-compose with minor changes (pinned container versions and podman-related fixes):
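To narrow down an error like this, it can help to check which copy of the module a given worker actually imports. The following is a minimal diagnostic sketch, not part of the original report: `inspect_module` is a hypothetical helper name, and the `main` entry point assumes the usual Windmill convention of a Python script exposing `main`.

```python
import importlib


def inspect_module(name, required_attrs):
    """Report where a module was loaded from and which attributes it lacks."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return {"found": False, "error": str(exc)}
    return {
        "found": True,
        # __file__ is None when the import resolved to a directory
        # without an __init__.py (a namespace package).
        "file": getattr(module, "__file__", None),
        "path": list(getattr(module, "__path__", [])),
        "missing": [attr for attr in required_attrs if not hasattr(module, attr)],
    }


def main():
    # The two attribute names come from the error described in this issue.
    return inspect_module("wmill", ["get_state", "set_state"])
```

Running this on a worker that fails and on one that succeeds, then comparing the `file`, `path`, and `missing` fields, would show whether the two workers resolve `wmill` to the same location.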

# diff docker-compose-orig.yml docker-compose.yml 
8c8
<     image: postgres:16
---
>     image: docker.io/library/postgres:16
16c16
<       - 5432:5432
---
>       - "5432:5432"
109c109
<     image: ghcr.io/windmill-labs/windmill-lsp:latest
---
>     image: ghcr.io/windmill-labs/windmill-lsp:1.2980.0
118c118
<     image: ghcr.io/windmill-labs/windmill-multiplayer:latest
---
>     image: ghcr.io/windmill-labs/windmill-multiplayer:1.298.0
126c126
<     image: caddy:2.5.2-alpine
---
>     image: docker.io/library/caddy:2.5.2-alpine
135c135
<       - 80:80
---
>       - "8000:80"

Environment details:

# podman-compose -v
podman-compose version: 1.0.6
['podman', '--version', '']
using podman version: 4.2.0
podman-compose version 1.0.6
podman --version 
podman version 4.2.0
exit code: 0
# cat /etc/redhat-release 
Rocky Linux release 8.6 (Green Obsidian)
# uname -a
Linux host.domain 4.18.0-372.9.1.el8.x86_64 #1 SMP Tue May 10 14:48:47 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

To reproduce

  1. Start the docker compose setup using podman-compose.
  2. Create the example Python script (the one that prints "Hello World and a warm welcome especially to {name}") and run it several times.

Expected behavior

script executes every time without errors

Screenshots

Test run that fails: [image]

Next test run, which succeeds: [image]

Browser information

Chrome Version 122.0.6261.111 (Official Build) (64-bit) on Linux Mint 20

Application version

Windmill version (backend): v1.298.0

Additional Context

Restarting worker and worker_native containers does not change the behavior.

Girgitt commented 8 months ago

When I edited the description I might have assigned @rubenfiszel by accident, and I have no way to revert this. Sorry for my sloppiness if the assignment was not made by automation that handles new issues.

rubenfiszel commented 8 months ago

You should look into the folders starting with /tmp/windmill/cache/pip/wmill== in the workers of the runs that fail, to see if a faulty wmill has been cached there. Also, try to reproduce without podman, using plain docker-compose, to check whether this is a podman issue.

Girgitt commented 8 months ago

I will test with docker. I also checked the cache/pip folders, but the working and failing workers have the same content.

The windmill_worker_N containers have proper content in /tmp/windmill/cache/pip/:

root@e4f5e53e5dda:/usr/src/app# ls /tmp/windmill/cache/pip/wmill\=\=1.298.0/
wmill  wmill-1.298.0.dist-info

The windmill_worker_native_N containers have an empty /tmp/windmill/cache/pip/.

Runs work fine on worker_1 and always fail on the same workers (worker_2, worker_3).

Is there a command to clear the pip cache and re-install the dependencies of a script on a specific worker or on all workers?
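The directory listing above only shows the top-level entries, so two workers could list the same names while differing in file contents or readability. A rough sketch for comparing the cached package in more detail follows; `fingerprint_tree` is a hypothetical helper, and the default path is simply the one quoted in this thread.

```python
import hashlib
from pathlib import Path


def fingerprint_tree(root):
    """Map each file under root to its SHA-256 digest, or to a read-error marker."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        try:
            result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
        except OSError as exc:
            # Record files that exist but cannot be read instead of aborting.
            result[rel] = f"unreadable: {exc.__class__.__name__}"
    return result


def main(cache_dir: str = "/tmp/windmill/cache/pip/wmill==1.298.0"):
    # Run on a working and a failing worker, then diff the two results.
    return fingerprint_tree(cache_dir)
```

If the two dictionaries are identical, the cached files themselves are probably not the cause; an `unreadable` entry on only one worker would instead point toward how the shared volume is mounted into that container.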

Girgitt commented 7 months ago

The original compose config run with docker-compose does not experience this issue.

Girgitt commented 7 months ago

With podman-compose, it looks like the first worker (the one that installs the dependencies) is able to execute the script, but the others fail despite having the pip cache available on the shared volume mapped into each worker container.

However, after restarting the whole compose setup, all workers fail to execute the Python script.

Only after the pip cache is deleted from the mapped volume does a run succeed again: on the next run of the script, the first worker that runs it and re-installs the dependencies is able to execute it, while the other workers still always fail.
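The manual workaround described above (deleting the cached package from the mapped volume) could be scripted roughly as follows. This is only a sketch under assumptions: `clear_cached_package` is a hypothetical helper and not a Windmill command, and it assumes the `<name>==<version>` directory layout seen under /tmp/windmill/cache/pip/ earlier in this thread. It defaults to a dry run so that nothing is deleted unless explicitly requested.

```python
import shutil
from pathlib import Path


def clear_cached_package(cache_root, package, dry_run=True):
    """List (and optionally delete) cache directories named '<package>==<version>'."""
    matches = sorted(
        p for p in Path(cache_root).glob(f"{package}==*") if p.is_dir()
    )
    if not dry_run:
        for path in matches:
            shutil.rmtree(path)
    return [str(p) for p in matches]


if __name__ == "__main__":
    # Dry run: only prints what would be removed.
    print(clear_cached_package("/tmp/windmill/cache/pip", "wmill"))
```

Since the report indicates that only the worker that re-installs the dependencies succeeds afterwards, this clears the symptom for one worker at most and does not address the underlying cause.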