osgrwade opened 1 month ago
Just as a side note: we are experiencing this same behavior in all of our environments, even production. (I posted this on Slack, too.) We upgraded to Python 3.11 and swapped Polars 0.19.5 for polars-lts-cpu, but we are still getting a hung dagster-daemon process. I can reproduce this over and over.
Dagster version
1.7.14
What's the issue?
When I run locally I am running two processes: `dagster-webserver <with various options>` and `dagster-daemon run 1> daemon.out 2> daemon2.out`.
Once my local UI is running, I start monitoring processes on my machine (where I have 10 code locations loaded into Dagster) and I see that:

dagster-webserver spawns 10 processes that start like this: `...python -m dagster api graph --lazy-load-user-code --socket <a temp file path> --heartbeat --heartbeat-timeout 45 ...`

dagster-daemon spawns 10 processes that start like this: `...python -m dagster api graph --lazy-load-user-code --socket <a temp file path> --heartbeat --heartbeat-timeout 20 ...`
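A quick way to watch the spawned subprocess count from a shell (this command is illustrative and not from the original report; the `[d]agster` pattern just keeps grep from matching its own process entry):

```shell
# Count the per-code-location subprocesses currently running.
# "[d]agster api" matches "dagster api" without matching this grep itself;
# "|| true" keeps the assignment from failing when the count is zero.
count=$(ps -ef | grep -c '[d]agster api' || true)
echo "spawned 'dagster api' processes: $count"
```

Re-running this every minute or so is enough to see the point at which the daemon stops spawning new processes.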
At some point (I have seen this happen at 18 minutes and also at 3 minutes) the dagster-daemon process stops spawning off processes.
The contents of the daemon.out and daemon2.out files are:

daemon.out: `<timestamp> dagster.daemon ... Instance is configured with ... 'SensorDaemon']`

daemon2.out: `<nothing appears in the file>`
What did you expect to happen?
dagster-daemon to continue spawning processes normally
How to reproduce?
When I run with one code location, I do not see any problems. When I run with anywhere from 5-10 code locations, I see the behavior described above.
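A minimal sketch of a multi-location setup for reproducing this (the file names, asset names, and location count below are illustrative assumptions, not taken from the original report):

```shell
# Generate five trivial code locations, each in its own Python file.
mkdir -p repro
for i in 1 2 3 4 5; do
  cat > "repro/defs_$i.py" <<EOF
from dagster import Definitions, asset

@asset
def asset_$i():
    return $i

defs = Definitions(assets=[asset_$i])
EOF
done

# Build a workspace.yaml that points at all of the generated locations,
# so both dagster-webserver and dagster-daemon load them.
{
  echo "load_from:"
  for i in 1 2 3 4 5; do
    echo "  - python_file: repro/defs_$i.py"
  done
} > workspace.yaml

# Then, in separate terminals:
#   dagster-webserver -w workspace.yaml
#   dagster-daemon run -w workspace.yaml 1> daemon.out 2> daemon2.out
```

This script only writes the files; the commented commands at the end start the two processes against the generated workspace.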
Deployment type
Local
Deployment details
We are seeing this in all our environments, but I am able to reproduce this when I run locally as well.
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.