dora-rs / dora

DORA (Dataflow-Oriented Robotic Application) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low-latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.
https://dora-rs.ai
Apache License 2.0
1.36k stars, 69 forks

MacOS/Windows bug with Daemon/Coordinator on 0.3.5rc0 #575

Open Hennzau opened 1 week ago

Hennzau commented 1 week ago

Describe the bug

When I launch a dataflow that contains a dynamic node, I can't stop it (CTRL+C if attached, dora stop if detached). Neither the coordinator nor the daemon crashes; the stop command just never completes.

To Reproduce

Steps to reproduce the behavior:

  1. Start the dora daemon: dora up
  2. Start a new dataflow: dora start dataflow.yaml (optional: --detach)
  3. Stop the dataflow: dora stop / CTRL+C. This step blocks until I do step 4.
  4. Destroy the dataflow: dora destroy
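
The steps above can be sketched as a small script that puts a timeout on each command, so a blocking dora stop is reported instead of hanging the terminal. This is only a sketch under assumptions: the four commands are copied from the steps above, while the helper names (run_with_timeout, reproduce), the timeout values, and the status strings are hypothetical. It also assumes dora stop runs without further input; if it needs extra arguments or prompts on your setup, adjust the command list.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'ok', 'failed', 'hung' or 'missing'.

    Output is left attached to the terminal on purpose: capturing it through
    pipes could block if a background process inherits them.
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "hung"
    except FileNotFoundError:
        # The executable is not on PATH.
        return "missing"
    return "ok" if result.returncode == 0 else "failed"


def reproduce(dataflow="dataflow.yaml", stop_timeout_s=30.0, run=run_with_timeout):
    """Walk through steps 1-4 and record the outcome of each one."""
    results = {}
    results["up"] = run(["dora", "up"], 60.0)
    results["start"] = run(["dora", "start", dataflow, "--detach"], 60.0)
    results["stop"] = run(["dora", "stop"], stop_timeout_s)
    if results["stop"] == "hung":
        # Step 4: the only way out reported in this issue.
        results["destroy"] = run(["dora", "destroy"], 60.0)
    return results
```

Calling reproduce() returns a dict with one status per step; a "hung" status for the stop step would match the behaviour described in this report. The run parameter exists so the sequence can be exercised with a stand-in runner on a machine without dora installed.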

On Windows, I can only say that dora stop blocks, because CTRL+C in attached mode always makes the coordinator crash.

Environments (please complete the following information):

Hennzau commented 1 week ago

On Windows, I get this error in the dora-coordinator output file:

2024-07-01T19:56:46.554799Z ERROR dora_coordinator::control: failed to send reply
2024-07-01T19:56:46.554844Z ERROR dora_coordinator::control: failed to send reply
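
When comparing runs (for example before and after a candidate fix), it can help to pull such errors out of the coordinator output file automatically. The sketch below assumes every line follows the "timestamp LEVEL target: message" layout seen in the two lines above; the names LOG_LINE, LogEntry, parse_line, and errors_from are hypothetical, and the sample text is just those two lines repeated.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: "<timestamp> <LEVEL> <target>: <message>", as in the lines above.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"(?P<level>TRACE|DEBUG|INFO|WARN|ERROR)\s+"
    r"(?P<target>[\w:]+):\s+"
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    target: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not match the assumed layout."""
    match = LOG_LINE.match(line.strip())
    return LogEntry(**match.groupdict()) if match else None


def errors_from(lines, target_prefix="dora_coordinator"):
    """Keep only ERROR entries whose target starts with target_prefix."""
    entries = (parse_line(line) for line in lines)
    return [
        e for e in entries
        if e is not None and e.level == "ERROR" and e.target.startswith(target_prefix)
    ]


SAMPLE = """\
2024-07-01T19:56:46.554799Z ERROR dora_coordinator::control: failed to send reply
2024-07-01T19:56:46.554844Z ERROR dora_coordinator::control: failed to send reply
"""

for entry in errors_from(SAMPLE.splitlines()):
    print(entry.timestamp, entry.target, entry.message)
```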

haixuanTao commented 1 week ago

I'm unable to reproduce. I get the following behaviour:

https://github.com/dora-rs/dora/assets/22787340/c2a080dd-0295-4a5c-babb-90e7dfc22c90

Are you saying that the dynamic node is not stopping? Or that the dataflow is not stopping?

P.S.: Sorry, the screen recording only captured half of the screen.

Hennzau commented 1 week ago

Hi, it's the dataflow that is not stopping. Even if I don't run the dynamic node, I'm unable to stop the dataflow. Here is a recording:

https://github.com/dora-rs/dora/assets/72349109/3f1e67f0-77e2-45bd-9222-1c56a88e0a9a

haixuanTao commented 1 week ago

So I think you might need to start the dynamic node before stopping the dataflow. If the dynamic node is never started, I think the dataflow never finishes starting, and that makes it impossible to stop.

I think we should fix this, but this might be the cause of your error.

phil-opp commented 13 hours ago

I opened https://github.com/dora-rs/dora/pull/583 with a potential fix. Perhaps you could give it a try if you have time, @Hennzau?