Closed Michael-J-Ward closed 2 months ago
Rebased off main and applied the appropriate rustfmt
to the commits.
Multi-daemon example breaks on Windows but not on Linux.
2024-04-16T16:10:17.406865Z WARN dora_daemon::node_communication: failed to send event to daemon
failed to run dora-daemon: failed to forward output to remote receivers: failed to connect to machine `B`: failed to connect: The requested address is not valid in its context. (os error 10049)
...
2024-04-16T16:10:17.412084Z ERROR dora_coordinator::listener: Os { code: 10054, kind: ConnectionReset, message: "An existing connection was forcibly closed by the remote host." }
at binaries\coordinator\src\listener.rs:33
@phil-opp did you update those logs in an attempt to debug?
Ahh - 0.0.0.0
is not connectable by clients on windows, whereas linux
routes it to your localhost.
Fixing.
Daemon::run
uses its configuable address to construct register its listen_socket
with the coordinator in coordinator_messages::CoordinatorRequest::Register
.
If a window's users configures that to 127.0.0.1
or <vpn-ip>
, that works fine.
But if a Window's users configures it to 0.0.0.0
, then the daemon listen_socket
can't actually be connected to as an InterDaemonConnection
.
@phil-opp did you update those logs in an attempt to debug?
Yes, but I think it's also an improvement that we might want to keep.
Daemon::run
uses its configuable address to construct register itslisten_socket
with the coordinator incoordinator_messages::CoordinatorRequest::Register
.If a window's users configures that to
127.0.0.1
or<vpn-ip>
, that works fine.But if a Window's users configures it to
0.0.0.0
, then the daemonlisten_socket
can't actually be connected to as anInterDaemonConnection
.
Thanks a lot for debugging this! This is indeed the issue. However, I think the correct fix is to use the public IP instead of the local bind address for the registration. One way to get the public IP of the daemon could be to take the connection.peer_addr().ip()
of the incoming connection. The Register
message then only needs to tell us the (public) port number, on which the daemon listens for incoming inter-daemon connections.
I opened a PR against your PR to implement this change: https://github.com/Michael-J-Ward/dora/pull/1. I also tested this change on our CI in https://github.com/dora-rs/dora/pull/473.
Ready to go.
One follow-on item raised by @haixuanTao: a proper "distributed-test" example where the daemon's run separately.
Changelog
0.0.0.0:0
0.0.0.0:DORA_COORDINATOR_PORT_DEFAULT
The CLI provides the defaults, so this shouldn't break any existing run commands.
Notes
~
cargo fmt --all
makes a lot of unrelated changes. Do you have arustfmt
config somewhere that I can use? Similarly,cargo clippy
gives a fair number of lints.~~If you'd like, let me know if you want me to include those - either in addition to this or as a separate PR.~
EDIT: I realized my own
rustfmt
was running off a custom config.Ref #459