temporalio / cli

Command-line interface for running Temporal Server and interacting with Workflows, Activities, Namespaces, and other parts of Temporal
https://docs.temporal.io/cli
MIT License

[Bug] dev server panics when binding to 0.0.0.0 in docker container #595

Closed cawalch closed 5 days ago

cawalch commented 3 weeks ago

What are you really trying to do?

Bind the temporal dev server on 0.0.0.0

Describe the bug

When starting the temporal dev server inside a Debian Docker container (provided in the repo), a panic occurs when it attempts to bind to 0.0.0.0.

This may have been related to https://github.com/temporalio/cli/pull/564 since the panic did not occur in the previous version.

root@e2ca59546f42:/app# ./temporal server start-dev --ip 0.0.0.0
panic: failed assigning ephemeral port: failed to assign a free port: dial tcp [::]:34879: connect: cannot assign requested address

goroutine 1 [running]:
github.com/temporalio/cli/temporalcli/devserver.MustGetFreePort({0xffffc47a6f50?, 0x0?})
        /app/temporalcli/devserver/freeport.go:81 +0x6c
github.com/temporalio/cli/temporalcli.(*TemporalServerStartDevCommand).run(0x400021b008, 0x40000f0460, {0x40008559a8?, 0x0?, 0x0?})
        /app/temporalcli/commands.server.go:103 +0x918
github.com/temporalio/cli/temporalcli.NewTemporalServerStartDevCommand.func1(0x400021b010?, {0x4000608e40?, 0x4?, 0x21ed184?})
        /app/temporalcli/commands.gen.go:1282 +0x3c
github.com/spf13/cobra.(*Command).execute(0x400021b010, {0x4000608e20, 0x2, 0x2})
        /go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987 +0x828
github.com/spf13/cobra.(*Command).ExecuteC(0x4000213c08)
        /go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x344
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1032
github.com/temporalio/cli/temporalcli.Execute({0x2e05768?, 0x400061f4a0?}, {{0x0, 0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, 0x0, ...})
        /app/temporalcli/commands.go:323 +0xec
main.main()
        /app/cmd/temporal/main.go:15 +0x64

Minimal Reproduction

docker run -it docker.io/library/temporal-cli-test server start-dev --ip 0.0.0.0

Environment/Versions

Model Name: MacBook Pro
Model Identifier: Mac15,11
Model Number: MRW33LL/A
Chip: Apple M3 Max
Total Number of Cores: 14 (10 performance and 4 efficiency)
Memory: 36 GB
System Firmware Version: 10151.101.3
OS Loader Version: 10151.101.3

cretz commented 3 weeks ago

👍 We had to make that change since Linux can sometimes reuse ports nowadays, but as part of forcing the port to not get reassigned we have to open a socket to it to put it in a certain state. I would have expected this to work given https://pkg.go.dev/net#Dial:

For TCP, UDP and IP networks, if the host is empty or a literal unspecified IP address, as in ":80", "0.0.0.0:80" or "[::]:80" for TCP and UDP, "", "0.0.0.0" or "::" for IP, the local system is assumed.

So I am guessing Go cannot find the "local system" address in this situation. We will investigate...
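
For readers unfamiliar with the approach described in this comment, here is a minimal Go sketch of reserving a port by connecting to it. The name `reservePort` is hypothetical, and the assumption that closing the accepted side first leaves the connection lingering (TIME_WAIT on Linux) is mine; this is not the CLI's actual code.

```go
package main

import (
	"fmt"
	"net"
)

// reservePort asks the OS for a free TCP port on host, then connects to it
// once so the port is less likely to be handed out again right away.
// Hypothetical helper for illustration only.
func reservePort(host string) (int, error) {
	// Port 0 lets the OS pick a free ephemeral port.
	l, err := net.Listen("tcp", net.JoinHostPort(host, "0"))
	if err != nil {
		return 0, fmt.Errorf("failed to listen: %w", err)
	}
	defer l.Close()
	port := l.Addr().(*net.TCPAddr).Port

	// Dial the address the listener reports. In the report above this is
	// the step that fails, because that address is [::]:<port>.
	r, err := net.DialTCP("tcp", nil, l.Addr().(*net.TCPAddr))
	if err != nil {
		return 0, fmt.Errorf("failed to reserve port %d: %w", port, err)
	}
	defer r.Close()

	c, err := l.Accept()
	if err != nil {
		return 0, fmt.Errorf("failed to accept on port %d: %w", port, err)
	}
	// Close the accepted (server-side) end before the dialing end.
	_ = c.Close()
	return port, nil
}
```

Calling `reservePort("127.0.0.1")` should return a port the caller can then bind explicitly; whether the same call succeeds with `"0.0.0.0"` depends on the environment, which is the point of this issue.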

cawalch commented 3 weeks ago

diff --git a/temporalcli/devserver/freeport.go b/temporalcli/devserver/freeport.go
index 9acca84..b360c57 100644
--- a/temporalcli/devserver/freeport.go
+++ b/temporalcli/devserver/freeport.go
@@ -44,7 +44,11 @@ func GetFreePort(host string) (int, error) {
        // ports, making an extra connection just to reserve a port might actually
        // be harmful (by hastening ephemeral port exhaustion).
        if runtime.GOOS != "darwin" && runtime.GOOS != "windows" {
-               r, err := net.DialTCP("tcp", nil, l.Addr().(*net.TCPAddr))
+               tcpAddr, err := net.ResolveTCPAddr("tcp", fmt.Sprintf("%s:%d", host, port))
+               if err != nil {
+                       return 0, fmt.Errorf("error resolving address: %w", err)
+               }
+               r, err := net.DialTCP("tcp", nil, tcpAddr)
                if err != nil {
                        return 0, fmt.Errorf("failed to assign a free port: %v", err)
                }

Yeah, I was able to get it working locally with this change (not a ton of testing though).
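
A related variation, sketched here under assumptions and not what the diff above does: instead of dialing the unspecified address itself, the dial target could be swapped for the loopback address of the same family. `dialTarget` is a hypothetical helper name, and whether this behaves better than the diff in the failing container is untested.

```go
package main

import (
	"net"
	"strconv"
)

// dialTarget resolves host:port and, if host is an unspecified address
// (0.0.0.0 or ::), substitutes the loopback address of the same family.
// Hypothetical helper; any other host is returned as resolved.
func dialTarget(host string, port int) (*net.TCPAddr, error) {
	addr, err := net.ResolveTCPAddr("tcp", net.JoinHostPort(host, strconv.Itoa(port)))
	if err != nil {
		return nil, err
	}
	if addr.IP.IsUnspecified() {
		if addr.IP.To4() != nil {
			addr.IP = net.IPv4(127, 0, 0, 1) // 0.0.0.0 -> 127.0.0.1
		} else {
			addr.IP = net.IPv6loopback // :: -> ::1
		}
	}
	return addr, nil
}
```

The result would be passed to `net.DialTCP("tcp", nil, addr)` in place of `l.Addr()`, so an IPv4 bind never ends up dialing an IPv6 address.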

mjameswh commented 2 weeks ago

@cawalch Can you please confirm you are able to reproduce this issue while running Docker on your Mac?

I totally understand the theory behind this error, but for some reason I'm struggling to reproduce it on either Mac or Linux, and I need to reproduce it to validate a fix. If you don't mind sharing some details about your Docker/network setup, that could help me get there more efficiently. E.g.: is that with Docker Desktop, a Brew installation, or something else? What does your Docker daemon.json file contain that relates to networking/IPv6? Etc. Just give me anything that you feel could be pertinent. Thanks.
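
One quick way to gather that kind of detail is a small probe run inside the container. The sketch below assumes the failure is tied to IPv6 loopback being unavailable there, which this thread suspects but has not confirmed; `canListen` and `loopbackSupport` are hypothetical names.

```go
package main

import "net"

// canListen reports whether a TCP listener can be opened on addr
// (in host:port form). The listener is closed immediately.
func canListen(addr string) bool {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return false
	}
	_ = l.Close()
	return true
}

// loopbackSupport reports whether the IPv4 and IPv6 loopback addresses
// accept a listener on an OS-chosen port.
func loopbackSupport() (ipv4, ipv6 bool) {
	return canListen("127.0.0.1:0"), canListen("[::1]:0")
}
```

If `loopbackSupport()` reports IPv4 as available and IPv6 as unavailable inside the failing container, that would be consistent with the `dial tcp [::]:34879` error in the panic.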

cawalch commented 2 weeks ago

@mjameswh This appears to be an issue only for our devs running Rancher Desktop (macOS). I wasn't able to reproduce it on my personal Mac running Docker Desktop.

Rancher Desktop Versions

Container Engine moby

mjameswh commented 2 weeks ago

Thank you very much, @cawalch. That's very useful.

mjameswh commented 2 weeks ago

I was able to reproduce the issue on a Mac M2 with no IPv6 connection, using the following procedure:

  1. Install Rancher Desktop (don't know if this is actually required);
  2. Rancher Desktop Prefs:
    • Required:
        • Enable Kubernetes: yes
    • The following options are not making a difference:
        • Emulation: QEMU or VZ
        • Kubernetes/Enable Traefik: on or off
    • Don't know if these make a difference:
        • User socket-vmnet: not checked
        • Container engine: dockerd (moby)

I have not found a way to reproduce with Docker Desktop, even with Kubernetes enabled.