orbstack / orbstack

Fast, light, simple Docker containers & Linux machines
https://orbstack.dev
MIT License

v17 - Possible networking issue #576

Closed michaelaguiar closed 1 year ago

michaelaguiar commented 1 year ago

Describe the bug

When I updated to version 17 of OrbStack, macOS was no longer able to communicate with my dnsmasq container (host port 5354 -> container port 53). It still works on version 16.

Are there any known issues with container to host communication? Could this have something to do with the security improvements?

To Reproduce

No response

Expected behavior

No response

Diagnostic report (required)

No response

Screenshots and additional context (optional)

No response

kekexiaoai commented 1 year ago

https://cdn-updates.orbstack.dev/arm64/OrbStack_v0.16.1_15815_arm64.dmg

kdrag0n commented 1 year ago

Not that I know of. Can you share a reproducer? How are you running the server, and how are you connecting to it from Mac? Port forward, domain, or IP?

ylor commented 1 year ago

I have also lost connectivity to containers that sit behind a wireguard container in v0.17. I use port forwarding to access them. On an M2 Mac mini, if it's relevant.

Downgrading to v0.16.1 resolved the issue so it's definitely a regression/bug with v0.17

gordalina commented 1 year ago

Similar? networking issue here.

I wasn't able to connect to a webserver running in a container via Chrome, but was able via curl. Reverting to v16 resolved the issue.

kdrag0n commented 1 year ago

It'd be really helpful if someone could share a reproducer, plus a diagnostic report from Help -> Collect Diagnostics and maybe the details of the failed request in the Chrome devtools. I haven't been able to reproduce this so far. Tried CoreDNS, Next.js, and Vite servers via port forwards, domains, and IPs.

andrea-sdl commented 1 year ago

@kdrag0n I don't know if this is related, but one thing I noticed in v17 is that the container IP ranges weren't honored anymore. I have set them up to work on a different subnet, but after v17 they went back to 192.168....

Reverting to v16 fixes it.

gordalina commented 1 year ago

@kdrag0n I don't know if this is related, but one thing I noticed in v17 is that the container IP ranges weren't honored anymore. I have set them up to work on a different subnet, but after v17 they went back to 192.168....

I have a docker compose file with two services, both of them using the following network:

networks:
  inet6:
    enable_ipv6: true
    ipam:
      config:
        - subnet: 2001:db8:a::/64
          gateway: 2001:db8:a::1

kdrag0n commented 1 year ago

@andrea-sdl Where did the IPs change? The macOS or Linux side? How did you configure it? bip and default-address-pools handling should not be any different now.

andrea-sdl commented 1 year ago

@kdrag0n I configured it from the OrbStack macOS interface, in the Docker tab. I have IPv6 disabled and this config in the "Advanced engine config":

{
  "default-address-pools" : [
    {
      "base" : "172.10.0.1/16",
      "size" : 24
    }
  ],
  "bip" : "172.10.0.1/24"
}

I noticed it because I have a log that prints the caller IP (in my case, my local Insomnia client calling the API). Before v17 it printed 172...; with v17 it prints a different IP range.
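
One way to check whether a logged caller IP still falls inside the configured ranges is to compare it against the `bip` and `default-address-pools` values. The sketch below is illustrative only: the helper name `ip_in_configured_ranges` is hypothetical, the config is the JSON posted above, and `192.168.215.1` is an arbitrary example address outside that range, not a value taken from this thread.

```python
import ipaddress
import json

# The "Advanced engine config" posted above, reused as-is.
ENGINE_CONFIG = json.loads("""
{
  "default-address-pools": [
    {"base": "172.10.0.1/16", "size": 24}
  ],
  "bip": "172.10.0.1/24"
}
""")


def ip_in_configured_ranges(config: dict, ip: str) -> bool:
    """Return True if `ip` is inside `bip` or any default address pool.

    Hypothetical helper for illustration; not part of Docker or OrbStack.
    """
    addr = ipaddress.ip_address(ip)
    cidrs = [pool["base"] for pool in config.get("default-address-pools", [])]
    if "bip" in config:
        cidrs.append(config["bip"])
    # strict=False because the values above carry host bits (e.g. .1/16).
    return any(addr in ipaddress.ip_network(c, strict=False) for c in cidrs)


if __name__ == "__main__":
    # The second address is an arbitrary example outside the configured range.
    for caller in ("172.10.0.5", "192.168.215.1"):
        print(caller, ip_in_configured_ranges(ENGINE_CONFIG, caller))
```

Running this against the caller IPs from the service log would show at a glance whether the engine config is still being honored.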

michaelaguiar commented 1 year ago

@kdrag0n Sorry just got back to my desk.

Here is an example I just set up:

docker run --rm -p 5354:53/udp -e COMMANDS="--address=/test/127.0.0.1 --server=1.1.1.1 --server=8.8.8.8 --log-facility=-" ghcr.io/aliasproject/dns-test

This will create a DNS server (DNSMasq) container and forward all .test domains to its local IP. If you start this container, then add the following to /etc/resolver/test, you should be able to ping testing.test or any .test domain and get a response, on v16.

/etc/resolver/test

port 5354
nameserver 127.0.0.1

On v17, for some reason it's not able to communicate with the container and I'm not sure I understand why at the moment. Very curious what the issue could be though!
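
To separate "the resolver config is wrong" from "the forwarded UDP port is unreachable", it can help to send a single DNS query straight to the published port. The following is a minimal sketch, assuming the container from the reproducer above is listening on 127.0.0.1:5354; the function names `build_dns_query` and `probe_dns` are hypothetical, and only the Python standard library is used.

```python
import socket
import struct


def build_dns_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (recursion desired)."""
    # Header: id, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def probe_dns(host: str, port: int, name: str, timeout: float = 2.0) -> bool:
    """Send one UDP query and report whether a matching reply came back."""
    query = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (host, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and "port unreachable" errors
            return False
    # A reply echoes the transaction id and has the QR (response) bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


if __name__ == "__main__":
    # Port 5354 and the name come from the reproducer above.
    print("reply received:", probe_dns("127.0.0.1", 5354, "testing.test"))
```

If this reports no reply while the same query works from inside the Docker network, the problem is on the host-to-container path rather than in dnsmasq itself.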

-- I love OrbStack and think you are doing a great job btw!

dangh commented 1 year ago

Maybe related to this issue. I'm running OpenVPN and tinyproxy inside a container to connect to our work server via a HTTP proxy. In v16 I can use the proxy server as http://localhost:8888, but on v17 I have to use http://myproxy.orb.local:8888.

The weird thing is that Chrome can still connect to http://localhost:8888, but Firefox and Node.js can't.

kdrag0n commented 1 year ago

I see the problem: networking works, but some combinations of IPv4 and IPv6 clients, servers, and settings can cause the source IP to be incorrect. If your service cares about the source IP, then this can cause issues if your network setup is affected. This is also why some clients are affected and some aren't. Thanks for the reproducer @michaelaguiar!
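
For anyone who wants to check whether their own setup is affected, a service that reports the source address it sees makes the symptom visible. This is a rough sketch under assumptions, not OrbStack's test: the names `serve_peer_echo` and `observed_source` are hypothetical, and the demo runs entirely on loopback. In a real check the server half would run in a container and the client half on the Mac, through a published UDP port.

```python
import socket
import threading


def serve_peer_echo(sock: socket.socket, count: int = 1) -> None:
    """Answer `count` datagrams with the sender address as this server sees it."""
    for _ in range(count):
        _, peer = sock.recvfrom(2048)
        sock.sendto(f"{peer[0]}:{peer[1]}".encode(), peer)


def observed_source(server_addr: tuple, timeout: float = 2.0) -> str:
    """Ask the echo server which source address our datagram arrived from."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(b"whoami", server_addr)
        reply, _ = client.recvfrom(2048)
    return reply.decode()


if __name__ == "__main__":
    # Loopback demo only; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    threading.Thread(target=serve_peer_echo, args=(server,), daemon=True).start()
    print("server saw:", observed_source(server.getsockname()))
    server.close()
```

A service with source-based access rules (such as an ACL limited to the bridge gateway) would silently drop requests if the address printed here is not the one it expects.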

Fixed for the next version. Everyone, please confirm that it's fixed in this experimental build, in case there are multiple bugs in this thread: https://cdn-updates.orbstack.dev/exp/OrbStack_v0.17.0-21-g0cdc992ff_15967_arm64.dmg

(Apple Silicon only)

michaelaguiar commented 1 year ago

@kdrag0n My issue is resolved! My DNS container is now working properly as it did on v16.x.

Thanks for the quick solution!

kdrag0n commented 1 year ago

Fix released in v0.17.1.

gordalina commented 1 year ago

@kdrag0n my issue wasn't fixed. Chrome did not have any useful info, it only shows the HTTP request, not a response. I did configure docker to have ipv6 (in orbstack's settings)

kdrag0n commented 1 year ago

@gordalina Please share a reproducer if possible. That will make this go much faster.

This works fine for me, on both localhost and ng.project.orb.local:

services:
  ng:
    image: nginx
    networks:
      - inet6
    ports:
      - 80:80

networks:
  inet6:
    enable_ipv6: true
    ipam:
      config:
        - subnet: 2001:db8:a::/64
          gateway: 2001:db8:a::1

Also, the full output of both curl -v -4 localhost and curl -v -6 localhost would be helpful.
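
The same IPv4-versus-IPv6 comparison can be scripted. The sketch below is a rough stand-in for the two curl commands, assuming a server is published on localhost port 80 as in the Compose file above; `probe_families` is a hypothetical helper name, and it only tests whether a TCP connection can be opened, not whether HTTP works.

```python
import socket

FAMILIES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def probe_families(host: str, port: int, timeout: float = 2.0) -> dict:
    """Try a TCP connect to every address `host` resolves to, per family.

    Returns a dict keyed by "IPv4"/"IPv6"; a family is True if at least
    one of its addresses accepted the connection.
    """
    results = {}
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return results  # name did not resolve at all
    for family, socktype, proto, _, sockaddr in infos:
        label = FAMILIES.get(family)
        if label is None or results.get(label):
            continue  # unknown family, or this family already succeeded
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results[label] = True
        except OSError:
            results[label] = False
    return results


if __name__ == "__main__":
    # Port 80 matches the "80:80" mapping in the Compose file above.
    print(probe_families("localhost", 80))
```

A result where one family connects and the other does not would explain why some clients work and others fail against the same URL, since clients differ in which address family they try first.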

boywijnmaalen commented 1 year ago

Same for me: I can no longer connect to port 53 from the host to my DNS container. All indications are that connecting is not the problem (the port is definitely open), but no DNS requests seem to arrive at the DNS container.

the same DNS queries are however resolvable from within the docker network (so container A is able to resolve domains using the same DNS container)

earlier versions of OrbStack also had this issue, for a time it was resolved, issue seems to be back. back then, I created a very similar ticket to the thing I'm seeing now.

edit: using 17.1

https://github.com/orbstack/orbstack/issues/188

kdrag0n commented 1 year ago

I think everything should be fixed now. Please try this build and let me know regardless of the result: https://cdn-updates.orbstack.dev/exp/OrbStack_v0.17.1-23-gd03a7fb2f_15992_arm64.dmg

(Apple Silicon only)

If it's fixed, it'd still be great if someone could share a way to reproduce the issue (other than the one that was fixed in v0.17.1) so that it can serve as a regression test in the future. I'm still not able to reproduce most of these issues. Thanks!

calebcoverdale commented 1 year ago

I ran the uploaded build and the issue appears to persist.

I am running Homarr on my M2 MacBook Air.

http://localhost:7575 works in Safari, however http://homarr.homarr.orb.local:7575 and http://homarr.homarr.orb.local do not.

I loaded the pages in Edge, and all three worked.

I tried doing an inspect in Safari for network and nothing shows up, the page eventually times out and outputs:

Safari Can't Open the Page
Safari can't open the page "homarr.homarr.orb.local" because the server where this
page is located isn't responding.

Safari Version 17.0 (19616.1.27.111.16), Edge Version 116.0.1938.69 (Official build) (arm64)

kdrag0n commented 1 year ago

@calebcoverdale Are you sure you're on the v0.17.1 (15992) build? Can you share a diagnostic report from Help -> Collect Diagnostics?

That build works fine for me in Safari:

Screenshot

calebcoverdale commented 1 year ago

OrbStack info:
Version: 0.17.1
Commit: d03a7fb2f8d3b44dcc832e9d68a2fafb9eee7567 (v0.17.1-23-gd03a7fb2f-dirty)

System info:
macOS: 14.0 (23A5337a)
CPU: arm64, 8 cores
CPU model: Apple M2

Emailed you the full report (from your GitHub profile)

calebcoverdale commented 1 year ago

Private browsing, different user profile. Same result. Disabled all extensions, issue persists.

Ping:

ping homarr.homarr.orb.local
PING homarr.homarr.orb.local (192.168.228.2): 56 data bytes
64 bytes from 192.168.228.2: icmp_seq=0 ttl=63 time=1.430 ms
64 bytes from 192.168.228.2: icmp_seq=1 ttl=63 time=1.475 ms
64 bytes from 192.168.228.2: icmp_seq=2 ttl=63 time=1.011 ms
^C
--- homarr.homarr.orb.local ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 1.011/1.305/1.475/0.209 ms

So it's resolving the name. I was thinking maybe I was messing up DNS, but I guess Edge working cancels out that logic..

This was my first full work day that I wasn't cursing out Docker for crashing, or having strangeness in builds. Thank you!

boywijnmaalen commented 1 year ago

I must be somehow losing my mind, but now I cannot even telnet to my local DNS container anymore (tried both 0.17.1 and the experimental build shared by @kdrag0n).

telnet 127.0.0.1 53 no longer seems to work

the DNS config supplied is a simplified version that worked until release 0.17.0. (I run my own dev.local zone - zone not included in the simplified DNS config)

docker-compose.yml

version: '3.8'

services:
   dns-test:
      image: ubuntu/bind9:9.18-22.04_edge
      container_name: dns-test
      restart: unless-stopped
      volumes:
         - ./dns/named.conf:/etc/bind/named.conf:rw
         - ./dns/named.conf.options:/etc/bind/named.conf.options:rw
      ports:
         - "53:53/udp"
         - "53:53/tcp"
      networks:
         app-network:
            ipv4_address: 172.16.1.2

   container-test :
      image: alpine:latest
      container_name: container-test
      command: tail -f /dev/null
      restart: unless-stopped
      dns:
         - 172.16.1.2
      networks:
         app-network:
            ipv4_address: 172.16.1.3

networks:
   app-network:
      driver: bridge
      enable_ipv6: false
      ipam:
         driver: default
         config:
            - subnet: 172.16.1.0/24
              gateway: 172.16.1.1

./dns/named.conf

acl host {
    172.16.1.1;
};

view "host" {
    match-clients { host; };
};

include "/etc/bind/named.conf.options";

./dns/named.conf.options

options {
    directory "/var/cache/bind";

    listen-on port 53 {
        host;
    };

    allow-recursion {
        host;
    };

    allow-transfer {
        host;
    };

    allow-update {
        host;
    };

    allow-query {
        host;
    };

    forwarders {
        1.1.1.1; # Cloudflare
        8.8.8.8; # Google
    };

    recursion yes;
    auth-nxdomain no; # conform RFC1035
    querylog yes;
    version "not available"; # disable for security
    dnssec-validation yes;
};

kdrag0n commented 1 year ago

@boywijnmaalen Thanks for that. Can you double check the config? nc 127.0.0.1 53 doesn't work on v0.16.1 either because the dns-test container isn't listening on TCP port 53.

kdrag0n commented 1 year ago

@calebcoverdale The issue you're having with Safari is a bug in macOS 14 beta. It seems to be impossible to work around, so I'll report it to Apple and wait for them to fix it.

For now, the only workaround is to use a browser other than Safari.

calebcoverdale commented 1 year ago

Thanks for letting me know!

Let me know if there’s anything I can submit on my end to help out.

athurg commented 1 year ago

Is there a hotfix build for x86_64? I'd be happy to test it.

kdrag0n commented 1 year ago

I have a docker-compose project with multiple docker containers hosted with Orb. I am trying to access one of the containers from the project from another docker container which is running outside docker-compose project.

Let's say NATS server is running at address nats.project.orb.local 4224. The NATS server is accessible from local Mac, but is not accessible from another Docker container. Any idea why?

From local Mac shell:

ping nats.project.orb.local
PING nats.project.orb.local (10.200.0.2): 56 data bytes
64 bytes from 10.200.0.2: icmp_seq=0 ttl=63 time=1.171 ms

From another Docker container outside the project running in Orb stack:

/app $ ping nats.project.orb.local
PING nats.project.orb.local (10.200.0.2): 56 data bytes
^C

telnet on port 4224 (TCP) is not working either.

@robertmircea I think this is the same issue as what @boywijnmaalen is describing here.

The Docker engine intentionally isolates bridge networks and prevents communication between different container bridges, so you will not be able to communicate between services unless they're part of the same Compose project or are both using the default network (i.e. not Compose).

I'll consider changing this behavior, but it's not a bug so I don't think there are still any new network bugs/regressions in this issue. If anyone still has networking issues on v0.17.1, please open a new issue for your specific case.

gordalina commented 1 year ago

@kdrag0n I wasn't listening on an IPv6 address, only IPv4. This worked in 0.16 but not 0.17. It's working now that I listen on an IPv6 address.

kdrag0n commented 1 year ago

@gordalina Thanks for adding that detail. Are you listening on ::, 0.0.0.0, or a specific IPv4/v6 address? A reproducer would be really helpful if possible.
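
The difference between these bind addresses is easy to demonstrate outside of Docker. Below is a minimal loopback sketch, not a reproducer for the OrbStack issue itself; `listen_on` and `can_connect` are hypothetical helper names. It shows that a listener bound to an IPv4 address is not reachable by a client that connects over IPv6.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the literal address succeeds."""
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect((host, port))
        return True
    except OSError:  # refused, timed out, or address family unavailable
        return False


def listen_on(host: str) -> socket.socket:
    """Open a TCP listener on `host` (an IPv4 or IPv6 literal), any free port."""
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 = let the OS pick a free port
    sock.listen(8)
    return sock


if __name__ == "__main__":
    # An IPv4-only listener, then one client per address family.
    with listen_on("127.0.0.1") as srv:
        port = srv.getsockname()[1]
        print("IPv4 client:", can_connect("127.0.0.1", port))
        print("IPv6 client:", can_connect("::1", port))
```

Whether a listener on `::` also accepts IPv4 clients depends on the operating system's dual-stack settings, which is one reason the exact bind address matters when describing a reproducer.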

gordalina commented 1 year ago

@kdrag0n moved my issue to #627 which includes a reproducer.

GuneApp commented 1 year ago

Hi,

The issue is not fully solved for me.

My environment:

MacBook M1 Pro, macOS Ventura 13.3.1, Safari 16.4, Chrome 116.0.5845.110

Container: a very simple PHP 7.4 container created with docker-compose, with custom network

version: "3.4"
name: "dev"
services:
  php7:
    container_name: php7
    image: php:7.4-cli
    working_dir: <HIDDEN>
    entrypoint: /bin/sh
    ports:
      - 8001:8001
      - 8002:8002
      - 8003:8003
      - 8004:8004
    networks:
      - dev-network
    volumes:
      - <HIDDEN>
    tty: true
    extra_hosts:
        - "host.docker.internal:host-gateway"
networks:
  dev-network:
    name: dev-network
    external: true

Started the container and ran a very simple web API with PHP's built-in server on port 8001.

Results

Orbstack 0.16.1 (15815_arm64)

Orbstack 0.17.1 (Aug 31) - The last version updating from inside the app

Maybe Safari is using IPv6, I don't know, but networking doesn't seem to work as it did in the previous version.

Thanks!

athurg commented 1 year ago

Same here, on my iMac running macOS Ventura (13.5.1) on x86_64, with OrbStack version 0.17.1 (15969).

Steps to reproduce:

  1. Run a container with docker run --rm --name nginx nginx:alpine
  2. Test with ping nginx.orb.local or curl -4 nginx.orb.local (forcing curl to use IPv4): both work.
  3. Test with ping6 nginx.orb.local or curl -6 nginx.orb.local (forcing curl to use IPv6): both fail.
  4. Open http://nginx.orb.local in Chrome and Safari: Chrome works, but Safari fails.

feranwq commented 1 year ago

Same error on x86 macOS 12.6.1, OrbStack 0.17.1, IPv4 only.

version: '3.8'
services:
  dnsmasq:
    container_name: zdnsmasq
    image: jpillora/dnsmasq
    restart: always
    ports:
      - "53:53/udp"
      - "5380:8080"
    volumes:
      - ./dnsmasq.conf:/etc/dnsmasq.conf
    environment:
      - HTTP_USER=admin
      - HTTP_PASS=admin
    logging:
      driver: "json-file"
      options:
        max-size: "100m"

dig +time=1 google.com @127.0.0.1

; <<>> DiG 9.10.6 <<>> +time=1 google.com @127.0.0.1
;; global options: +cmd
;; connection timed out; no servers could be reached

dig +time=1 google.com @dnsmasq.dns.orb.local

; <<>> DiG 9.10.6 <<>> +time=1 google.com @dnsmasq.dns.orb.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45131
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com.            IN  A

;; ANSWER SECTION:
google.com.     202 IN  A   172.217.25.14

;; Query time: 51 msec
;; SERVER: 192.168.227.2#53(192.168.227.2)
;; WHEN: Fri Sep 08 16:30:42 CST 2023
;; MSG SIZE  rcvd: 55

kdrag0n commented 1 year ago

@GuneApp What does your custom network look like? docker inspect dev-network

Also please share a diagnostic report from Help -> Collect Diagnostics.

I can't reproduce it with docker network create dev-network.

kdrag0n commented 1 year ago

@athurg I'm not able to reproduce it with your commands either. Please share a diagnostic report from Help -> Collect Diagnostics.

kdrag0n commented 1 year ago

@feranwq Unfortunately your reproducer isn't triggering the issue either; both dig +time=1 google.com @127.0.0.1 and dig +time=1 google.com @dnsmasq.dns.orb.local are working fine for me. Please share a diagnostic report from Help -> Collect Diagnostics.

Note that I removed the volume mount and used the default dnsmasq config instead. A copy of your reproducer's dnsmasq.conf would be helpful.

kdrag0n commented 1 year ago

Sorry everyone, turns out I couldn't reproduce any of this because it was already fixed in my development build. I misremembered the fix as being included in v0.17.1 but it's currently only in experimental builds:

This is already fixed for the next version.

athurg commented 1 year ago

@kdrag0n Which version will include this patch? I have tried v0.17.2, and this issue hasn't been fixed there.

UPDATE:

I've tried the latest v0.17.3, and this bug has been fixed in that version.

boywijnmaalen commented 1 year ago

@kdrag0n

I can also confirm that v0.17.3 fixes my issue (where DNS resolution from the host to the DNS container was no longer working).

GuneApp commented 1 year ago

I can also confirm that v0.17.3 fixes my issue. Thank you!!

BxtGeek commented 1 month ago

It seems this issue is more browser-related. I’m experiencing the same problem, and it’s particularly common in the Brave browser. Whenever I try to open the pod IP, I get an "ERR_ADDRESS_UNREACHABLE" error, as if the browser is unable to resolve or connect to the Kubernetes network. I also attempted to capture a HAR file but didn’t find anything useful. I even tried resetting the browser settings, but that didn’t help. However, when I open the same address in Safari, it works without any problems.

Can the team suggest what else I should check? If there are additional logs to capture, I’d be happy to gather them and help troubleshoot the root cause of this issue.

kdrag0n commented 1 month ago

@BxtGeek That's probably caused by #1452.