Open daniesso opened 8 months ago
I encountered a similar problem:
https://github.com/apache/skywalking-php/actions/runs/7525408993/job/20481714358
docker ps looks fine, but connections are refused.
My CI tests on multiple macOS runners at the same time, and this problem only occurs occasionally.
I'm seeing something similar where the published ports of containers (docker run -p ...) are not reachable:
lsof -i tcp -n shows nothing listening on the published port.
Inside colima ssh, I can see the published ports exposed in the VM (ss -tlnp) and I can actually reach the containers.
colima restart seems to solve the issue.
As @jmjoy mentions, this does not happen 100% of the time, but it is frequent enough to consistently break CI in my use case: I'm running 4 macOS GitHub Actions runners and we've consistently had at least 1 failure out of the 4 runs in the past month alone.
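The host-versus-VM comparison described above can be scripted. The following is a minimal sketch and not part of the original report: has_listener is a hypothetical helper name, and it only assumes that the listing tool prints local addresses ending in :PORT, as both lsof and ss do.

```shell
#!/bin/sh
# has_listener PORT
# Hypothetical helper: reads a listener listing (for example the output of
# `lsof -nP -iTCP -sTCP:LISTEN` or `ss -tln`) on stdin and succeeds if any
# whitespace-separated field ends in ":PORT".
has_listener() {
  awk -v port="$1" '
    { for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example comparison for published port 8000 (needs lsof on the host and a
# running Colima VM, so it is left commented out here):
#   lsof -nP -iTCP -sTCP:LISTEN | has_listener 8000 && echo "host: listening"
#   colima ssh -- ss -tln       | has_listener 8000 && echo "vm: listening"
```

If the VM reports a listener and the host does not, that matches the symptom described here: the container is healthy but the forward to the host is missing.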
edit: Here's a minimal workflow that can reproduce the issue on GitHub Actions. If the issue is reproduced, it spawns a shell on the runner and prints the credentials you need to connect to it:
name: github actions testing
on:
  push:
  workflow_dispatch:
jobs:
  testing:
    runs-on: macos-latest
    steps:
      - name: install tmate
        run: |
          brew install tmate
      - name: Use colima as default docker host on MacOS
        run: |
          brew install docker
          colima start
          ls -la $HOME/.colima/default/docker.sock
          sudo ln -sf $HOME/.colima/default/docker.sock /var/run/docker.sock
          ls -la /var/run/docker.sock
      - name: run container
        run: |
          docker pull python:3.12-slim
          docker run -it -d -p 8000:8000 python:3.12-slim python -m http.server
          curl localhost:8000
      - name: run tmate
        if: failure()
        run: |
          tmate -F
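One thing the workflow above cannot distinguish is a server that has not started listening yet from a port forward that is actually broken, because curl runs immediately after docker run -d. A small retry wrapper is one way to rule out the startup race. This is a sketch under that assumption, and retry is a hypothetical helper name rather than anything provided by Colima or GitHub Actions.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS runs have
# failed, sleeping DELAY_SECONDS between attempts. Returns 0 on the first
# success and 1 if every attempt failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still to come.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# In the "run container" step, the bare curl could become:
#   retry 10 1 curl --silent --fail --output /dev/null localhost:8000
```

If the request still fails after several seconds of retries, a slow container start is an unlikely explanation and the missing port forward remains the main suspect.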
Downgrading to Colima 0.5.6 seems to have fixed this issue for me (I chose version 0.5.6 because my coworkers successfully run this version; I haven't tried any of the versions between 0.5.6 and 0.6.7, so I cannot point to a specific version where a regression in Colima may have occurred).
Same issue. I was setting up an APEX container emulating x86_64 on my M1 MacBook Pro. Everything was running, but I wanted to increase the CPU count of the Colima VM; after restarting, I could no longer access the database or the ORDS website.
~ % colima status
INFO[0000] colima is running using macOS Virtualization.Framework
INFO[0000] arch: x86_64
INFO[0000] runtime: docker
INFO[0000] mountType: virtiofs
INFO[0000] socket: unix:///Users/<username>/.colima/default/docker.sock
I tried restarting the Colima VM and even re-creating it with different specs. I noticed that the Colima VM does not show any value in the ADDRESS column when I run colima ls, even if I include the --network-address flag.
~ % colima ls
PROFILE    STATUS     ARCH       CPUS    MEMORY    DISK     RUNTIME    ADDRESS
default    Running    x86_64     8       12GiB     60GiB    docker
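Spotting the empty ADDRESS column can be automated when several profiles are in use. The sketch below is an illustration only: profiles_without_address is a hypothetical helper name, and it assumes that ADDRESS is the only column that can be blank, so a row with fewer fields than the header is missing its address.

```shell
#!/bin/sh
# profiles_without_address
# Hypothetical helper: reads `colima ls` output on stdin and prints the name
# of every profile whose row has fewer columns than the header line.
# Assumption: only the trailing ADDRESS column can be empty.
profiles_without_address() {
  awk 'NR == 1 { columns = NF; next } NF > 0 && NF < columns { print $1 }'
}

# Example (needs Colima installed, so it is left commented out here):
#   colima ls | profiles_without_address
```

For the colima ls output above, this would print default.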
I found this issue where someone describes the following steps as a workaround for this "empty" address:
# 1. Start Colima without network address flag
colima start
# 2. Get into vm
colima ssh
# 3. Disable IPv6
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
# 4. Start Colima as normal
colima start --cpu 8 --memory 12 --arch x86_64 --vm-type=vz --network-address
colima start --cpu 6 --memory 12 --arch aarch64 --vm-type=vz --vz-rosetta --network-address --profile aarch64
~ % colima ls
PROFILE    STATUS     ARCH       CPUS    MEMORY    DISK     RUNTIME    ADDRESS
aarch64    Running    aarch64    6       12GiB     60GiB    docker     192.168.106.2
x86_64     Running    x86_64     8       12GiB     60GiB    docker
docker run --rm -p 8080:80 nginx
ords % wget --no-check-certificate --spider --server-response http://localhost:8080
Spider mode enabled. Check if remote file exists.
--2024-02-12 20:23:02-- http://localhost:8080/
Resolving localhost (localhost)... 127.0.0.1, ::1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx/1.25.3
Date: Tue, 13 Feb 2024 02:23:02 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 24 Oct 2023 13:46:47 GMT
Connection: keep-alive
ETag: "6537cac7-267"
Accept-Ranges: bytes
Length: 615 [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
Same issue here. I can confirm that disabling IPv6 and restarting the Colima VM, as recommended by pablodaniel03 above, helped. Thank you!
Same issue here as well. I need to emulate the x86 architecture using Colima on my M3 Mac, but I cannot reach the forwarded ports from outside:
❯ docker port 376c82ad5d8d
9092/tcp -> 0.0.0.0:9092
9092/tcp -> [::]:9092
❯ nc -zv localhost 9092
nc: connectx to localhost port 9092 (tcp) failed: Connection refused
nc: connectx to localhost port 9092 (tcp) failed: Connection refused
Colima is up fine, docker ps shows everything is fine, the docker-compose logs are just fine. Only issue is: I cannot reach the ports. :(
Also running latest Colima 0.6.8
Please help...
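When a container publishes several ports, probing each one by hand gets tedious. The sketch below extracts the host ports from docker port output so that each can be tested in a loop. published_ports is a hypothetical helper name, and the parsing assumes the "CONTAINER_PORT/PROTO -> HOST_ADDRESS:HOST_PORT" line format shown above.

```shell
#!/bin/sh
# published_ports
# Hypothetical helper: reads `docker port CONTAINER` output on stdin, e.g.
#   9092/tcp -> 0.0.0.0:9092
#   9092/tcp -> [::]:9092
# and prints each distinct host port once (the text after the last colon
# of the final field).
published_ports() {
  awk 'NF > 0 { n = split($NF, parts, ":"); print parts[n] }' | sort -u
}

# Example probe of every published port (needs docker and nc; the container
# ID is the one from the comment above):
#   docker port 376c82ad5d8d | published_ports | while read -r port; do
#     nc -z 127.0.0.1 "$port" && echo "$port reachable" || echo "$port refused"
#   done
```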
I'm seeing something similar, and in my debugging all I have been able to determine is that the ssh process dies somewhere along the way.
All the forwarded ports are then lost, i.e. nothing is listening any longer, and any colima ssh session is terminated with
FATA[0650] exit status 255
The ssh process respawns, but the port configuration is not restored, and after this happens only an ssh-based docker context works. It is the ssh process that listens on the unix docker.sock, so this listener is not restored when the ssh process respawns; it only comes back once colima is restarted. Broken state:
kdescoteaux@kdescoteaux-mac cloud % docker ps
Cannot connect to the Docker daemon at unix:///Users/kdescoteaux/.colima/default/docker.sock. Is the docker daemon running?
kdescoteaux@kdescoteaux-mac cloud % docker --context colima ps
Cannot connect to the Docker daemon at unix:///Users/kdescoteaux/.colima/default/docker.sock. Is the docker daemon running?
kdescoteaux@kdescoteaux-mac cloud % docker --context colima-ssh ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
"Recovery"
kdescoteaux@kdescoteaux-mac cloud % colima restart
INFO[0000] stopping colima
INFO[0000] stopping ... context=docker
INFO[0011] stopping ... context=vm
INFO[0012] done
INFO[0015] starting colima
INFO[0015] runtime: docker
INFO[0031] starting ... context=vm
INFO[0042] provisioning ... context=docker
INFO[0043] starting ... context=docker
INFO[0044] done
kdescoteaux@kdescoteaux-mac cloud % docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
kdescoteaux@kdescoteaux-mac cloud % docker --context colima ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
kdescoteaux@kdescoteaux-mac cloud % docker --context colima-ssh ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
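The broken state shown above has a recognizable signature: the socket-based contexts fail while the ssh-based one still answers. A minimal sketch of a check for that signature follows. first_working_context is a hypothetical helper name, and the context names colima and colima-ssh are taken from the transcript above and may differ on other setups.

```shell
#!/bin/sh
# first_working_context CONTEXT...
# Hypothetical helper: prints the first Docker context for which
# `docker --context CONTEXT ps` succeeds, and returns 1 if none does.
first_working_context() {
  for ctx in "$@"; do
    if docker --context "$ctx" ps >/dev/null 2>&1; then
      printf '%s\n' "$ctx"
      return 0
    fi
  done
  return 1
}

# Example, using the context names from the transcript above:
#   if [ "$(first_working_context colima colima-ssh)" = "colima-ssh" ]; then
#     echo "socket forward is gone; colima restart is likely needed"
#   fi
```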
Description
After switching from an Intel Mac to an ARM Mac, the x86_64 Colima VM sometimes ends up in a broken state where no containers are reachable. Any request made to an exposed port receives "connection refused". It can only be resolved by some combination of recreating the VM and rebooting my Mac. Restarting the containers or the VM has no effect.
Version
colima version 0.6.7
git commit: ba1be00e9aec47f2c1ffdacfb7e428e465f0b58a

runtime: docker
arch: aarch64
client: v24.0.7
server: v24.0.7
limactl version 0.19.1
qemu-img version 8.2.0
Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
Operating System
Output of colima status
Reproduction Steps
Start the two VMs with colima start --cpu 4 --memory 16 --disk 100 and colima start --cpu 2 --memory 8 --disk 75 --profile x86 --arch x86_64, respectively.
Run docker run -p 8080:80 nginx; telnet localhost 8080 should then answer "Connected to localhost", but instead answers "Connection refused".
Expected behaviour
Expect that healthy containers are reachable.
Additional context
I appreciate any help towards how I can debug this and provide further information.