Open h0jeZvgoxFepBQ2C opened 7 months ago
Related: #6956
@h0jeZvgoxFepBQ2C - when did you see the VM error message pop up?
As far as I remember, within the last month. I somehow have the feeling that it has something to do with network interruptions, since it often happens when I have internet problems, when the wifi is switching, or when the registry is slow.
@h0jeZvgoxFepBQ2C thanks for your report. I see some interesting network failures in the VM console logs. I've made an experimental build with a new networking option which might work better:
If you'd like to try it the build is here:
Try turning on the new "Use kernel networking for TCP" feature and let me know if the problem reproduces. If it does, could you upload fresh diagnostics and let me know what kind of network requests your code is making to the Mac or the Internet? Are you using host.docker.internal?
Thanks!
@djs55 it seems that this flag fixes the problem. It hasn't occurred in the last few days (before, it happened daily)... Cool if it's really this!
And yes, I am using host.docker.internal in my docker compose:
services:
  anycable-go:
    image: anycable/anycable-go:1.4.8
    platform: linux/amd64
    ports:
      - '8080:8080'
      - '8090:8090'
      - '50051:50051'
    environment:
      ANYCABLE_DEBUG: 1
      ANYCABLE_DISABLE_TELEMETRY: true
      ANYCABLE_RPC_HOST: http://host.docker.internal:3000/_anycable
      ANYCABLE_HTTP_RPC_SECRET: test123123123123
      ANYCABLE_BROKER: memory
      ANYCABLE_BROADCAST_ADAPTER: redis
      ANYCABLE_REDIS_URL: redis://host.docker.internal:6379/0
      ANYCABLE_HOST: "0.0.0.0"
      ANYCABLE_PORT: 8080
      ANYCABLE_PATH: "/cable"
  nats:
    image: nats:2.10.11
    ports:
      - "4222:4222"
      - "4333:4333"
      - "8222:8222"
    volumes:
      - ./cluster/dev/nats/nats.conf:/nats-server.conf
  redis:
    image: bitnami/redis:latest
    restart: always
    ports:
      - '6379:6379'
    command: redis-server --loglevel warning --appendonly no --protected-mode no --stop-writes-on-bgsave-error no --save ""
    environment:
      - ALLOW_EMPTY_PASSWORD=yes
      - REDIS_AOF_ENABLED=no
    volumes:
      - redis-data:/data
  postgres:
    image: postgres:15
    stop_grace_period: 120s
    environment:
      - POSTGRES_USER=mycompany
      - POSTGRES_PASSWORD=mycompany
      - POSTGRES_HOST=localhost
    ports:
      - '5432:5432'
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./tmp/kubernetes_dumps:/tmp/kubernetes_dumps
  mailcatcher:
    image: sj26/mailcatcher
    ports:
      - "1080:1080"
      - "1025:1025"
volumes:
  postgres-data:
  redis-data:
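For anyone trying to answer the same question about their own setup ("what network requests is your code making to the Mac?"), here is a minimal sketch using only the Python standard library. It lists which environment values point at host.docker.internal and on which port. The ENVIRONMENT dict copies three values from the compose file above; the function name host_endpoints is made up for illustration and is not part of any Docker tooling.

```python
from urllib.parse import urlsplit

# Values copied from the anycable-go service in the compose file above.
ENVIRONMENT = {
    "ANYCABLE_RPC_HOST": "http://host.docker.internal:3000/_anycable",
    "ANYCABLE_REDIS_URL": "redis://host.docker.internal:6379/0",
    "ANYCABLE_HOST": "0.0.0.0",
}


def host_endpoints(environment, host="host.docker.internal"):
    """Return {variable: (scheme, port)} for values whose URL host equals `host`.

    Hypothetical helper: values that are not URLs simply have no hostname
    and are skipped.
    """
    endpoints = {}
    for name, value in environment.items():
        parts = urlsplit(str(value))
        if parts.hostname == host:
            endpoints[name] = (parts.scheme, parts.port)
    return endpoints


if __name__ == "__main__":
    for name, (scheme, port) in host_endpoints(ENVIRONMENT).items():
        print(f"{name}: {scheme} traffic to the host on port {port}")
```

In this compose file that would be the AnyCable RPC calls to port 3000 and the Redis connection on port 6379, both going from the container back to the Mac.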
Thank you @djs55! Your experimental build helped me go through 40+ Docker images with Trivy scan without any hiccups, contrary to the previous experience with the latest release build which would fail after roughly 20 images processed.
The problem still occurs on an M1 with Docker Desktop 4.30. I'm not using docker.internal. How can I get to the diagnostics? Edit: what's weird (at least in my head) is that force quitting Docker Desktop (a regular quit didn't work) and restarting it fixes the issue. In my mental model, Docker Desktop is a mere frontend to the docker service running on my machine (independent of Docker Desktop).
@sderuiter did you try the experimental build? it worked for me
I did and that worked for 24 hours, but failed again. Have now turned on the SOCKS checkbox as found elsewhere.
Also not working for me.
@djs55
I tried the experimental version and it did not work for me. Docker Desktop still locks up every 20-30 minutes and I get the same error as OP, except the version is 1.45 instead of 1.24.
I believe this is the same as #7288.
@sderuiter Did the SOCKS checkbox work for you?
Tried without enabling SOCKS first, failed as it did before.
After enabling it, failed with the following:
failed to resolve source metadata for docker.io/library/
@djs55 @ctalledo It just stops working multiple times a day. It requires full computer restart to recover and often times that doesn't even fix it. It's unusable at this point.
I'm on an M1 Max MBP running macOS Sonoma 14.6.1 and Docker v4.34.2.
Guys-MacBook-Pro:docker guy$ docker compose up
request returned Internal Server Error for API route and version http://%2FUsers%2Fguy%2F.docker%2Frun%2Fdocker.sock/v1.46/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Ddocker%22%3Atrue%7D%7D, check if the server supports the requested API version
Guys-MacBook-Pro:docker guy$ docker ps
request returned Internal Server Error for API route and version http://%2FUsers%2Fguy%2F.docker%2Frun%2Fdocker.sock/v1.47/containers/json, check if the server supports the requested API version
Diagnostics ID: 70FCD948-1440-43E5-BF62-CA6D90AA2F69/20241004000157
Diagnostics ID: 70FCD948-1440-43E5-BF62-CA6D90AA2F69/20241004001109
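The URLs in the error messages above are percent-encoded, which hides what the CLI actually asked for. Below is a minimal sketch, using only the Python standard library, that decodes one of them into the socket path, the API path, and the label filters. The function name decode_docker_cli_url is made up for illustration; the URL is copied from the `docker compose up` error above.

```python
import json
from urllib.parse import parse_qs, unquote, urlsplit

# URL copied from the `docker compose up` error message above.
ERROR_URL = (
    "http://%2FUsers%2Fguy%2F.docker%2Frun%2Fdocker.sock/v1.46/containers/json"
    "?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22"
    "%3Atrue%2C%22com.docker.compose.project%3Ddocker%22%3Atrue%7D%7D"
)


def decode_docker_cli_url(url):
    """Split a Docker CLI error URL into socket path, API path and filters.

    Hypothetical helper: the "host" part of the URL is the percent-encoded
    path of the Unix socket, and `filters` is a JSON document.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    filters = json.loads(query["filters"][0]) if "filters" in query else {}
    return {
        "socket": unquote(parts.netloc),
        "api_path": parts.path,
        "filters": filters,
    }


if __name__ == "__main__":
    print(json.dumps(decode_docker_cli_url(ERROR_URL), indent=2))
```

Decoded, the request is an ordinary "list the containers of compose project docker" call against /Users/guy/.docker/run/docker.sock. That suggests the request itself is unremarkable and the Internal Server Error comes from whatever is answering on that socket.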
Hi @akhanalcs, thanks for reporting and uploading the diagnostics bundles.
For the 70FCD948-1440-43E5-BF62-CA6D90AA2F69/20241004000157 bundle, I do see the following errors inside the Docker Desktop VM:
log/vm/init.log :[2024-10-03T22:42:46.795815667Z][init.fs ][E] unknown event action type 0
We will investigate and get back to you ASAP.
Thanks!
Hi @ctalledo, thank you for your response. I think I see what causes it, but not sure why.
Docker is not in resource saver mode. Whenever I mount a volume that contains a rclone mount, the issue arises.
Allow me to explain with an example. My docker-compose.yaml file has someservice like below:
  someservice:
    image: someone/someservice
    container_name: someservice
    restart: unless-stopped
    ports:
      - 6500:6500
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=${TIMEZONE}
    volumes:
      - ${CFG_ROOT_DIR}/docker/appdata/someservice:/data/db
      - ${DATA_ROOT_DIR}/data:/data
      # The problem arises here. 'somefolder' on host is created by rclone mount as shown in rclone command below
      - ${DATA_ROOT_DIR}/data/somefolder:/data/somefolder
Rclone mount
Guys-MacBook-Pro:rclone guy$ ./rclone serve webdav zurg: --addr 0.0.0.0:8080 --dir-cache-time 20s --vfs-cache-mode full --vfs-cache-max-size 20G --cache-dir="/Users/guy/hms/data/tmp"
Guys-MacBook-Pro:rclone guy$ /sbin/mount_webdav http://127.0.0.1:8080 /Users/guy/hms/data/somefolder
My guess is that Docker Engine is crashing when it has to deal with a remote mount?
When I stop the mount, the docker engine works fine.
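To check whether a setup is in the same situation, one option is to compare the bind-mount sources against the host's non-local mount points. The sketch below is a rough illustration, not a diagnostic tool. It assumes the usual macOS `mount` output format (`<device> on <mount point> (<fstype>, <options>)`, with local filesystems carrying a `local` option). SAMPLE_MOUNT_OUTPUT is a hypothetical example of that format, not output taken from this thread. The first entry in the usage example is a guess at how ${CFG_ROOT_DIR} expands. The function names are made up.

```python
import re
from pathlib import PurePosixPath

# Assumed macOS `mount` line format: "<device> on <mount point> (<options>)".
MOUNT_LINE = re.compile(r"^(?P<device>.+?) on (?P<mount_point>.+) \((?P<options>[^()]*)\)$")

# Hypothetical sample; on a real machine this would be the output of `mount`.
SAMPLE_MOUNT_OUTPUT = """\
/dev/disk3s1s1 on / (apfs, sealed, local, read-only, journaled)
http://127.0.0.1:8080/ on /Users/guy/hms/data/somefolder (webdav, nodev, noexec, nosuid, mounted by guy)
"""


def non_local_mount_points(mount_output):
    """Return mount points whose option list does not contain 'local'."""
    points = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if not match:
            continue
        options = [opt.strip() for opt in match.group("options").split(",")]
        if "local" not in options:
            points.append(match.group("mount_point"))
    return points


def overlaps(path_a, path_b):
    """True if the paths are equal or one contains the other."""
    a, b = PurePosixPath(path_a), PurePosixPath(path_b)
    return a == b or a in b.parents or b in a.parents


def risky_bind_sources(sources, mount_points):
    """Return bind sources that are, contain, or sit inside a non-local mount."""
    return [s for s in sources if any(overlaps(s, mp) for mp in mount_points)]


if __name__ == "__main__":
    sources = [
        "/Users/guy/hms/docker/appdata/someservice",  # assumed expansion of ${CFG_ROOT_DIR}
        "/Users/guy/hms/data",
        "/Users/guy/hms/data/somefolder",
    ]
    remote = non_local_mount_points(SAMPLE_MOUNT_OUTPUT)
    print(risky_bind_sources(sources, remote))
```

Note that the parent directory is flagged as well: a bind mount of /Users/guy/hms/data also exposes the WebDAV-backed somefolder inside it, which matches the "volume that contains a rclone mount" description above.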
Here's the error dialog:
Uploaded diagnostics to: https://docker-pinata-support.s3.amazonaws.com/incoming/3/D330BA39-7D85-4881-B1B7-49F47203A2F9/20241005005442.zip
Diagnostics ID: D330BA39-7D85-4881-B1B7-49F47203A2F9/20241005005442 (uploaded)
Also the logs: docker-logs.zip
Thank you for your help!
Thanks @akhanalcs, that's useful info.
Seems like the rclone-backed mount into the container is causing the Docker Desktop Linux VM to crash, although I'm not sure why. Will investigate further.
I believe I had a similar issue. My Docker engine was hanging at intervals of 20-30 minutes, and the error dialogs were the same. After reading the posts above, I started checking all the external mount points. I have a stack with Sonarr, Radarr, etc., where I initially had simple volume mounts like this:
volumes:
  - ./config-sonarr:/config
  - /Volumes/RAID/DOCKER/JELLYFIN/anime-seriale:/anime-seriale # optional
  - /Volumes/RAID/DOCKER/JELLYFIN/scifi-seriale:/scifi-seriale # optional
  - /Volumes/RAID/DOCKER/JELLYFIN/downloads:/downloads # optional
Now, I have replaced the simple mount points with bind mounts:
  - type: bind
    source: /Volumes/RAID/DOCKER/JELLYFIN/anime-filmy
    target: /data/anime-filmy
  - type: bind
    source: /Volumes/RAID/DOCKER/JELLYFIN/anime-seriale
    target: /data/anime-seriale
I believe this has solved the problem. It’s been 2 hours now without any errors. Hope this will help.
@edek51 Thank you for the response. But both methods are bind mounts. The first method is a shorthand syntax, while the second method is the extended syntax. Both achieve the same result of mapping host directories to container directories.
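To illustrate that equivalence, here is a small sketch that expands a short-syntax volume entry into its long-syntax form. It is a simplification and not Compose's actual parser: the function name expand_short_volume is made up, it treats a source starting with `/`, `.` or `~` as a host path (bind mount) and anything else as a named volume, and it only understands the `ro` access mode.

```python
def expand_short_volume(spec):
    """Expand 'SOURCE:TARGET[:MODE]' into a long-syntax style dict.

    Simplified sketch: Windows drive letters and modes other than 'ro'
    are not handled.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path was given: an anonymous volume.
        return {"type": "volume", "target": parts[0]}

    source, target = parts[0], parts[1]
    mode = parts[2] if len(parts) > 2 else None

    # A path-like source means a bind mount; a bare name means a named volume.
    is_host_path = source.startswith(("/", ".", "~"))
    entry = {
        "type": "bind" if is_host_path else "volume",
        "source": source,
        "target": target,
    }
    if mode == "ro":
        entry["read_only"] = True
    return entry


if __name__ == "__main__":
    print(expand_short_volume(
        "/Volumes/RAID/DOCKER/JELLYFIN/anime-seriale:/anime-seriale"
    ))
```

Under these assumptions, the short entry for anime-seriale expands to a `type: bind` entry with the same source, so switching between the two syntaxes should not change how the directory is shared with the VM.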
Out of desperation, I still gave that a try, but no dice.
At this point, I'm giving up on running my media server on an Intel Mac Mini; I'm looking at installing Ubuntu Server on it.
I plan to pursue the same approach. My setup is hanging at 2-4 hour intervals now. My previous enthusiasm was premature. The issue on Windows and Mac seems to be the necessity of running Linux in a virtual environment. Recently, I've been experimenting with Podman, but it's hanging on my Mac Pro 6.1 as well. Maybe it is a problem with the machine in the end...
Description
Docker crashes very often now and I need to purge all data completely. I get the following error on the command line:
request returned Internal Server Error for API route and version http://%2FUsers%2Fmyuser%2F.docker%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dcompany-admin%22%3Atrue%7D%7D, check if the server supports the requested API version
This happens independently of what I'm working on. I don't change anything, it just breaks now and then :(
Reproduce
Can't reproduce, it happens now and then
Expected behavior
It should not break and I shouldn't be forced to purge everything. This happens really often; I have to purge everything every 2 days at least. Sometimes it only works for 10 minutes.
docker version
docker info
Diagnostics ID
687E3EC9-F982-48C6-B899-D0D50D88E74D/20240408084211
Additional Info
This is really frustrating. I cannot use Docker right now when I have to purge everything every 1-2 days, or sometimes multiple times a day. Thank you for your support! ❤️