Open rg9400 opened 3 years ago
Thanks for the update, and I will file a support request. Apart from anything else, their answer ignores that this is an issue for users who aren't on Windows. Just restarting Docker Desktop all the time isn't an acceptable workaround IMO.
I suspect I am running into this at the moment. If having to restart the VM that Docker is running in, rebooting in essence, is not a blocker, what is? Hardware damage?
This is absolutely a blocker for me, as I cannot run scheduled tasks reliably.
The following workaround resolved the issue for me https://emerle.dev/2022/05/06/the-nasty-gotcha-in-docker/
The following workaround resolved the issue for me https://emerle.dev/2022/05/06/the-nasty-gotcha-in-docker/
Adding an archive in case the post or site goes down.
While this is useful information, I am not sure that it's actually related to this bug. The error described in the post is "Connection reset by peer." However, the problem in this issue is "Connection timed out." The exact error may differ depending on which software you're using, but the key thing is that you send packets that just never arrive. The connection isn't reset, it just stops moving data and effectively becomes /dev/null.
There are reproduction steps here, and I'm happy to be proven wrong. If someone can run the Python reproduction above and confirm that the problem doesn't occur on recent versions of Docker Desktop with the idle time set to 0, then I'll stand corrected. But @rg9400 spoke with Docker themselves, who acknowledged the problem and said they didn't have a fix. If the solution was as easy as changing vpnKitMaxPortIdleTime, surely they would have mentioned that.
If you would like changes in the behavior of vpnKitMaxPortIdleTime, I suggest you open a different issue.
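For anyone trying to tell the two failure modes apart in their own client, here is a minimal Python sketch that sorts a caught exception into "reset" versus "timeout". The function name classify_connection_error is made up for illustration, and real client libraries may wrap these exceptions in their own types, so treat it as a starting point only.

```python
import errno
import socket


def classify_connection_error(exc):
    """Return a coarse label for a network exception.

    Hypothetical helper: it only looks at standard-library exception types.
    """
    if isinstance(exc, ConnectionResetError):
        # "Connection reset by peer": the other side actively tore the connection down.
        return "reset"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        # "Connection timed out": packets were sent but nothing ever came back.
        return "timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    return "other"
```

Python maps an OSError built with errno.ETIMEDOUT or errno.ECONNRESET onto TimeoutError and ConnectionResetError respectively, so errors raised by the socket layer should land in the matching branch.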
I also replied a few months ago with that fix, and my problem was a connection time out for an nginx reverse proxy and PING command, not a connection reset.
I'm thinking this is a port saturation issue, similar to what's described here. I recently restarted my Docker service, but once the problem crops up again, I'll try going through some of these troubleshooting steps.
I'm about 90% sure this issue applies to me as well, but it's devilishly difficult to tell for sure. I'll refer to a tool for reproduction that I wrote in my observations below:
host.docker.internal, perhaps mainly because nearly all of my requests are addressed there, but while troubleshooting I was unable to reproduce when sending requests to either an IP (using Docker's default bridge network) or a service name (using a custom bridge network created for the purpose) -- see the reproduction repo for more notes.
nginx container
rm -f the client+server containers, start a new client container with a slightly different image, and have the issue reproducing within the first 100 requests at one time on my laptop
host.docker.internal.
We are running Docker version 20.10.22, build 3a2c30b on Ubuntu 22.04.2 LTS and are experiencing the same issue.
We are running a node-red flow which queries an MSSQL server every 5 minutes. Randomly, the connection to the SQL server hits a 30000ms timeout, and the next attempt is successful.
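Since the next attempt reportedly succeeds, one possible stopgap (not a fix for the underlying problem) is to retry the query once it times out. The sketch below is in Python rather than node-red. The name call_with_retry and its parameters are hypothetical, and it assumes the wrapped operation raises TimeoutError when it times out.

```python
import time


def call_with_retry(operation, attempts=3, delay_seconds=1.0):
    """Call operation(); if it raises TimeoutError, wait and try again.

    Hypothetical helper: re-raises the last TimeoutError once `attempts`
    tries have been used up. Other exceptions propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

Using a client-side timeout shorter than 30000ms would reduce how long each failed attempt blocks, at the cost of more retries when the server is merely slow.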
We are experiencing the same issue. Almost every 10 minutes, SQL queries from our containers get slower, then the problem resolves until the next 10-minute period.
Docker Desktop version v4.17.0 Windows Server 2022 - WSL2 1.0.3.0 backend
Is there any update on this?
The following workaround resolved the issue for me https://emerle.dev/2022/05/06/the-nasty-gotcha-in-docker/
I had also been experiencing this for several months. Doing this workaround appears to have fixed the issue.
Got this issue with Windows 11 on WSL and Docker version 23.0.3, build 3e7cbfd
We are running this on a server, so this error becomes untenable.
Please note that an experimental build of vpnkit has been released in this parallel issue which attempts to resolve what may be the underlying problem here. Users experiencing this should install the experimental builds if possible and feed back to @djs55 in the vpnkit issue as to whether the problem is resolved, and if you notice any side effects.
Per my testing of the experimental build, the issue is significantly improved but not resolved. There are still timeouts, just a lot less. When running thousands of curls, I still notice stuck handshakes that don't instantly close but take a minute or two to resolve. The difference is that most such instances do clear out before the timeout.
I just wanted to confirm that connections are still getting stuck, even though the overall symptoms are a lot better.
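For others who want to run a similar test without curl, here is a rough Python sketch that opens many TCP connections in a row and counts how many complete quickly, complete slowly, or time out. The function name probe_connects and the thresholds are assumptions for illustration. It only measures the TCP handshake, so it will not catch connections that stall after being established.

```python
import socket
import time
from collections import Counter


def probe_connects(host, port, attempts=100, timeout=5.0, slow_after=1.0):
    """Open `attempts` TCP connections one after another and tally the outcomes.

    Returns a Counter with the keys "ok", "slow", "timeout", and "error"
    (missing keys read as 0).
    """
    outcomes = Counter()
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                elapsed = time.monotonic() - start
                # A handshake that eventually completes but takes a long time
                # is counted separately from a fast one.
                outcomes["slow" if elapsed >= slow_after else "ok"] += 1
        except (socket.timeout, TimeoutError):
            outcomes["timeout"] += 1
        except OSError:
            # Refused, unreachable, DNS failure, and so on.
            outcomes["error"] += 1
    return outcomes
```

From inside a container, a call might look like probe_connects("host.docker.internal", 8080, attempts=1000), where port 8080 is a placeholder for whatever the host application listens on.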
I believe I am facing this same problem on macOS Sonoma 14.1.1, running Docker Desktop for Mac (Apple Silicon) 4.25.2.
I want to try downgrading to 4.5.0 (it's insane that the issue has been going on that long). Does anybody have an install file? The oldest available here is 4.9.1.
EDIT: Docker Desktop for macOS (Apple Silicon) can be downloaded here.
EDIT2: Confirmed, downgrading fixed the issue. I’ve been running with stable connections for weeks now.
Facing the same issue on Debian 12. Checked the ufw logs and whitelisted the container's IP address with sudo ufw allow from 172.17.0.2, which fixed it.
Expected behavior
I would expect services running inside Docker containers in a WSL backend to be able to reliably communicate with applications running on the host, even with frequent polling.
Actual behavior
Due to https://github.com/docker/for-win/issues/8590, I have to run some applications that require high download speeds on the host. I have multiple applications inside Docker containers running inside a Docker bridge network that poll this application every few seconds. When launching WSL, the applications are able to communicate reliably, but this connection deteriorates over time, and after 1-2 days, I notice frequent "connection timed out" responses from the application running on the host. Running wsl --shutdown and restarting the Docker daemon fixes the issue temporarily. Shifting applications out of Docker and onto the host fixes their communication issues as well. It may be related to the overall network issues linked above.
To be clear, it can still connect. It just starts timing out more and more often the longer the network/containers have been up.
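To put numbers on "timing out more and more often", one option is to log each poll as a pair of (seconds since WSL was launched, whether it timed out) and group the results by hour of uptime. A minimal Python sketch follows. The function name and the sample format are assumptions for illustration, not something the affected applications produce on their own.

```python
from collections import defaultdict


def timeout_rate_by_hour(samples):
    """Group poll results by hour of uptime.

    samples: iterable of (elapsed_seconds, timed_out) pairs.
    Returns {hour_index: fraction of polls in that hour that timed out}.
    """
    totals = defaultdict(int)
    timeouts = defaultdict(int)
    for elapsed_seconds, timed_out in samples:
        hour = int(elapsed_seconds // 3600)
        totals[hour] += 1
        if timed_out:
            timeouts[hour] += 1
    return {hour: timeouts[hour] / totals[hour] for hour in sorted(totals)}
```

If the behavior described here holds, the fractions should climb as the hour index grows and drop back after wsl --shutdown.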
Information
I have had this problem ever since starting to use Docker for Windows with the WSL2 backend.
Steps to reproduce the behavior