Open rg9400 opened 3 years ago
This seems to improve if you use the recommended host.docker.internal
option instead of using the IP of the host machine directly
Further update on this. While the above does prolong the deterioration, it still eventually happens. After 4-5 days, timeouts start occurring at increasing frequency, with it eventually reaching a point where timeouts are happening on almost every few calls, requiring a full restart of WSL and Docker to function.
We have the same issue
We have a service running on the host.
If I try to hit host.docker.internal from within a Linux container, I can always get it to trip up eventually after, say, 5000 curl requests to http://host.docker.internal/service (it times out for one request).
If I try http://host.docker.internal/service from the host, it works flawlessly even after 10000 curl requests.
Sometimes, intermittently, and we can't find out why, it starts to fail much more frequently (maybe every 100 curl requests).
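The repeated-request check described above can be scripted. Here is a hedged sketch in Python (the URL, request count, and timeout are placeholders standing in for the curl loop; run it inside a container):

```python
# Sketch of the repro described above: hammer a host service through
# host.docker.internal and count how many requests time out.
# The URL and timeout values are assumptions, not from the original report.
import socket
import urllib.error
import urllib.request

def count_timeouts(url: str, attempts: int, timeout: float = 1.0) -> int:
    """Issue `attempts` GET requests to `url`; return how many failed."""
    failures = 0
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                resp.read()
        except (TimeoutError, socket.timeout, urllib.error.URLError):
            failures += 1
    return failures

# From inside a container, something like:
# print(count_timeouts("http://host.docker.internal/service", 5000))
```

On a healthy network stack this should print 0; per the reports above, a handful of timeouts appear after a few thousand requests, growing over time.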
Something is up with the networking...
Here is a very simple test to show what's going on:
In my limited testing, I created a loopback adapter with the IP 10.0.75.2 and used it instead of host.docker.internal. It's much more reliable. It's an ugly workaround, but it might at least help show where the issue is.
Hey guys, this is still happening pretty consistently. Is anyone looking at the reliability/performance of these things? Is this the wrong place to post this?
I was able to send this via their support and have them reproduce the issue. They diagnosed the cause but said a fix would involve major refactoring, so they had no target fix date. Below is their diagnosis:
I can reproduce the bug now. If I query the vpnkit diagnostics with this program https://github.com/djs55/bug-repros/tree/main/tools/vpnkit-diagnostics while the connection is stuck then I observe: (for my particular repro the port number was 51580. I discovered this using wireshark to explore the trace)
$ tcpdump -r capture\all.pcap port 51580
15:57:03.021934 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195077730 ecr 0,nop,wscale 7], length 0
15:57:04.064094 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195078771 ecr 0,nop,wscale 7], length 0
15:57:06.111633 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195080819 ecr 0,nop,wscale 7], length 0
15:57:10.143908 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195084851 ecr 0,nop,wscale 7], length 0
15:57:18.464142 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195093171 ecr 0,nop,wscale 7], length 0
15:57:34.848536 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195109555 ecr 0,nop,wscale 7], length 0
15:58:07.103411 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195141811 ecr 0,nop,wscale 7], length 0
which is a stuck TCP handshake from the Linux point of view. The same thing is probably visible in a live trace from
docker run -it --privileged --net=host djs55/tcpdump -n -i eth0
Using Sysinternals Process Explorer to examine the vpnkit.exe process, I only see 1 TCP connection at a time (although a larger than ideal number of UDP connections, which I think are DNS-related). There's no sign of a resource leak.
When this manifests I can still establish other TCP connections and run the test again -- the impact seems limited to the 1 handshake failure.
The vpnkit diagnostics has a single TCP flow registered:
> cat .\flows
TCP 192.168.65.3:51580 > 192.168.65.2:6789
socket = open
last_active_time = 1605023899.0
which means that vpnkit itself thinks the flow is connected, although the handshake never completed.
Woah, thanks for this update @rg9400. Glad you got it on their radar. So your workaround is to restart Docker and run wsl --shutdown? I've been trying to use another IP (a loopback adapter) as opposed to host.docker.internal or whatever host.docker.internal points to. But I'm not 100% sure that solves the problem permanently. Maybe it's just a new IP, so it will work for a little while and then deteriorate again over time. Based on your explanation of the root cause, that might indeed be the case.
Yeah, for now I am just living with it and restarting WSL/Docker every now and then when the connection timeouts become too frequent and unbearable.
What can we do to get this worked on. Is there work happening on it? or a ticket we can follow? This still bugs us quite consistently.
I want to keep this thread alive, as this is a massive pain for folks, especially because they don't know it's happening. This needs to become more reliable.
Here is a newer diagnostic id: F4D29FA0-6778-40B8-B312-BADEA278BB3B/20210521171355
Also discovered that just killing vpnkit.exe in task manager reduces the problem. It restarts almost instantly and connections resume much better without having to restart containers or anything. But problem eventually reoccurs.
We have about 15 services in our docker-compose file, and all of them do an npm install. A cacheless build is impossible because it tries to build all the services at once, and the npm install steps time out because downloading that many packages kills the bandwidth.
I'm not using the --parallel flag.
I've set the following environment variables:
But none of this seems to change the behavior.
This happens on macOS too, in fact quite reliably after ~7 minutes and ~13,000 requests against an HTTP server:
Server:
$ python3 -mhttp.server 8015
Client (siege):
$ cat <<EOF > siegerc
timeout = 1
failures = 1
EOF
$ docker run --rm -v $(pwd)/siegerc:/tmp/siegerc -t funkygibbon/siege --rc=/tmp/siegerc -t2000s -c2 -d0.1 http://host.docker.internal:8015/api/foo
Output:
New configuration template added to /root/.siege
Run siege -C to view the current settings in that file
** SIEGE 4.0.4
** Preparing 2 concurrent users for battle.
The server is now under siege...
[alert] socket: select and discovered it's not ready sock.c:351: Connection timed out
[alert] socket: read check timed out(1) sock.c:240: Connection timed out
siege aborted due to excessive socket failure; you
can change the failure threshold in $HOME/.siegerc
Transactions: 13949 hits
Availability: 99.99 %
Elapsed time: 378.89 secs
Data transferred: 6.24 MB
Response time: 0.00 secs
Transaction rate: 36.82 trans/sec
Throughput: 0.02 MB/sec
Concurrency: 0.10
Successful transactions: 0
Failed transactions: 1
Longest transaction: 0.05
Shortest transaction: 0.00
What's interesting is that it gets progressively worse from there: the timeouts happen more and more frequently. Restarting the HTTP server doesn't help, but restarting it on another port does (e.g. from 8019 -> 8020). From there you get another 7 minutes of 100% success before it starts degrading again.
I tried adding an IP alias to my loopback adapter and hitting that instead of host.docker.internal, but it had the same behavior (i.e. it degraded after 7 minutes). The same goes for using the IP (192.168.65.2) directly and skipping DNS resolution.
This issue remains unresolved. The devs indicated it required major rework, but I haven't heard back from them in 6 months on the progress.
I am also affected by this issue. I thought at one point it was because of TCP keepalive on sockets, with sockets not being closed as fast as they were opened, thus exhausting the maximum number of available sockets. But the problem doesn't go away even if my containers stop opening connections for a while; only a restart of Docker and WSL seems to fix it. This issue should be high priority...
I cannot connect from a container to a host port even using telnet. Network mode is bridge, which is default, but "host" mode also doesn't work.
I tried to guess host IP, but also I tried this: extra_hosts:
Telnet connection from host machine to this host port does work well.
In previous Docker versions it was working fine! Seems it's broken since some update maybe from 2021-2022.
Update: it was my Ubuntu UFW firewall that was blocking containers from connecting to host ports.
Having this exact problem on MacOS. Restarting Docker fixes the problem (for a while).
We have reports of this occurring across teams on Windows and macOS as well. We have no reports of this issue occurring on Linux.
Someone noticed that on macOS, simply waiting ~15mins often alleviates the problem.
We're also experiencing this (using host.docker.internal) on Docker Desktop for Windows. Strangely enough, Docker versions up to 4.5.1 seem to work fine, but versions 4.6.x and 4.7.x instantly bring up the problem. Connections work for some time, but then the timeouts start. All checks of "C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe" check pass.
I'm experiencing the same problem, with an increasing number of timeouts over time while using host.docker.internal.
I'm also experiencing the same problem. Downgrading to 4.5.1 seems to solve the issue.
Any update on this issue? I'm experiencing the same. Restarting the container does not fix it. Only restarting the daemon/host resolves it.
We seem to have resolved the issue on Windows (but not Mac)
We previously had the following configuration in our compose file to allow containers to reach the host using "host.docker.internal" on Windows, Mac and Linux hosts:
extra_hosts:
- "host.docker.internal:host-gateway"
Removing this configuration resolved the time out issue on Windows (but can obviously cause other problems). Mac users still have time out issues, though.
We are encountering this on macOS 12.0. We determined that our developers using Docker Desktop 4.3.0 have not encountered the issue, so we are currently testing a downgrade to 4.3.0. This seems to have resolved the problem so far. We have not yet tested going all the way back up to 4.5.1 as noted earlier in this thread. We also have not yet observed this issue in Docker on our x86 Ubuntu environments.
> I was able to send this via their support and have them reproduce the issue. They diagnosed the cause but said a fix would involve major refactoring, so they had no target fix date. Below is their diagnosis:
>
> I can reproduce the bug now. If I query the vpnkit diagnostics with this program https://github.com/djs55/bug-repros/tree/main/tools/vpnkit-diagnostics while the connection is stuck then I observe: (for my particular repro the port number was 51580. I discovered this using wireshark to explore the trace)
>
> $ tcpdump -r capture\all.pcap port 51580
> 15:57:03.021934 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195077730 ecr 0,nop,wscale 7], length 0
> 15:57:04.064094 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195078771 ecr 0,nop,wscale 7], length 0
> 15:57:06.111633 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195080819 ecr 0,nop,wscale 7], length 0
> 15:57:10.143908 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195084851 ecr 0,nop,wscale 7], length 0
> 15:57:18.464142 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195093171 ecr 0,nop,wscale 7], length 0
> 15:57:34.848536 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195109555 ecr 0,nop,wscale 7], length 0
> 15:58:07.103411 IP 192.168.65.3.51580 > 192.168.65.2.6789: Flags [S], seq 609899732, win 64240, options [mss 1460,sackOK,TS val 2195141811 ecr 0,nop,wscale 7], length 0
>
> which is a stuck TCP handshake from the Linux point of view. The same thing is probably visible in a live trace from docker run -it --privileged --net=host djs55/tcpdump -n -i eth0.
>
> Using Sysinternals Process Explorer to examine the vpnkit.exe process, I only see 1 TCP connection at a time (although a larger than ideal number of UDP connections, which I think are DNS-related). There's no sign of a resource leak. When this manifests I can still establish other TCP connections and run the test again -- the impact seems limited to the 1 handshake failure. The vpnkit diagnostics has a single TCP flow registered:
>
> > cat .\flows
> TCP 192.168.65.3:51580 > 192.168.65.2:6789
> socket = open
> last_active_time = 1605023899.0
>
> which means that vpnkit itself thinks the flow is connected, although the handshake never completed.
@rg9400 this was SUPER helpful... I started running into the same issue. I use dockerized Jupyter on Docker for Windows for a significant amount of my day-to-day work and have been getting CONSTANT timeout errors when I run notebooks from the beginning. I was also restarting Docker a ton, but after finding this comment, I found a way to pretty consistently "unstick" things (though it's definitely still annoying):
C:\Users\blah\blah> tasklist | findstr vpnkit.exe
C:\Users\blah\blah> taskkill /F /pid <pid of vpnkit>
And then I give it just a sec and when the cell tries again to reestablish connections, it's good.
I don't really think this is a viable solution for folks running processes that are constantly establishing connections, but it works for me currently for Jupyter (once I get data, I don't really need more connections for the notebook though).
Any update from folks working on this issue? I found something that sounds similar from 2019, but it doesn't look like anyone is making an effort to resolve it on the vpnkit side, from what I can find searching the issues.
> Having this exact problem on MacOS. Restarting Docker fixes the problem (for a while).
I'm having the same issue, running docker 4.9.1 on mac and I'm facing the issue very often. After restarting Docker it works again but is not a long term solution as you mentioned ...
I'm also affected by this issue. It seems to hang after just a few minutes and causes network timeouts. Restarting Docker fixes it for a few more minutes. It's practically unusable...
I tried to use the internal IP of the Docker host (instead of "host.docker.internal"), but the problem still occurs: within a few minutes, the network connection timeouts start again. Just stopping and starting the container doesn't fix the issue; only recreating the container does. I'm working with Windows Docker Desktop, v4.9.1, updated today!
Just run the command taskkill /im vpnkit.exe /f and the connection with the Docker host is fixed for a few more minutes.
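The taskkill workaround above could in principle be automated. Below is a hedged sketch (the probe URL, port, and failure threshold are all assumptions; Windows-only, since it relies on taskkill, and Docker Desktop is expected to relaunch vpnkit on its own):

```python
# Hypothetical watchdog for the manual workaround above: probe a host
# service, and kill vpnkit.exe once requests start timing out repeatedly.
import subprocess
import urllib.error
import urllib.request

PROBE_URL = "http://host.docker.internal:8015/"  # placeholder host service

def probe_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` succeeds within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        return True
    except (urllib.error.URLError, TimeoutError):
        return False

def restart_vpnkit() -> None:
    """Equivalent of running `taskkill /im vpnkit.exe /f` by hand."""
    subprocess.run(["taskkill", "/im", "vpnkit.exe", "/f"], check=False)

# A possible polling loop (not executed here):
#   failures = 0
#   while True:
#       failures = 0 if probe_ok(PROBE_URL) else failures + 1
#       if failures >= 3:   # threshold is a guess
#           restart_vpnkit()
#           failures = 0
#       time.sleep(10)
```

This only papers over the symptom, as noted above the problem reoccurs after vpnkit restarts.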
I can confirm that downgrading Docker Desktop for Mac to version 4.5.0 "fixes" the problem and I no longer have connection issues. I tested for several days without any problems. Then I upgraded back to the latest version and started getting timeouts and connection failures almost immediately again.
+1
I’ve had this issue since 4.5.0 as well. I’m pinned to that release until it’s fixed.
As reported in some quotes here: since I had Docker 4.9.1 installed, I downgraded Docker for Windows to 4.5.1. The connection timeout doesn't occur anymore.
While the issue title only refers to connections to the host, it also happens with any other address (internal or external).
In the 4.10 release notes, I didn't see any mention of this issue. Did anybody check whether the 4.10 version still has this issue?
I'm currently working with Laravel, so in order to migrate the database I have to use 127.0.0.1, and for the connection to work I have to use host.docker.internal:
DB_HOST=host.docker.internal
Windows 10, Docker Desktop 4.10.1 (82475)
> We have reports of this occurring across teams on Windows and macOS as well. We have no reports of this issue occurring on Linux.
> Someone noticed that on macOS, simply waiting ~15mins often alleviates the problem.
As of Docker version 20.10.17, build 100c701 on Ubuntu 22.04, it happens on Linux.
I'm using Docker to run Kiwix scrapers. They work for a while, then start hitting timeouts. If the scraped host has multiple IP addresses, it can round-robin and keep running. If the host only has one, the task is likely to fail from too many timeouts.
Everyone on my team with Mac (we only have mix of Mac and Linux) are also experiencing this problem. It manifests itself after 1-3 days without restarting, but during that timeframe we probably have on the order of 10,000 outbound HTTP calls, similar to what was mentioned in https://github.com/docker/for-win/issues/8861#issuecomment-903421446.
This seems like a very serious issue if it happens so consistently.
Our solution was to stop using Docker for certain containers with high traffic volume on (non-Linux) developer machines.
I have also seen this issue in versions >4.5.1 including the latest version, but have found that it can also be triggered with low amounts of traffic. The following is how I have been able to reproduce the issue.
docker run --name session-test -it -v /mnt/c/Users/jde/sessions/test:/test python:buster bash
root@17c8b33e70e6:/# pip install --quiet requests
root@17c8b33e70e6:/# cd test/
root@17c8b33e70e6:/test# python sessions.py
10:52:18: request 1
Request complete, sleep 30
10:52:50: request 2
Request complete, sleep 120
10:54:51: request 3
Request complete, sleep 420
11:01:51: request 4
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/usr/local/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
sessions.py

import requests
from datetime import datetime
import time

s = requests.Session()
steps = [30, 120, 420, 420]
step = 1
for i in steps:
    print(datetime.now().strftime("%H:%M:%S") + ": request " + str(step))
    r3 = s.get('https://wttr.in')
    print("Request complete, sleep " + str(i))
    step += 1
    time.sleep(i)
As has been mentioned before, if I look at a trace from the container's point of view, I only see TCP SYNs being sent out during the 4th attempt, after waiting 420s since the last request. Also, if I kill vpnkit while it is still trying the 4th attempt, then when vpnkit starts back up the 4th request is able to complete successfully.
Some things I have noticed that I do not think were previously mentioned: if I look at a trace from the host, I see the TCP SYNs going out and TCP SYN-ACKs coming back from the server, but these are not passed on to the container. If I start up another container while the first is trying unsuccessfully to make the 4th attempt, it also is unable to reach the same destination, but it can reach other destinations.
docker run -it python:buster bash
root@14437db6e250:/# curl https://wttr.in
curl: (7) Failed to connect to wttr.in port 443: Connection timed out
root@14437db6e250:/# curl https://google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.google.com/">here</A>.
</BODY></HTML>
root@14437db6e250:/# curl https://wttr.in
curl: (7) Failed to connect to wttr.in port 443: Connection timed out
The cause of the issue seems to have something to do with using sessions and having a client-side keepalive interval >= 60s. If I change to a 30s client keepalive interval, I do not run into the issue.
docker run --name session-test -it -v /mnt/c/Users/jde/sessions/test:/test python:buster bash
root@425db0e6590a:/# pip install --quiet requests
root@425db0e6590a:/# cd test/
root@425db0e6590a:/test# python sessions-ka30.py
11:41:35: request 1
Request complete, sleep 30
11:42:06: request 2
Request complete, sleep 120
11:44:06: request 3
Request complete, sleep 420
11:51:06: request 4
Request complete, sleep 420
root@425db0e6590a:/test#
sessions-ka30.py

import requests
from datetime import datetime
import time
import socket
from requests.adapters import HTTPAdapter

class HTTPAdapterWithSocketOptions(HTTPAdapter):
    def __init__(self, *args, **kwargs):
        self.socket_options = kwargs.pop("socket_options", None)
        super(HTTPAdapterWithSocketOptions, self).__init__(*args, **kwargs)

    def init_poolmanager(self, *args, **kwargs):
        if self.socket_options is not None:
            kwargs["socket_options"] = self.socket_options
        super(HTTPAdapterWithSocketOptions, self).init_poolmanager(*args, **kwargs)

KEEPALIVE_INTERVAL = 30
adapter = HTTPAdapterWithSocketOptions(socket_options=[
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
    (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, KEEPALIVE_INTERVAL),
    (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, KEEPALIVE_INTERVAL),
])

s = requests.Session()
s.mount("http://", adapter)
s.mount("https://", adapter)

steps = [30, 120, 420, 420]
step = 1
for i in steps:
    print(datetime.now().strftime("%H:%M:%S") + ": request " + str(step))
    r3 = s.get('https://wttr.in')
    print("Request complete, sleep " + str(i))
    step += 1
    time.sleep(i)
I hope this information helps resolve the issue, or provides a workaround for others experiencing it.
I have also added this information to https://github.com/moby/vpnkit/issues/587
v4.5.1 works fine. Then I removed Docker Desktop for Windows, cleaned up all the junk files left behind, and reinstalled version 4.12.0. The connection problem started again.
Randomly, the connection to the Docker host stops.
I am using Docker Desktop 4.11.1 (84025) on a Mac. I am running 4 Python processes in separate containers, and when the processes have run for a day or two I start to get timeouts. Our connection timeout is set to 300 seconds, so the delay (if there is any network traffic at all) is significant. Connections to applications on localhost seem to work OK, as we have some internal traffic which is not affected.
A restart of the Docker Desktop application solves the problem and the connection is back up again. There are no network issues on the host machine, as I can reach the URL we try to reach via the containers the whole time.
We have added this in our docker-compose file extra_hosts:
My team is using Docker Desktop on both Macs and Windows in multiple countries, and I believe we have been seeing this for many months. It manifests as connections to two AWS hosts hanging after a while: ssm.us-east-2.amazonaws.com and cognito-idp.us-east-2.amazonaws.com. Other hosts work fine, and the two hosts are reachable from outside Docker. After 5–10 minutes, connections stop hanging. But the longer the Docker process has been running, the more frequently the connection failures occur.
Restarting the container doesn't fix the problem. Restarting the Docker process stops it from happening for a while.
I was on a series of recent versions of Docker Desktop for Mac, most recently 4.13.0 I think, and I was seeing the problem reliably. After finding this issue, I've gone back to 4.5.0. Things have been fine for a day, so I'm hoping that's a workaround while we wait for a fix.
I am concerned that this issue doesn't have the priority it deserves. I'm not doing anything particularly intensive with my container, just running a web app. We don't even hit the AWS servers that often. I think many people are probably seeing this and blaming it on ISP/network issues or hosts being down, when in fact the problem is Docker itself. It took me months of on-and-off debugging (because the problem is so intermittent) to finally look for a GitHub issue here; I thought it was something to do with the way I was calling the AWS hosts. I get that the solution is hard, but I'd like to see Docker at least announce that they have a plan to address this after 2 years of steadily increasing impact.
I think the only way to increase visibility is to submit official bug reports to the team linking to this issue. My official ticket with them (where the issue was discovered) keeps asking me to confirm it is still an issue to prevent auto-closure, and I have not been able to get an update from the team beyond one from a few months ago where they asked if this had resolved itself (to which I replied it had not).
This issue was fixed for me on macOS after editing ~/Library/Group\ Containers/group.com.docker/settings.json and setting vpnKitMaxPortIdleTime from 300 to 0 (a Docker Desktop restart is required afterwards). I changed this over a week ago and have not encountered the issue since.
I don't know how this can be changed on Windows, though.
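For reference, that settings.json edit can be scripted. A minimal sketch (the file path is the macOS location mentioned above, and the key name is taken from that comment; both may differ between Docker Desktop versions):

```python
# Sketch: rewrite Docker Desktop settings JSON with vpnKitMaxPortIdleTime
# set to 0, per the workaround described above.
import json

def set_max_port_idle_time(settings_text: str, value: int = 0) -> str:
    """Return the settings JSON text with vpnKitMaxPortIdleTime replaced."""
    settings = json.loads(settings_text)
    settings["vpnKitMaxPortIdleTime"] = value
    return json.dumps(settings, indent=2)

# Applying it to the file (restart Docker Desktop afterwards):
# import os
# path = os.path.expanduser(
#     "~/Library/Group Containers/group.com.docker/settings.json")
# with open(path) as f:
#     text = f.read()
# with open(path, "w") as f:
#     f.write(set_max_port_idle_time(text, 0))
```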
The reason I mentioned contacting Docker was because this was their latest communication to me, which you can see is pretty frustrating:

> I did some investigation on this one.
> We have parsed through the thread and users seem to be able to address the issue by restarting WSL, so it’s not a blocker. A ticket has been created from the begining (2 years ago) but as it is not a blocker, we don’t see us getting to this over the course of the next few months. Once resolved we will update the issue on github.
> Sorry for the incovenience.
> Doscker Support
Thanks for the update, and I will file a support request. Apart from anything else, their answer ignores that this is also an issue for users who aren't on Windows. Just restarting Docker Desktop all the time isn't an acceptable workaround IMO.
> Sorry for the incovenience. Doscker Support
Lol they can't even spell their own product sigh...
Just a reminder that this is actually an issue with vpnkit, and it also affects Mac. That might partially be the reason why no one on the Docker for Windows team feels responsible for fixing this.
We need to actively raise this on the vpnkit project and not here!
e.g. you can voice your support here: https://github.com/moby/vpnkit/issues/587
Expected behavior
I would expect services running inside Docker containers in a WSL backend to be able to reliably communicate with applications running on the host, even with frequent polling
Actual behavior
Due to https://github.com/docker/for-win/issues/8590, I have to run some applications that require high download speeds on the host. I have multiple applications inside Docker containers, running in a Docker bridge network, that poll this application every few seconds. When launching WSL, the applications are able to communicate reliably, but this connection deteriorates over time, and after 1-2 days I notice frequent "connection timed out" responses from the application running on the host. Running wsl --shutdown and restarting the Docker daemon fixes the issue temporarily. Shifting applications out of Docker and onto the host fixes their communication issues as well. It may be related to the overall network issues linked above. To be clear, it can still connect; it just starts timing out more and more often the longer the network/containers have been up.
Information
I have had this problem ever since starting to use Docker for Windows with the WSL2 backend.
Steps to reproduce the behavior