docker / for-win

Bug reports for Docker Desktop for Windows
https://www.docker.com/products/docker#/windows

WSL2 backend has a potential memory leak #13022

Open ghost opened 2 years ago

ghost commented 2 years ago

NOTE: I am sure this is not a bug in WSL2's memory handling; more on this later in the issue.

I am running Windows 10 build 19044 with Docker Desktop v4.12.0. Memory consumption keeps growing until it hits the WSL memory usage cap (80% of total memory in this version). Most people on the internet with this problem report that it happens over the span of days, but I've had it happen multiple times a day:

Even when I don't have any containers running, usage eventually grows to consume all available memory.

At first, I thought the problem was WSL2's memory management itself, and that it somehow wasn't returning freed memory to Windows. That suspicion was disproved once a friend recommended checking directly in the docker-desktop distribution inside WSL2: running the free -g command there showed that the distribution itself was using 12 GB of memory, even though no running process was using more than 1 GB of resident memory or 2 GB of virtual memory.
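For anyone who wants to repeat that check in more detail than free -g gives, here is a minimal Python sketch that splits the numbers in /proc/meminfo into free, buffer/cache, and "used". The function names are made up for illustration, and the formula (used = total - free - buffers/cache) is only one common convention; the exact calculation free performs differs between versions. It does not claim to identify the cause of the growth, it only shows where the kernel is accounting the memory.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kiB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_breakdown(fields):
    """Split total memory into free, buffer/cache, and everything else."""
    total = fields["MemTotal"]
    free = fields["MemFree"]
    # Buffers, page cache, and reclaimable slab can be given back under pressure.
    cache = (
        fields.get("Buffers", 0)
        + fields.get("Cached", 0)
        + fields.get("SReclaimable", 0)
    )
    return {
        "total_kib": total,
        "free_kib": free,
        "buff_cache_kib": cache,
        "used_kib": total - free - cache,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file; only meaningful inside a Linux/WSL shell."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Inside the docker-desktop distribution (or any other WSL distro with Python available), memory_breakdown(read_meminfo()) returns the split. If most of the total sits in buff_cache_kib rather than used_kib, the memory is held by the kernel's caches rather than by any single process, which would be consistent with no process showing a large resident size.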

I am afraid that giving WSL a lower memory cap (I have it set to 4 GB right now, but I have not tested it yet) will degrade performance for Docker and other WSL instances.

If needed, my diagnostic ID is 6F945EC0-7A97-4403-B4D2-ED5297F21DD1/20221020001140

joe0BAB commented 2 years ago

Hi @94tx! Thanks for reporting the issue. Where do you observe the massive memory usage? What does the status in the docker dashboard footer say?

david-engelmann commented 2 years ago

@94tx I've noticed similar patterns when using VS Code with Docker Desktop. Just loading VS Code spins up a ton of processes, eventually maxing out the resources.

ghost commented 2 years ago

Where do you observe the massive memory usage? What does the status in the docker dashboard footer say?

I can see it on the Docker dashboard, inside the docker-desktop WSL distribution, and in Task Manager, where it is attributed to Vmmem.

The Docker status shows everything as normal (because, apart from the memory issue, it is working normally and I can start containers, etc.).

screenshots (I apologize for my poor handwriting)

joe0BAB commented 2 years ago

Thanks a lot @94tx! That already helps me better understand your situation. There could be multiple causes: either something in the VM or some container going wild. We are about to release an extension that will help you check CPU/RAM usage per container. It will probably be released at the end of next week. In the meantime, you could try running https://github.com/google/cadvisor, which also shows CPU/RAM load per container.
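As a lighter-weight alternative to cAdvisor or the extension (not what was suggested above, just a different route to the same per-container numbers), the docker stats command can be run once and its output parsed. The sketch below is Python; the helper names are invented for illustration, and it assumes the docker CLI is on PATH and that memory usage is printed in the usual "12.5MiB / 7.7GiB" form.

```python
import subprocess

# Binary unit suffixes as printed in the MemUsage column.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def to_bytes(size):
    """Convert a size string such as '12.5MiB' to a number of bytes."""
    size = size.strip()
    # Try the longest suffixes first so 'MiB' is not mistaken for 'B'.
    for unit in sorted(UNITS, key=len, reverse=True):
        if size.endswith(unit):
            return float(size[: -len(unit)]) * UNITS[unit]
    raise ValueError(f"unrecognised size: {size!r}")


def parse_stats(output):
    """Map container name -> memory in bytes from 'name<TAB>used / limit' lines."""
    usage = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, mem = line.split("\t", 1)
        usage[name] = to_bytes(mem.split("/")[0])
    return usage


def container_memory():
    """Run docker stats once (assumes the docker CLI is installed)."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}}\t{{.MemUsage}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_stats(result.stdout)
```

Summing the values returned by container_memory() and comparing the total with what Task Manager attributes to Vmmem gives a quick indication of whether containers account for the usage at all.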

ghost commented 2 years ago

Thanks for the recommendation; how would I go about running cAdvisor? I haven't been able to get it to show me any containers other than (what I suppose to be) internal Docker containers.

valiko-ua commented 2 years ago

I confirm what @david-engelmann said. Launching VS Code with the WSL extension and opening a CMake-based folder on a WSL distro (Ubuntu 22.04) quickly consumes all available memory. I had to limit it using .wslconfig.

Note that according to Task Manager, VS Code itself does not use 13 GB of RAM, but the VM does.
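For readers who have not set this limit before: .wslconfig is an INI-style file in the Windows user profile folder, and the cap goes under the [wsl2] section as a memory key. Below is a small Python sketch that renders such a file. The function name and the 4GB value are placeholders for illustration, not a recommendation; pick a value that suits the machine.

```python
import configparser
import io


def render_wslconfig(memory="4GB"):
    """Return the text of a minimal .wslconfig that caps WSL2 memory."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep key names exactly as written
    config["wsl2"] = {"memory": memory}
    buffer = io.StringIO()
    # Write key=value without spaces around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the text instead of writing it, so an existing file is never overwritten.
    print(render_wslconfig("4GB"))
```

The printed text can be merged by hand into %UserProfile%\.wslconfig. WSL only reads the file at startup, so the change takes effect after running wsl --shutdown and restarting Docker Desktop.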

Rainson12 commented 2 years ago

Same thing here; I always thought this was normal. The same thing happens when running a big Kubernetes cluster using Docker Desktop and then deleting everything with kubectl delete sts,deployments,pvc --all: WSL2/docker-desktop still consumes all the memory and doesn't free anything.

sandorkazi-epam commented 2 years ago

We are seeing similar things: out-of-memory symptoms with version 4.13.0.

david-engelmann commented 2 years ago

Any progress here?

BastLast commented 1 year ago

Hi! I'm having a similar issue here, and it seems I am not the only one (https://github.com/microsoft/WSL/issues/8725).

joe0BAB commented 1 year ago

Would you mind providing more details about when the issue occurs so we can narrow it down a bit?

I suggest the following debugging steps:

If this doesn't show any high memory usage container, let's try a second debugging approach:

cc @94tx @david-engelmann

BastLast commented 1 year ago

The memory leak does not seem to come from a specific container. I see normal memory usage in the extension:
image

I have 20+ containers. I'm not sure it comes from any of them, since other members of the team with older versions of Docker / WSL don't seem to have any issue.

I will probably have to retest later, because it takes a few hours for the leak to fill up the RAM :/

joe0BAB commented 1 year ago

Thanks for providing further details! Right, it seems unrelated to your containers. Were the system containers enabled to be shown in that list? Do you have Kubernetes enabled? (If so, it could be caused by a Kubernetes system container.) Also, a screenshot from nsenter/top, as explained in the second debugging approach, might help narrow the problem down further.

BastLast commented 1 year ago

I do not have Kubernetes enabled, and the screenshot was taken just after enabling system containers. I'll see what I can do for nsenter, but I'm unsure which container you want me to look into.

If it is not related to my containers, why would I search for memory leaks inside them?

cimchd commented 1 year ago

Same problem here. It seems that the memory usage does not come from the containers. I had no running containers and the following resource usage:

image

The VmmemWSL consumed all available RAM (limited by .wslconfig).

Does anybody have a solution or a better workaround than setting a RAM limit?

david-engelmann commented 1 year ago

@cimchd Are you running any commands after building the container?

d-fischer commented 1 year ago

I noticed that the WSL distro keeps racking up init processes. This is within my docker-desktop WSL shell, just about 5 minutes after restarting the whole thing to see if that fixes the issue. Maybe this is the cause of said memory leak.

# ps uxa
PID   USER     TIME  COMMAND
    1 root      0:00 /init
   16 root      0:00 /init
   17 root      0:00 /init
   18 root      0:00 wsl-bootstrap run --base-image /mnt/host/c/Program Files/Docker/Docker/resources/wsl/docker-for-wsl.iso --cli-iso /mnt/host/c/Program File
   26 root      0:01 /init
   27 root      0:05 /usr/bin/vpnkit-bridge --disable ssh-auth,osxfs-data,transfused,filesystem-event,filesystem-test,http-proxy-control --pid-file=/run/vpnkit
   42 root      0:00 [init]
   44 root      0:00 [init]
   48 root      0:00 unshare -muinpf --propagation=unchanged --kill-child=SIGTERM /usr/local/bin/wsl-bootstrap jump
   50 root      0:00 /sbin/init
   74 root      0:00 {rungetty.sh} /bin/sh /usr/bin/rungetty.sh
   76 root      0:00 /bin/login -f root
   84 root      0:00 /usr/bin/memlogd -fd-log 3 -fd-query 4 -max-lines 5000 -max-line-len 1024
   87 root      0:00 /usr/bin/logwrite -n procd /usr/bin/procd
   91 root      0:00 -sh
   97 root      0:00 /usr/bin/procd
  305 root      0:00 [init]
  347 root      0:00 /usr/bin/containerd
  373 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id acpid -address /run/containerd/containerd.sock
  419 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id allowlist -address /run/containerd/containerd.sock
  439 root      0:00 /allowlist
  466 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id artifactory -address /run/containerd/containerd.sock
  485 root      0:00 /artifactory-agent --docker-desktop-mode
  509 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id binfmt -address /run/containerd/containerd.sock
  557 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id container-filesystem -address /run/containerd/containerd.sock
  579 root      0:04 /usr/bin/container-filesystem
  616 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id devenv-service -address /run/containerd/containerd.sock
  649 root      0:00 /devenv-server -socket /run/guest-services/devenv-volumes.sock
  676 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id diagnosticsd -address /run/containerd/containerd.sock
  695 root      0:00 /usr/local/bin/diagnosticsd
  724 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id dns-forwarder -address /run/containerd/containerd.sock
  742 root      0:00 /usr/bin/dns-forwarder -dns.port 53 -conf /etc/coredns/Corefile
  774 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id docker -address /run/containerd/containerd.sock
  794 root      0:00 /usr/bin/docker-init /usr/bin/entrypoint.sh
  808 root      0:00 {entrypoint.sh} /bin/sh /usr/bin/entrypoint.sh
  812 root      0:00 /usr/bin/logwrite -n lifecycle-server /usr/bin/lifecycle-server
  817 root      0:00 /usr/bin/lifecycle-server
  845 102       0:00 /sbin/rpcbind -w
  862 root      0:00 /sbin/rpc.statd
  874 root      0:00 /usr/sbin/rpc.idmapd
  882 root      0:00 /usr/bin/logwrite -n containerd /usr/local/bin/containerd --config /etc/containerd/containerd.toml
  887 root      0:00 /usr/local/bin/containerd --config /etc/containerd/containerd.toml
  895 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id http-proxy -address /run/containerd/containerd.sock
  926 root      0:00 /http-proxy
  962 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id kmsg -address /run/containerd/containerd.sock
  981 root      0:00 /usr/bin/kmsg
  987 root      0:00 /usr/bin/logwrite -n dockerd /usr/local/bin/dockerd --containerd /var/run/desktop-containerd/containerd.sock --pidfile /run/desktop/docker
  992 root      0:53 /usr/local/bin/dockerd --containerd /var/run/desktop-containerd/containerd.sock --pidfile /run/desktop/docker.pid --swarm-default-advertis
 1029 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id nat -address /run/containerd/containerd.sock
 1091 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id rngd -address /run/containerd/containerd.sock
 1121 root      0:00 /sbin/rngd
 1153 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id sntpc -address /run/containerd/containerd.sock
 1163 root      0:00 [sh]
 1165 root      0:00 [init]
 1175 root      0:00 /start 30
 1203 root      0:00 /usr/sbin/sntpc -v -i 30 127.0.0.1
 1236 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id socks -address /run/containerd/containerd.sock
 1352 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id syn-filter -address /run/containerd/containerd.sock
 1392 root      0:00 /usr/bin/syn-filter -server-path /run/guest-services/nat-policy.sock
 1475 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id trim-after-delete -address /run/containerd/containerd.sock
 1512 root      0:00 /usr/bin/trim-after-delete -- /sbin/fstrim /var/lib/docker
 1527 root      0:00 [init]
 1569 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id volume-contents -address /run/containerd/containerd.sock
 1603 root      0:00 /usr/bin/volume-contents
 1658 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id vpnkit-forwarder -address /run/containerd/containerd.sock
 1677 root      0:00 /usr/bin/vpnkit-forwarder -data-connect /run/host-services/vpnkit-data.sock -data-listen /run/guest-services/wsl2-expose-ports.sock
 1708 root      0:00 /usr/bin/containerd-shim-runc-v2 -namespace services.linuxkit -id vpnkit-tap-vsockd -address /run/containerd/containerd.sock
 1727 root      0:00 /sbin/vpnkit-tap-vsockd --post-up-script /etc/network/post-up-script.sh --path /run/host-services/vpnkit.sock --tap eth0 --message-size 81
 1778 root      0:00 [init]
 1808 root      0:00 /sbin/vpnkit-tap-vsockd --post-up-script /etc/network/post-up-script.sh --path /run/host-services/vpnkit.sock --tap eth0 --message-size 81
 1827 root      0:00 [init]
 1829 root      0:00 [init]
 1831 root      0:00 [init]
 1833 root      0:00 [init]
 1835 root      0:00 [init]
 1837 root      0:00 [init]
 1839 root      0:00 [init]
 1841 root      0:00 [init]
 1843 root      0:00 [init]
 1845 root      0:00 /init
 1846 root      0:00 /init
 1847 root      0:00 -sh
 1848 root      0:00 [init]
 1850 root      0:00 [init]
 1853 root      0:00 [init]
 1862 root      0:00 [init]
 1864 root      0:00 [init]
 1866 root      0:00 [init]
 1877 root      0:00 [init]
 1879 root      0:00 [init]
 1882 root      0:00 [init]
 1885 root      0:00 [init]
 1887 root      0:00 [init]
 1889 root      0:00 [init]
 1891 root      0:00 [init]
 1893 root      0:00 [init]
 1895 root      0:00 [init]
 1897 root      0:00 [init]
 1899 root      0:00 [init]
 1901 root      0:00 [init]
 1903 root      0:00 [init]
 1905 root      0:00 [init]
 1907 root      0:00 [init]
 1909 root      0:00 [init]
 1911 root      0:00 [init]
 1913 root      0:00 [init]
 1915 root      0:00 [init]
 1917 root      0:00 [init]
 1920 root      0:00 [init]
 1922 root      0:00 [init]
 1924 root      0:00 [init]
 1926 root      0:00 [init]
 1928 root      0:00 [init]
 1930 root      0:00 [init]
 1932 root      0:00 [init]
 1934 root      0:00 [init]
 1936 root      0:00 [init]
 1938 root      0:00 [init]
 1940 root      0:00 [init]
 1942 root      0:00 [init]
 1944 root      0:00 [init]
 1946 root      0:00 [init]
 1948 root      0:00 [init]
 1950 root      0:00 [init]
 1952 root      0:00 [init]
 1954 root      0:00 [init]
 1957 root      0:00 [init]
 1959 root      0:00 [init]
 1961 root      0:00 [init]
 1963 root      0:00 [init]
 1965 root      0:00 [init]
 1967 root      0:00 [init]
 1969 root      0:00 [init]
 1971 root      0:00 [init]
 1973 root      0:00 [init]
 1975 root      0:00 [init]
 1977 root      0:00 [init]
 1979 root      0:00 [init]
 1981 root      0:00 [init]
 2006 root      0:00 [init]
 2017 root      0:00 [init]
 2019 root      0:00 [init]
 2031 root      0:00 [init]
 2034 root      0:00 [init]
 2036 root      0:00 [init]
 2038 root      0:00 [init]
 2040 root      0:00 [init]
 2047 root      0:00 [init]
 2049 root      0:00 [init]
 2051 root      0:00 [init]
 2054 root      0:00 [init]
 2056 root      0:00 [init]
 2058 root      0:00 [init]
 2060 root      0:00 [init]
 2062 root      0:00 [init]
 2064 root      0:00 [init]
 2066 root      0:00 [init]
 2068 root      0:00 [init]
 2070 root      0:00 [init]
 2075 root      0:00 [init]
 2077 root      0:00 [init]
 2079 root      0:00 [init]
 2081 root      0:00 [init]
 2085 root      0:00 [init]
 2087 root      0:00 [init]
 2089 root      0:00 [init]
 2091 root      0:00 [init]
 2093 root      0:00 [init]
 2096 root      0:00 [init]
 2098 root      0:00 [init]
 2100 root      0:00 [init]
 2102 root      0:00 [init]
 2104 root      0:00 [init]
 2106 root      0:00 [init]
 2108 root      0:00 [init]
 2112 root      0:00 [init]
 2114 root      0:00 [init]
 2116 root      0:00 [init]
 2118 root      0:00 [init]
 2120 root      0:00 [init]
 2122 root      0:00 [init]
 2124 root      0:00 [init]
 2126 root      0:00 [init]
 2128 root      0:00 [init]
 2130 root      0:00 [init]
 2132 root      0:00 [init]
 2134 root      0:00 [init]
 2136 root      0:00 [init]
 2138 root      0:00 [init]
 2140 root      0:00 [init]
 2142 root      0:00 [init]
 2144 root      0:00 [init]
 2146 root      0:00 [init]
 2148 root      0:00 [init]
 2150 root      0:00 [init]
 2152 root      0:00 [init]
 2154 root      0:00 [init]
 2156 root      0:00 [init]
 2158 root      0:00 [init]
 2160 root      0:00 [init]
 2162 root      0:00 [init]
 2164 root      0:00 [init]
 2166 root      0:00 [init]
 2168 root      0:00 [init]
 2170 root      0:00 [init]
 2172 root      0:00 [init]
 2174 root      0:00 [init]
 2176 root      0:00 [init]
 2178 root      0:00 [init]
 2180 root      0:00 [init]
 2182 root      0:00 [init]
 2185 root      0:00 [init]
 2187 root      0:00 [init]
 2189 root      0:00 [init]
 2191 root      0:00 [init]
 2193 root      0:00 [init]
 2195 root      0:00 [init]
 2197 root      0:00 [init]
 2199 root      0:00 [init]
 2201 root      0:00 [init]
 2203 root      0:00 [init]
 2205 root      0:00 [init]
 2207 root      0:00 [init]
 2209 root      0:00 [init]
 2211 root      0:00 [init]
 2215 root      0:00 [init]
 2217 root      0:00 ps uxa

My other WSL distro does not do this.
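To compare this between restarts or between distros without scrolling through the full listing, the output can be tallied by command name. This is a rough Python sketch with a made-up function name; it assumes the four-column layout shown above (PID, USER, TIME, COMMAND) and counts only the first word of each command.

```python
from collections import Counter


def count_commands(ps_output):
    """Count processes by the first word of the COMMAND column."""
    counts = Counter()
    for line in ps_output.splitlines():
        # Split into PID, USER, TIME and the rest of the line (the command).
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header, shell prompt line, or blank line
        counts[fields[3].split()[0]] += 1
    return counts


if __name__ == "__main__":
    sample = """PID   USER     TIME  COMMAND
    1 root      0:00 /init
   42 root      0:00 [init]
   44 root      0:00 [init]
   50 root      0:00 /sbin/init
 2217 root      0:00 ps uxa
"""
    # Show the most frequent command names first.
    for command, count in count_commands(sample).most_common():
        print(count, command)
```

Feeding it the saved output of ps uxa a few minutes apart shows whether the [init] count keeps climbing or levels off.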

d-fischer commented 1 year ago

Just updated to 4.18.0 (104112) and what I wrote above seems to not happen anymore.

Not sure if it's related to the initial issue, but you might want to re-check.

domzim commented 1 year ago

I'm having the same issue after updating Docker Desktop from 4.10 to 4.18.0 (104112). There are no containers running, but 2 minutes after starting Docker Desktop the RAM limit of 8 GB that I configured is reached.