Open karlshea opened 2 years ago
Facing the same issue with a Ruby on Rails project. Moving away from NFS mount removes the problem but we would need NFS for the project to work efficiently.
FYI:
adding ,wsize=32768,rsize=3276
to our docker NFS mount options seems to fix this issue.
nfsmount_xdebug:
  driver: local
  driver_opts:
    type: nfs
    o: addr=host.docker.internal,rw,nolock,hard,nointr,nfsvers=3,wsize=32768,rsize=3276
    device: ":${PWD}/xdebug"
Edit: unexpectedly, the protocol maximum also seems to work: wsize=65536,rsize=65536
Following @noud-github's suggestion, adding the wsize and rsize options fixes the issue for me as well, although I went with ,wsize=32768,rsize=32768 (versus 3276 for rsize).
I think wsize/rsize might just be masking the problem. Running md5sum on a 100MB zip file through Docker still hangs, while it succeeds on a normal NFS mount on another Mac.
@karlshea Using wsize=65536,rsize=65536 I can run md5sum 100MB.zip on that NFS share in Docker.
Did you remove the "NFS Volume" shown by docker volume list after editing the options?
If you don't (or forget, like I did the first time), the new settings are not applied.
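For anyone following along, the volume recreation step described above can be sketched roughly as below. This is a minimal sketch, not an official procedure: recreate_volume and the DOCKER variable are hypothetical helper names, and the volume name passed in must be whatever docker volume ls shows on your machine. It assumes the project is managed with docker compose and that the function is called from the project directory.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to preview the commands.
DOCKER=${DOCKER:-docker}

# Recreate a named volume so that edited driver_opts take effect.
# $1 is the full volume name as listed by `docker volume ls`.
recreate_volume() {
  # Stop and remove the containers that still use the volume.
  $DOCKER compose down || return 1
  # Drop the volume that was created with the old mount options.
  $DOCKER volume rm "$1" || return 1
  # Compose recreates the volume with the options now in the YAML file.
  $DOCKER compose up -d
}

# Example call (hypothetical volume name):
#   recreate_volume myproject_nfsmount_xdebug
```

Setting DOCKER=echo before calling the function prints the three commands instead of running them, which is a cheap way to check what would happen first.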
@noud-github You're right, I didn't! wsize=32768,rsize=32768 does indeed fix it for me. It looks like 32768 is the default for recent distros, so I'm curious what the Docker driver is using instead.
I still believe this is covering up a deeper issue (why are smaller values breaking?), but at least it's fixing the immediate problem.
@karlshea I think you are right in assuming this is just covering up a deeper issue. The fact that this is not an issue across two systems is what pointed me toward this solution in the first place. When you connect to another system, you use two "real" network stacks, including buffer sizes etc., but if you run this on your local system you use the "local interface", and this is not the first time I have seen "unexpected" behavior when only the "local interface" is involved. So my guess would be that Ventura has some "bug" or feature in the local interface stack that is triggered when wsize and rsize are not set in NFS.
macOS defaults (from man mount_nfs) are 8192 for UDP mounts and 32768 for TCP mounts.
Additional note from the wsize param: "Note that both the rsize and wsize options should only be used as a last ditch effort at improving performance when mounting servers that do not support TCP mounts."
nfsstat -m for a Mac-to-Mac NFS mount using all default options (mount -t nfs server-mac:/server-path directory):
General mount flags: 0x4000018 nodev,nosuid,multilabel NFS parameters: vers=3,tcp,port=2049,nomntudp,hard,nointr,noresvport,negnamecache,callumnt,locks,quota,rsize=32768,wsize=32768,readahead=16,dsize=32768,rdirplus,nodumbtimer,timeo=10,maxgroups=16,acregmin=5,acregmax=60,acdirmin=5,acdirmax=60,nomutejukebox,nonfc,sec=sys
I tried to find defaults by mounting with no options other than addr:
volumes:
  nfsmount-repo:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=host.docker.internal"
      device: ":/Users/karl/Sites/nfs-test"
Then got into the Docker VM using justincormack/nsenter1 and ran mount:
/Users/karl/Sites/nfs-test on /var/lib/docker/volumes/nfs-test_nfsmount-repo/_data type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.65.2,mountvers=3,mountproto=tcp,local_lock=none,addr=192.168.65.2
Which is pretty strange, since our fixes actually seem to be setting the values lower. man mount_nfs says:
The default read and write sizes are 8K when using UDP, and 32K when using TCP. Values over 16K are only supported for TCP, where 2M is the maximum.
Any value over 32K is unlikely to get you more performance, unless you have a very fast network.
If the network interface cannot handle larger packet sizes or a long train of back to back packets, you may see low performance figures or even temporary hangups during NFS activity.
This seems to possibly point to the root cause.
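To compare what each side actually ended up with, the rsize/wsize values can be pulled out of mount output with a small filter. A minimal sketch, assuming a POSIX shell and a grep that supports -o and -E; nfs_sizes is a hypothetical helper name, and the sample line is a shortened copy of the output above.

```shell
# Print the rsize/wsize options found in mount-style output read from stdin.
# The leading "(" or "," is matched so that options such as dsize= are skipped.
nfs_sizes() {
  grep -oE '(^|[(,])[rw]size=[0-9]+' | tr -d '(,'
}

# Sample line shortened from the Docker VM output quoted above.
sample='/Users/karl/Sites/nfs-test on /var/lib/docker/volumes/nfs-test_nfsmount-repo/_data type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp)'
printf '%s\n' "$sample" | nfs_sizes
```

Inside the Docker VM this would be used as mount | nfs_sizes, and on the Mac side nfsstat -m | nfs_sizes should work the same way since the options are comma separated there too.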
Looks like Docker is trying to use a pretty high default packet size. That, in combination with:
If the network interface cannot handle larger packet sizes or a long train of back to back packets, you may see low performance figures or even temporary hangups during NFS activity.
makes a solid case for using a lower packet size, since we found Ventura's "local interface" doing just that. Makes you wonder what they changed there ;-)
Those sizes are supposed to be powers of 2 according to the man page. I tested up to 262144 (md5sum hangs); the biggest that worked was 131072.
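The search over sizes described above could be scripted by first generating the candidate option strings. A rough sketch: candidate_sizes is a hypothetical helper, and it only prints the options. Each one would still have to be put into the o: line, followed by recreating the volume and rerunning md5sum.

```shell
# Print wsize/rsize option pairs for every power-of-two step
# from $1 up to and including $2 ($1 is assumed to be a power of two).
candidate_sizes() {
  size=$1
  while [ "$size" -le "$2" ]; do
    printf 'wsize=%s,rsize=%s\n' "$size" "$size"
    size=$((size * 2))
  done
}

# Range covering the values mentioned in this thread.
candidate_sizes 8192 262144
```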
Oh I forgot - I had tested this problem with Colima in https://github.com/drud/ddev/issues/4122#issuecomment-1272648461 - So this is not strictly a Docker Desktop issue I don't think.
So this is not strictly a Docker Desktop issue I don't think.
I am using Rancher Desktop, so I can concur on that.
The wsize=8192,rsize=8192 options fix my issue too.
@noud-github
Fuck this! I spent the whole day trying to find a solution to this problem. Thank you very much.
Could you explain how you came up with the solution and how you investigated the problem? Thank you.
Could you explain how you came up with the solution and how you investigated the problem? Thank you.
All of the investigation is in this issue and drud/ddev#4122. I believe all of us looking into it thought we were raising the Docker defaults to fix the problem, but it turns out we were lowering them.
FWIW: I'm seeing nfsd send error 40 and md5sum bigfile hanging with a Ventura NFS server and Raspberry Pi clients over WiFi (no Docker involved). It was reliable before Ventura. So this looks like a macOS problem.
wsize=65536,rsize=65536 works for me.
Exactly what @Carpenter0100 says ;). How did you come up with those params @noud-github ?
The recently released Ventura 13.1 now has issues with wsize=32768,rsize=32768 and fails to use them:
nfssvc_addsock: socket buffer setting error(s) 22
Error code 22 is EINVAL.
Bumping the NFS socket buffer to 65536 solves the issue for me.
Same issue with a Magento 2 project using a huge number of Composer dependencies.
Using wsize=65536,rsize=65536 fixed the issue.
Once YAML file is updated, do not forget to:
@Krilo89
Actually, it was down to more than two decades of experience in DevOps. I cannot find the git issue, but there was a Ventura/Docker/NFS issue where someone mentioned that running NFS on one laptop and Docker on the other did not have the problem. That reminded me of a two-decade-old issue on Windows where the MTU size was not respected/applied by the local interface (lo), breaking iSCSI on a local system. So I actually only googled NFS and packet size to find the solution.
[edit] The working cross-Mac setup came from this thread: https://github.com/drud/ddev/issues/4122#issuecomment-1294862469
@Carpenter0100 see above
There hasn't been any activity on this issue for a long time.
If the problem is still relevant, mark the issue as fresh with a /remove-lifecycle stale comment.
If not, this issue will be closed in 30 days.
Prevent issues from auto-closing with a /lifecycle frozen comment.
/lifecycle stale
/remove-lifecycle stale /lifecycle frozen
There are workarounds, but with out of the box defaults it's broken. If anyone from the Docker org bothered replying to shed any light on this situation maybe it could move more towards "fixed".
Has anybody encountered a drastic decrease in performance (both r/w) of nfs mounts in 13.4? Installing our CI takes over ~40 minutes instead of ~5 (both M1 and M1 Pro)
I was debugging if the docker version caused this, however with another Mac with 13.3.1 no performance issues were noticeable.
To all of you, do NOT update to 13.4! The VirtIO FS does not have any performance issues but this problem #6820 seems to be present.
Expected behavior
Everything works normally.
Actual behavior
Container will hang. If caught within a second or two, ^C can quit the container if it's running in the foreground (docker compose up). Otherwise Docker itself can hang to the point where the process needs to be killed.
nfsd send error 40 also appears in the macOS Console.
Information
Output of /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check
(Pi-hole is blocking api.segment.io)
Steps to reproduce the behavior
cat/md5sum: https://github.com/drud/ddev/issues/4122#issuecomment-1294862469
Plain NFS access from another Mac seems to work normally.
Related issue: drud/ddev#4122