docker / roadmap

Welcome to the Public Roadmap for All Things Docker! We welcome your ideas.
https://github.com/docker/roadmap/projects/1
Creative Commons Zero v1.0 Universal

[Docker Desktop] Improve Mac File system performance #7

Closed nebuk89 closed 2 months ago

nebuk89 commented 4 years ago

Update Feb 6, 2024 - Released as part of Docker Desktop 4.27 - https://www.docker.com/blog/announcing-synchronized-file-shares/

Update Nov 9, 2023 - As announced in June, Docker has acquired Mutagen IO, Inc. We are hard at work integrating it into Docker Desktop and working to roll it out as part of a limited early access program.

Update: we are now looking at using gRPC FUSE rather than Mutagen as a simpler path to performance improvement.

Tell us about your request: Integrate the Mutagen plugin within Docker Desktop to provide users with a file caching option to improve performance on modern web frameworks like PHP Symfony.

Which service(s) is this request for? Desktop

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? File system performance is a big issue on Mac. Our goal is to improve web page refresh for web languages like PHP from 2.8 seconds to 0.2 seconds.

Are you currently working around this issue? N/A

Additional context: N/A

Attachments: https://github.com/docker/for-mac/issues/77

daveisfera commented 4 years ago

These issues are still happening, so can they be added to this or new cards made to include them? https://github.com/docker/for-mac/issues/1592 https://github.com/docker/for-mac/issues/1899 https://github.com/docker/for-mac/issues/2417 https://github.com/docker/for-mac/issues/3186

leehambley commented 4 years ago

I hope leaving a note here is not unwelcome to highlight a particular difficulty in booting Rails applications in Docker on macOS.

Rails hot code reloading has a semi-hard dependency on working filesystem events (the polling alternative can eat as much as 20% of system CPU). Bootsnap is a caching loader that tries to speed up booting Rails by bypassing the parsing step and caching compiled bytecode on disk. As a result, booting a Rails app with Bootsnap loaded sometimes writes as many files as it reads, while at other times (e.g. a warm start) it has a very different read/write ratio.

None of the existing methods account for this very well. Even the relatively new mount options, which try to boost the performance of either reads or writes, don't particularly help because of the relative symmetry under certain configurations.

Most places I know are still using NFS, with polling, and trying to tune the fsattr cache times to get a reasonable balance of developer experience (speed of code reloading) vs. performance (outright speed), but it's a tightrope.

nebuk89 commented 4 years ago

hey @daveisfera we haven't got this quite out onto edge yet :) once we do we will move this over to developer/preview and update this ticket. Hopefully then people on those tickets should see some movement!

mbrioski commented 4 years ago

@nebuk89 could u clarify caching? Because at the moment the problem is the slow sync from host machine to docker containers

Jean85 commented 4 years ago

@mauri-brioschi With PHP/Symfony apps sometimes the reverse happens: the app itself rewrites a bunch of files in its own var/cache dir, where it dumps a lot of stuff (container definitions, configuration...), and this slows the machine down a lot. You could try to exclude that folder from the sync, but you would lose some IDE functionality that relies on that folder.

mbrioski commented 4 years ago

@Jean85 i know the problem and it happens more with "Frameworks" like Magento where there is "generated" code. Usually i exclude these directories so there's no need for sync. Does caching these directories really solve the problem? In the end, if u flush the cache and u are synchronizing your volumes, your folder must be synced again.

markshust commented 4 years ago

Artifacts and caching directories like var/cache, generated, node_modules, vendor, etc. should basically persist on the container, however the issue that @Jean85 pointed out that you miss IDE functionality is definitely valid. As a developer you want assets like these to be generally available, and don't want to have to worry about this folder becoming "out of sync" with the host. I wrote a blog post about all of this at https://markshust.com/2018/12/30/docker-mac-filesystem-volume-mount-approach-performance/ which may be useful to others dealing with filesystem performance issues. Magento 2 seems to be an ideal use-case to test the folder sync issues.

Mutagen offers different modes https://mutagen.io/documentation/synchronization and ideally we should have access to all of these modes. I believe the "one way" is more performant than "two way" since it only needs to check the filesystem changes in one direction. We'd also need to control whether the host or container is alpha or beta. Consistency also probably isn't required for these artifacts directories, it just needs to be "generally there". Being able to control volume mounts on these different modes is crucial to achieving maximum performance.

mbrioski commented 4 years ago

What about mounting NFS volumes? It is the only working solution i have found, i.e. for Magento2. No configuration is required and you can still use delegated, cached ...

markshust commented 4 years ago

@mauri-brioschi osxfs is (much) more performant than nfs in my experience. i haven't had any issues with osxfs as long as i am strategic in the volumes i mount.

strayer commented 4 years ago

I'd like to add that in our case it is important to not sync folders like node_modules or Elixir's build and deps folders back to the host, especially because of IDE support. Elixir's folders contain architecture-specific compiled files that will break IDE functionality because many plugins need to execute these files. While file system performance is one of our most prominent Docker dev issues, the requirement of being able to exclude folders from syncing is very important too.

The obvious workaround would be to simply mount another volume on top of the shared app volume for node_modules and such, but we encountered strange issues with that too. There are filed issues for those problems, but I can't find them right now. I think the issue was that the volumes for node_modules and so on suddenly didn't mount anymore.
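The overlay workaround described above can be sketched as follows. This is a minimal illustration, not a tested recipe: `overlay_mounts` is a hypothetical helper, `app_node_modules` is a made-up volume name, and the `/src` mount point and `node:12` image are borrowed from elsewhere in this thread. The second `-v` flag shadows `/src/node_modules` with a named volume, so those files stay inside the Docker VM and are never shared back to the host.

```shell
#!/bin/sh
# overlay_mounts APP_DIR VOLUME_NAME
# Prints the -v flags that bind-mount APP_DIR at /src and shadow
# /src/node_modules with a named volume (hypothetical helper).
# Assumes APP_DIR contains no spaces, because the result is word-split.
overlay_mounts() {
    echo "-v $1:/src -v $2:/src/node_modules"
}

# Dry run: show the flags without starting a container.
overlay_mounts "$(pwd)" app_node_modules

# Possible real use (left commented out so the sketch has no side effects):
# docker run -it $(overlay_mounts "$(pwd)" app_node_modules) -w /src node:12 yarn
```

The trade-off is the one raised above: the host (and therefore the IDE) no longer sees the contents of node_modules.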

rfay commented 4 years ago

Just a note that all the docker-sync approaches we've tried have failed, and I'd be very suspicious of mutagen. Although the performance was great, the reliability was zero.

mbrioski commented 4 years ago

@markshust i've been following this issue https://github.com/docker/for-mac/issues/1592 for a while, and NFS is definitely not slower than osxfs... we are using it in all Magento2 projects and the performance is not comparable at all. We save minutes with NFS.

cweagans commented 4 years ago

@mauri-brioschi it depends on what you're using it for. For PHP projects that all have a zillion tiny files that need to be read every time you send a request, osxfs is definitely not as good. IIRC, for small numbers of files, it does pretty well.

ifeltsweet commented 4 years ago

We've tried mutagen on macOS and it worked well but we've experienced syncing issues with Windows. Eventually, we chose not to use mutagen in our team since the difference in setups between Windows and macOS slowed us down more (in terms of maintenance) than a less performant "delegated" mode.

Hopefully, a native integration will make this as simple as adding a flag to your volume.

ajardin commented 4 years ago

I'm very excited to have a native implementation of Mutagen within Docker Desktop, as my company and I have been using it for over a year!

We are working on both Magento and Symfony, so we are heavily impacted by the famous file system performance issues... We tried several approaches (Docker Machine NFS, docker-sync, exclude cache/vendors from synchronization, etc.), but the most successful by far is Mutagen, at least for us.

Although having to use a third party solution in addition to Docker is still an issue from my point of view.

nebuk89 commented 4 years ago

We've tried mutagen on macOS and it worked well but we've experienced syncing issues with Windows. Eventually, we chose not to use mutagen in our team since the difference in setups between Windows and macOS slowed us down more (in terms of maintenance) than a less performant "delegated" mode.

Hopefully, a native integration will make this as simple as adding a flag to your volume.

This is exactly the experience we want! You should be able to check a volume in the Desktop settings and get it cached :) Initially this will just be released on Mac (as our first preview) due to the point some of you have highlighted that behaviour in Windows differs. Our end goal is to ideally hide this difference so it 'just works with a check box'.

nebuk89 commented 3 years ago

This is now out on Docker Desktop Edge, we are after feedback on the UX, the functionality and the performance!

jasonwilliams commented 3 years ago

For those of you having permission issues since switching to Edge there's an issue here: https://github.com/docker/for-mac/issues/4593

nebuk89 commented 3 years ago

@jasonwilliams acknowledged, Dave has looked at the issue and we are looking :)

DanielSchwiperich commented 3 years ago

Tested a symfony app

Before Edge:

with no vendor and var overlay volume ~ 2.8s with overlays: ~ 2.1s

Edge with caching activated for project folder:

with no vendor and var overlay volume ~ 1.8s with overlays: ~ 1.7s

Good results 👍

petr-ujezdsky commented 3 years ago

I have been monitoring IO performance for some time and for that I have created a simple IO testing project available at my github. It might be useful for you too

https://github.com/petr-ujezdsky/docker-io-test

nebuk89 commented 3 years ago

@djs55 ☝️

For others: https://docker.events.cube365.net/docker/dockercon/content/Videos/92BAM7vob5uQ2spZf This is a video of @djs55 giving an overview of the changes we have made and how the file system also works on Windows :)

leehambley commented 3 years ago

Here's the same video via YouTube if you prefer to watch without giving your contact information to Docker's commercial team:

ghost commented 3 years ago

@leehambley nice one, thanks!!

andig commented 3 years ago

Please allow me to highlight https://github.com/docker/for-mac/issues/3499 and https://github.com/docker/for-mac/issues/3499#issuecomment-623960890 which suspects a concrete stuck loop in https://github.com/moby/hyperkit/blob/79c6a4d95e3f8a59f774eb66e3ea333a277292c6/src/lib/mirage_block_ocaml.ml#L422. Would this be replaced by mutagen or should we open a new issue here to track progress on this assumption?

Update sorry, think I misread this issue. It is apparently about caching, not about fixing the infamous 100% CPU issue. That is in https://github.com/docker/roadmap/issues/12.

nebuk89 commented 3 years ago

@andig closing off this chat on issue #12: this has been acknowledged and has started internal discussions in relation to this issue :)

stephen-turner commented 3 years ago

For everyone following this ticket, we'd love you to try out our latest Edge release, 2.3.5.0. You can get it at https://desktop.docker.com/mac/edge/47376/Docker.dmg.

The largest change is that (by default) it uses gRPC-FUSE instead of osxfs. This brings a big improvement in file sharing speed (as well as CPU usage). So if you're interested in this issue, we'd love to get your feedback on it.

markshust commented 3 years ago

@stephen-turner for which volume mounts (delegated, cached, both, or all?) does that filesystem update apply to?

djs55 commented 3 years ago

@markshust the new file sharing implementation applies to all volume mounts irrespective of consistency flags. It should be particularly helpful for scenarios where there are lots of files changing on the host (for example uncompressing a large archive or switching git branches).

stephen-turner commented 3 years ago

@djs55 I'm embarrassed not to know this, but are there any differences between consistent, cached and delegated if you are using gRPC-FUSE?

yos1p commented 3 years ago

I updated my Docker for Mac to version 2.3.5.0 on the Edge channel, and I notice a significant performance drop going from Mutagen to the FUSE server. This makes my PHP app almost as slow as with legacy osxfs file sharing. Does anyone know why, or have a workaround?

hbouhadji commented 3 years ago

@yos1p could you try this and post your results here, just for information:

without mutagen

docker run -it -v /private/tmp/www:/var/www alpine time dd if=/dev/zero of=/var/www/test.dat bs=1024 count=100000

with mutagen (delegated flag)

docker run -it -v /private/tmp/www:/var/www:delegated alpine time dd if=/dev/zero of=/var/www/test.dat bs=1024 count=100000

I'm still with the v2.3.4.0 and my results are:

# without mutagen
100000+0 records in
100000+0 records out
real    0m 18.51s
user    0m 0.15s
sys 0m 2.42s

# with mutagen
100000+0 records in
100000+0 records out
real    0m 0.35s
user    0m 0.05s
sys 0m 0.29s
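
For scale, the two "real" timings above can be converted to write throughput. A minimal sketch, assuming dd wrote the full 1024 × 100000 bytes in both runs and taking 1 MB as 1,000,000 bytes; `throughput_mb_s` is a hypothetical helper name.

```shell
#!/bin/sh
# Bytes written by the dd command above: bs=1024, count=100000.
BYTES=$((1024 * 100000))

# throughput_mb_s SECONDS
# Prints MB/s (1 MB = 1,000,000 bytes) rounded to one decimal place.
throughput_mb_s() {
    awk -v bytes="$BYTES" -v secs="$1" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

throughput_mb_s 18.51   # without mutagen             -> 5.5
throughput_mb_s 0.35    # with mutagen (delegated)    -> 292.6
```

By these numbers the mutagen-backed mount wrote roughly 50 times faster in this single sequential-write test, which says little about workloads made of many small files.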

yos1p commented 3 years ago

I've enabled gRPC FUSE for file sharing (checkbox selected), and then make sure that /private is available in File Sharing to bind.

docker run -it -v /private/tmp/www:/var/www alpine time dd if=/dev/zero of=/var/www/test.dat bs=1024 count=100000

100000+0 records in
100000+0 records out
real    2m 16.33s
user    0m 0.90s
sys 0m 8.30s

With gRPC FUSE for File Sharing unselected:

100000+0 records in
100000+0 records out
real    0m 46.76s
user    0m 0.48s
sys 0m 2.77s

itaylor commented 3 years ago

@stephen-turner @djs55 is the new gRPC FUSE file system something that can be open sourced? There are a lot of talented engineers who use D4M who might be able to help improve its performance if they could build it locally. I think a lot of the frustration with osxfs on the part of the community was from our inability to address its shortcomings/bugs without someone from Docker needing to do all the work.

itaylor commented 3 years ago

Here are some results from using various recent versions/configurations of Docker Desktop for Mac Edge. These are run against a real-world large Node.js/webpack project on a real-world usage scenario that developers on the project do somewhat frequently: a "clean build" that removes node_modules, re-installs all dependencies, and rebuilds the codebase from source. These are run on a 2019 MacBook Pro with 12 logical (6 physical) CPU cores and 16GB RAM. Docker is allocated 8 cores and 12GB RAM.

As you can see below, the slowest option is the new gRPC FUSE option from the current edge build. It is ~1.5x slower than osxfs on the same version, and ~12.5x slower than native macOS. It took 30 minutes to run something that takes under 2.5 minutes on native macOS. At first I didn't believe the new gRPC FUSE implementation would be slower than osxfs, so I ran the gRPC FUSE option several times and got similar results each time. It is definitely slower than osxfs on this use case.

The Mutagen option from the previous edge build took half the time of the :cached on the same version of Docker Desktop, and was the best performing docker version, taking 4x longer than native, although when I checked the docker preference pane after the test, it displayed "Error" for that mount, I think indicating that the filesystem was no longer in sync between Mac and Linux, so perhaps it's unfair to count that as having "worked".

I do notice moderately reduced CPU usage when docker is idle with gRPC FUSE vs osxfs. When under load the CPU usage seems roughly similar with either com.docker.osxfs or com.docker.backend showing ~90-150% in Activity Monitor.

Here's the data:

Docker Desktop Mac 2.3.4.0 (46980) OSXFS no cache/no mutagen sync

$ docker run -it -v $(pwd):/src -w /src node:12 /bin/bash -c "time ( rm -rf node_modules && yarn && yarn build)"
real    20m23.226s
user    5m53.208s
sys 5m58.094s

Docker Desktop Mac 2.3.4.0 (46980) OSXFS cached/no mutagen sync

$ docker run -it -v $(pwd):/src:cached -w /src node:12 /bin/bash -c "time( rm -rf node_modules && yarn && yarn build)"
real    19m42.609s
user    7m2.639s
sys 6m26.999s

Docker Desktop Mac 2.3.4.0 (46980) OSXFS delegated/with mutagen sync (note: upon finishing mutagen shows "Error" in File Sharing dashboard, but code built successfully?)

$ docker run -it -v $(pwd):/src:delegated -w /src node:12 /bin/bash -c "time( rm -rf node_modules && yarn && yarn build)"
real    8m41.692s
user    7m29.828s
sys 6m7.284s

Docker Desktop Mac 2.3.5.0 (47376) gRPC FUSE enabled in General preferences

$ docker run -it -v $(pwd):/src:delegated -w /src node:12 /bin/bash -c "time( rm -rf node_modules && yarn && yarn build)"
real    30m1.625s
user    8m33.623s
sys 8m6.792s

Docker Desktop Mac 2.3.5.0 (47376) gRPC FUSE disabled in General preferences, presumably using "cached" semantics because :delegated is not supported by OSXFS

$ docker run -it -v $(pwd):/src:delegated -w /src node:12 /bin/bash -c "time( rm -rf node_modules && yarn && yarn build)"
real    20m40.508s
user    7m41.085s
sys 6m56.079s

Baseline: native Mac OS: Node 12.18.3

$ time (rm -rf node_modules && yarn && yarn build)
real    2m24.297s
user    2m4.654s
sys 2m37.918s
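
The "real" timings listed above can be turned into slowdown factors relative to the native baseline. A minimal sketch; `to_seconds` and `slowdown` are hypothetical helper names, and the inputs are copied from the figures above.

```shell
#!/bin/sh
# to_seconds VALUE
# Converts a `time`-style value such as "20m23.226s" to seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# slowdown A B
# Prints how many times longer A took than B, rounded to one decimal place.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

NATIVE=2m24.297s
slowdown 20m23.226s "$NATIVE"   # 2.3.4.0 osxfs, no cache          -> 8.5
slowdown 19m42.609s "$NATIVE"   # 2.3.4.0 osxfs, :cached           -> 8.2
slowdown 8m41.692s  "$NATIVE"   # 2.3.4.0 :delegated with mutagen  -> 3.6
slowdown 30m1.625s  "$NATIVE"   # 2.3.5.0 gRPC FUSE enabled        -> 12.5
slowdown 20m40.508s "$NATIVE"   # 2.3.5.0 gRPC FUSE disabled       -> 8.6
```

Comparing the two 2.3.5.0 runs directly (`slowdown 30m1.625s 20m40.508s`) gives 1.5, the gap between gRPC FUSE and osxfs on the same version.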

cweagans commented 3 years ago

I don't understand why gRPC or FUSE are in the mix at all. Why is the Docker team continuing to invent new solutions to old, mostly solved problems? Plain old NFS with an FS event forwarder + maybe cachefilesd if you want to get fancy outperforms both osxfs and gRPC FUSE as far as I can tell. It worked just fine back in the Vagrant days and is a sane default for the product IMO.

mtibben commented 3 years ago

Our whole team still uses docker-machine and NFS for this reason @cweagans. It just works

leehambley commented 3 years ago

It should be pointed out that NFS does not in-fact "just work", unless I'm missing something you have to configure the host exports manually outside Docker, and then trade-off the caching, attribute sync, and other properties to achieve a reasonable performance. Perhaps (as I suspect) attribute sync disabled, plus an event forwarder is a decent approximation of a fast, reliable filesystem but it's no silver bullet.

FernandoMiguel commented 3 years ago

It should be pointed out that NFS does not in-fact "just work", unless I'm missing something you have to configure the host exports manually outside Docker, and then trade-off the caching, attribute sync, and other properties to achieve a reasonable performance. Perhaps (as I suspect) attribute sync disabled, plus an event forwarder is a decent approximation of a fast, reliable filesystem but it's no silver bullet.

@leehambley all I had to do in the past was restart nfs on the host Mac and that was it. I had a custom docker-compose override so that the binds were NFS on Mac vs. local binds on Linux. It was much faster than anything available at the time, with the benefit of not needing an initial sync like Mutagen does. It worked mostly problem-free for years with 30 devs. Only one would have occasional issues when switching git branches on the host, which would blow up the running container.

leehambley commented 3 years ago

So we run a similar setup, we have 20+ devs, and a couple of them use linux as the host OS. We have to run custom docker-compose.dev..yml files which are loaded on top of the docker-compose.dev.yml files, which, for the mac users redefine the NFS mounts.

In CI, and on platforms with proper sharable filesystems, we don't want to run NFS, so we need to maintain a dual config. We also run with this config that one of our cherished elders once devised a lifetime ago, I'm not sure what all the knobs and dials actually do (I think the last two options are trying to compensate for lack of filesystem events)

volumes:
  nfsmount:
    driver: local
    driver_opts:
      type: nfs
      o: addr=host.docker.internal,rw,nolock,hard,nointr,nfsvers=3,acregmin=1,acdirmin=1
      device: ":${PWD}"
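
The dual-config layering described above could be selected automatically per host OS. A minimal sketch, assuming the base file is docker-compose.dev.yml as mentioned above; `docker-compose.dev.mac.yml` is a hypothetical stand-in for the macOS override, since its real name is not given here, and `compose_args` is a made-up helper. It relies on docker-compose accepting several `-f` files, with later files layered on top of earlier ones.

```shell
#!/bin/sh
# compose_args KERNEL_NAME
# Prints the -f arguments to pass to docker-compose for the given kernel name.
# docker-compose.dev.mac.yml is a hypothetical name for the NFS override file.
compose_args() {
    case "$1" in
        Darwin) echo "-f docker-compose.dev.yml -f docker-compose.dev.mac.yml" ;;
        *)      echo "-f docker-compose.dev.yml" ;;
    esac
}

# `uname -s` prints "Darwin" on macOS and "Linux" on Linux.
# Dry run: show which files would be used on this machine.
compose_args "$(uname -s)"

# Possible real use (the unquoted expansion is word-split on purpose):
# docker-compose $(compose_args "$(uname -s)") up
```

CI and Linux hosts then fall through to the base file only, so the NFS volume definition never applies there.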

markshust commented 3 years ago

The disparity in this thread is telling that no one is quite satisfied with any of the current filesystem methods. The open discussion here is great, but building in every option & solution into the Docker UI (aka making everybody happy) is a recipe for disaster. There is nothing preventing anyone from using NFS, FUSE, or Mutagen outside of the built-in GUI, today.

Perhaps D4M can make it easy to add any of these methods (and more, and any new possible future disk options) into the existing GUI in the form of an add-on or plugin?

If this thread continues much longer like this (or perhaps it's already there), this open discussion is going to turn counterproductive. The blanket "Improve Mac File system performance" title of this ticket is not well defined, and won't lead to any actionable outcome that will satisfy a majority.

itaylor commented 3 years ago

I agree with @markshust, all of the (current) FS options are non optimal for certain use cases. A plugin based architecture is what's needed, where developers can choose the FS option that fits their use case. Ideally, each File System plugin could be open sourced separately and evolve and improve on its own. We would then be able to have the debate about how to improve each of the file systems within its own repository, where the conversation is productive and focused, instead of having the debate here where it's one FS vs another, with the loser being removed from the product.

cweagans commented 3 years ago

@leehambley I would submit that modifying an existing text file on disk and telling nfsd to restart is a very low lift from an engineering standpoint (especially compared to implementing an entire new filesystem or re-inventing NFS but with gRPC and FUSE). Plain NFS with no tweaking to attribute sync, caching, etc performs better than osxfs at this point. The vast majority of NFS setups on vagrant were just NFS and it made a huge difference compared to vboxfs.

It may not be the perfect solution for every use case, but I would strongly assert at this point that it's a better default for the Docker for Mac product than anything else that has been proposed.
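
For readers unfamiliar with the "text file plus nfsd restart" step, it might look roughly like this on macOS. This is a sketch under assumptions not stated in this thread: that the file in question is /etc/exports, that the shared directory is /Users, and that uid 501 and gid 20 belong to the developer. `exports_line` is a hypothetical helper.

```shell
#!/bin/sh
# exports_line DIR USER_ID GROUP_ID
# Prints one /etc/exports entry that shares DIR and all of its subdirectories
# with localhost only, mapping every client user to USER_ID:GROUP_ID.
exports_line() {
    echo "$1 -alldirs -mapall=$2:$3 localhost"
}

# Dry run: print the entry instead of editing any system file.
# /Users, 501 and 20 are assumed example values.
exports_line /Users 501 20

# Possible real use on the host (left commented out; requires root):
# exports_line /Users "$(id -u)" "$(id -g)" | sudo tee -a /etc/exports
# sudo nfsd checkexports && sudo nfsd restart
```

This covers only the export itself; filesystem events and attribute caching, discussed elsewhere in this thread, are separate concerns.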

mtibben commented 3 years ago

It should be pointed out that NFS does not in-fact "just work", unless I'm missing something you have to configure the host exports

Hey @leehambley we use https://github.com/adlogix/docker-machine-nfs which configures NFS seamlessly for us

strayer commented 3 years ago

@mtibben I just tried the whole NFS thing for Docker for Mac, not docker-machine, and while it was relatively easy to setup (used this guide) it doesn't support fs events out of the box, as @leehambley mentioned.

This is a huge deal for a lot of frameworks that rely on fs change events by default.

mtibben commented 3 years ago

Yeah for sure @Strayer there's a tradeoff. Performance is also a huge deal, and there's no solution that gives you everything at the moment

GrahamCampbell commented 3 years ago

mutagen was really great. npm is basically unusable without it.

manuelmeister commented 3 years ago

Is it gone from Edge? Docker is back to being unusably slow.

doublesharp commented 3 years ago

@manuelmeister I just had that happen to me after updating to Docker Desktop 2.4.0.0, not using Edge. Going into the preferences, disabling gRPC FUSE, and going back to osxfs made it usable again for me.

See https://github.com/docker/for-mac/issues/4953

coredumperror commented 3 years ago

I, too, find that the new gRPC FUSE file system is significantly less performant than the osxfs system.

I was so excited to finally have halfway decent performance in the Wagtail admin without template caching, but sadly, using gRPC FUSE makes page loads take twice as long compared to osxfs. 6 second load times per request suck, but 12 second load times suck a lot more.