docker / for-mac

Bug reports for Docker Desktop for Mac
https://www.docker.com/products/docker#/mac

File system performance improvements #1592

Open yallop opened 7 years ago

yallop commented 7 years ago

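Recent Docker versions (17.04 CE Edge onwards) add flags to the -v option for docker run that make it possible to specify the consistency requirements for a bind-mounted directory. The flags are:

consistent: Full consistency. The container runtime and the host maintain an identical view of the mount at all times. This is the default.

cached: The host's view of the mount is authoritative. There may be delays before updates made on the host are visible within a container.

delegated: The container runtime's view of the mount is authoritative. There may be delays before updates made in a container are visible on the host.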

In 17.04 cached is significantly faster than consistent for many workloads. However, delegated currently behaves identically to cached. The Docker for Mac team plans to release an improved implementation of delegated in the future, to speed up write-heavy workloads. We also plan to further improve the performance of cached and consistent.
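As a rough illustration of the syntax, the flag goes in as a third, colon-separated field of the -v argument. The sketch below only builds that argument string and does not talk to Docker. The helper name `bind_mount_arg` and the example paths are made up for illustration.

```python
# Hypothetical helper: builds the value passed to `docker run -v`.
# The three mode names come from this issue; everything else is illustrative.
VALID_MODES = ("consistent", "cached", "delegated")


def bind_mount_arg(host_dir, container_dir, mode=None):
    """Return HOST:CONTAINER or HOST:CONTAINER:MODE for the -v option."""
    if mode is None:
        # No flag given: Docker applies its default consistency.
        return f"{host_dir}:{container_dir}"
    if mode not in VALID_MODES:
        raise ValueError(f"unknown consistency mode: {mode!r}")
    return f"{host_dir}:{container_dir}:{mode}"


if __name__ == "__main__":
    spec = bind_mount_arg("/Users/me/project", "/srv", "cached")
    print("docker run --rm -it -v", spec, "alpine ash")
```

Rejecting unknown mode names up front is a design choice of this sketch; Docker itself reports its own error for options it does not recognise.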

We will post updates relating to Docker for Mac file sharing performance in comments to this issue. Users interested in news about performance improvements should subscribe here.

To keep the signal-to-noise ratio high we will actively moderate this issue, removing off-topic advertisements, etc. Questions about the implementation, the behaviour, and the design are very welcome, as are realistic benchmarks and real-world use cases.

joemewes commented 7 years ago

A rudimentary real-world example (Drupal 8):

Docker for Mac: Version 17.05.0-ce-rc1-mac8 (16582)
Mac mini: macOS Sierra 10.12.2 (16C67)

A simple command-line curl test (average of 10 calls to the URL) against a clean Drupal 8 install frontend:

old UNISON (custom synced container approach) volume mount: 0.470s
standard volume mount: 1.401s
new :cached volume mount: 0.490s

Very easy to add to my compose.yaml files, and I'm happy with any delay between host/output when using cached on the host codebase.

@yallop Is there a rough/expected release date for 17.04 (stable) ?

yallop commented 7 years ago

@yallop Is there a rough/expected release date for 17.04 (stable)

The next stable version will be 17.06, and is likely to be released some time in June. There's been a change to the Docker version numbering scheme recently so that the numbers now indicate the release date, with stable releases every three months (March, June, September, December), and edge releases in the months in between. For example, 17.04 is the Edge release in April 2017 and 17.06 is the stable release in June 2017.
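The numbering scheme described above can be sketched as a small classifier. This assumes only the rule stated here (YY.MM, with stable releases in March, June, September and December). The function name `release_channel` is hypothetical. The channel that a particular Docker for Mac build ships on may differ, for example when a release candidate for a stable month goes out on the edge channel.

```python
import re

STABLE_MONTHS = {3, 6, 9, 12}  # March, June, September, December


def release_channel(version):
    """Classify a YY.MM[...] version string as 'stable' or 'edge'.

    Only the month is inspected; suffixes such as '.0-ce-rc1' are ignored,
    so a release candidate is classified by the month it targets.
    """
    match = re.match(r"^(\d{2})\.(\d{2})", version)
    if not match:
        raise ValueError(f"not a YY.MM version: {version!r}")
    month = int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in version: {version!r}")
    return "stable" if month in STABLE_MONTHS else "edge"


if __name__ == "__main__":
    for v in ("17.04", "17.05.0-ce-rc1", "17.06"):
        print(v, release_channel(v))
```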

joemewes commented 7 years ago

ok, thanks for that. do you expect 17.06 to contain at least the current edge implementation of :cached?

yallop commented 7 years ago

Yes, that's the plan.

reinout commented 7 years ago

How should this work in a docker-compose.yml file? I tried appending :cached to the existing volume setting:

volumes:
  - .:/srv:cached

... but that got me an error (on OSX):

ERROR: Cannot start service my_service: Mounts denied: 9p: Stack overflow

(Docker version 17.05.0-ce-rc1, build 2878a85)

Note: having a global setting or environment variable to switch my local default to "cached" would also be fine (or rather preferable).

DanielSchwiperich commented 7 years ago

The syntax looks correct, @reinout. Here's a stripped-down but working example docker-compose.yml:

version: '2'
services:
  php:
    image: php:7.1-fpm
    ports:
      - 9000
    volumes:
      - .:/var/www/project:cached

tested on Docker version 17.05.0-ce-rc1, build 2878a85

reinout commented 7 years ago

Your example works. My own one still did not (even after really making sure there were no left-over old mounted volumes). So I rebooted. Afterwards, it worked.

So: a reboot might be needed if you've run a docker-compose before upgrading docker to the latest version. Possibly related: I switched from docker stable to edge.


Is there a possibility of a global setting? I don't really want to add this option to the docker-compose.yml that all my linux colleagues are using.

DanielSchwiperich commented 7 years ago

Not as far as I know. If your Linux colleagues are running edge, the flag should work.

Another workaround (it's what we do right now to separate Linux and Mac volume mounting) would be to put the mount settings for Mac in a separate file, like docker-compose.mac.yml, and then run docker-compose -f docker-compose.yml -f docker-compose.mac.yml up -d

See https://docs.docker.com/compose/extends/

carn1x commented 7 years ago

@reinout You could use an additional compose file to override that of your colleagues? For instance, we have:

docker-compose.yml:

version: '2'

services:

  build_docs:
    image: docs/sphinx
    build: .
    environment:
      - DOCS_NAME='docs'
      - SRC_DIR=src
      - DST_DIR=build
    volumes:
      - "./../../docs/dev:/docs"
    command: /docs/source /docs/build

and volumes-cached.yml:

version: '2'

services:
  build_docs:
    volumes:
      - "./../../docs/dev:/docs:cached"

Which can be run with:

$ docker-compose -f docker-compose.yml -f volumes-cached.yml up

reinout commented 7 years ago

Yes, I could do that. But.... I'd have to do that for each of the 12 docker-compose projects. And I'd have to keep it in sync with changes to the "master" docker-compose.yml.

As an intermediary measure: fine. Long term: no, as it is not very don't-repeat-yourself :-)
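One possible way to reduce the duplication in the meantime is to generate the override instead of maintaining it by hand. The sketch below is an assumption-laden illustration, not a Docker feature. It takes the `services` mapping as it would look after loading docker-compose.yml and returns an override that re-declares each short-syntax bind mount with a mode appended. The name `cached_override` and the rule for spotting bind mounts (host part starting with `.`, `/` or `~`) are hypothetical. It relies on Compose merging volume entries by container path when several -f files are given.

```python
import json


def cached_override(services, mode="cached"):
    """Build an override mapping that re-declares bind mounts with `mode`."""
    override = {}
    for name, config in services.items():
        volumes = []
        for spec in config.get("volumes", []):
            parts = spec.split(":")
            # Only HOST:CONTAINER entries are touched; entries that already
            # carry a third field, named volumes and anonymous volumes are
            # left to the base file.
            is_bind = len(parts) == 2 and parts[0].startswith((".", "/", "~"))
            if is_bind:
                volumes.append(f"{spec}:{mode}")
        if volumes:
            override[name] = {"volumes": volumes}
    return override


if __name__ == "__main__":
    base = {"build_docs": {"volumes": ["./../../docs/dev:/docs"]}}
    # JSON output is used here only to avoid a YAML dependency.
    print(json.dumps({"version": "2", "services": cached_override(base)}, indent=2))
```

Each project would still need the extra -f argument, so this addresses only the keep-in-sync part of the concern.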

If someone wants to enable the "cached" behaviour, that person probably wants to use it for all/most of the dockers. Would it make sense as a config setting in the Docker app itself? In the preferences' "File sharing" tab? (This should probably be its own ticket, I assume?)

yallop commented 7 years ago

@reinout: :cached is supported across platforms, so there should be no issue in adding it directly to compose files.

reinout commented 7 years ago

Provided that everybody uses the latest edge version, right? And it seems a bit strange to add an option that only has effect on osx to everybody's docker-compose.yml.

Anyway, it works for now. I won't drag the signal/noise ratio further down :-)

matthewjosephtaylor commented 7 years ago

mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ time dd if=/dev/zero of=./output bs=8k count=40k; rm ./output
40960+0 records in
40960+0 records out
335544320 bytes transferred in 1.300857 secs (257941007 bytes/sec)

real    0m1.320s
user    0m0.012s
sys 0m0.564s
mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ docker run -it --rm -v "$(pwd):/host-disk" ubuntu /bin/bash 
root@4e9c8bc5e5c1:/# cd /host-disk/
root@4e9c8bc5e5c1:/host-disk# time dd if=/dev/zero of=./output bs=8k count=40k; rm ./output
40960+0 records in
40960+0 records out
335544320 bytes (336 MB, 320 MiB) copied, 10.7496 s, 31.2 MB/s

real    0m10.756s
user    0m0.050s
sys 0m1.090s
root@4e9c8bc5e5c1:/host-disk# exit
exit
mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ docker run -it --rm -v "$(pwd):/host-disk:cached" ubuntu /bin/bash 
root@597dc640bdeb:/# cd /host-disk/
root@597dc640bdeb:/host-disk# time dd if=/dev/zero of=./output bs=8k count=40k; rm ./output
40960+0 records in
40960+0 records out
335544320 bytes (336 MB, 320 MiB) copied, 11.1683 s, 30.0 MB/s

real    0m11.172s
user    0m0.060s
sys 0m1.080s
root@597dc640bdeb:/host-disk# exit
exit
mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ docker run -it --rm -v "$(pwd):/host-disk:delegated" ubuntu /bin/bash 
root@985e4143053b:/# cd /host-disk/
root@985e4143053b:/host-disk# time dd if=/dev/zero of=./output bs=8k count=40k; rm ./output
40960+0 records in
40960+0 records out
335544320 bytes (336 MB, 320 MiB) copied, 12.1589 s, 27.6 MB/s

real    0m12.165s
user    0m0.080s
sys 0m1.000s
root@985e4143053b:/host-disk# exit
exit
mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ docker run -it --rm -v "$(pwd):/host-disk:consistent" ubuntu /bin/bash 
root@3377ae356124:/# cd /host-disk/
root@3377ae356124:/host-disk# time dd if=/dev/zero of=./output bs=8k count=40k; rm ./output
40960+0 records in
40960+0 records out
335544320 bytes (336 MB, 320 MiB) copied, 12.5944 s, 26.6 MB/s

real    0m12.601s
user    0m0.060s
sys 0m0.980s
root@3377ae356124:/host-disk# exit
exit
mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ docker --version
Docker version 17.05.0-ce, build 89658be
mtaylor(mjt)@mtaylor:~/tmp/docker-disk-perf-test$ 

Perhaps my expectations on how this works are unreasonable, or I'm doing something wrong. Above are some simple disk performance tests and I'm not seeing any differences.

I'd be interested to know whether my expectations or my use of the flag are incorrect.

barat commented 7 years ago

@matthewjosephtaylor ... :cached won't improve dd tests ... for this, you need to wait for :delegated to roll out :)
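To see reads and writes separately rather than only the write path that dd exercises, something like the following sketch could be run once on the host and once inside a container against the bind-mounted directory. All names here (`benchmark`, `bench_output.bin`, the `BENCH_DIR` environment variable) are hypothetical. Note that the read pass immediately follows the write, so it may be served from a cache rather than from the shared file system.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8 * 1024  # same block size as the dd runs (bs=8k)


def time_write(path, count):
    """Write `count` zero-filled blocks to `path`; return elapsed seconds."""
    block = b"\0" * BLOCK_SIZE
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
    return time.perf_counter() - start


def time_read(path):
    """Read `path` back block by block; return (elapsed seconds, bytes read)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BLOCK_SIZE)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def benchmark(directory, count=1024):
    """Time a sequential write and read of count * 8 KiB inside `directory`."""
    path = os.path.join(directory, "bench_output.bin")  # hypothetical file name
    try:
        write_s = time_write(path, count)
        read_s, size = time_read(path)
    finally:
        # Clean up the scratch file, mirroring the `rm ./output` step.
        if os.path.exists(path):
            os.remove(path)
    return {"bytes": size, "write_s": write_s, "read_s": read_s}


if __name__ == "__main__":
    # Set BENCH_DIR to the bind-mounted directory, e.g. /host-disk.
    target = os.environ.get("BENCH_DIR", tempfile.gettempdir())
    print(benchmark(target))
```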

ToonSpinISAAC commented 7 years ago

The issue text says:

delegated: The container runtime's view of the mount is authoritative. There may be delays before updates made in a container are visible on the host.

Does this mean that if I were to use delegated (or cached for that matter) that syncing would strictly be a one-way affair?

In other words, to make my question clearer, let's say I have a codebase in a directory and I mount this directory inside a container using delegated. Does this mean that if I update the codebase on the host, the container will overwrite my changes?

What I understood until now was that using, for instance, delegated would keep syncing "two-way" but make it more efficient from the container to the host. The text leads me to believe that I may have misunderstood, hence the question.

geerlingguy commented 7 years ago

Just posting another data point:

With :cached, Drupal 8 is ~18x faster when you're using a Docker volume to share a host codebase into a container, and it's pretty close to native filesystem performance.

With cached, Drupal and Symfony development are no longer insanely painful with Docker. With delegated, that's even more true, as operations like composer update (which results in many writes) will also be orders-of-magnitude faster!

kostajh commented 7 years ago

I'm seeing good results for running a set of Behat tests on a Drupal 7 site:

On Mac OS, without :cached:

10:02 $ docker exec clientsite_php bin/behat -c tests/behat.yml --tags=~@failing -f progress
...................................................................... 70
...................................................................... 140
...................................................................... 210
...................................................................... 280
.................................................

69 scenarios (69 passed)
329 steps (329 passed)
11m26.20s (70.82Mb)

With :cached:

09:55 $ docker exec clientsite_php bin/behat -c tests/behat.yml --tags=~@failing -f progress
...................................................................... 70
...................................................................... 140
...................................................................... 210
...................................................................... 280
.................................................

69 scenarios (69 passed)
329 steps (329 passed)
4m33.77s (63.33Mb)

On Travis CI (without :cached):

$ docker exec clientsite_php bin/behat -c tests/behat.yml --tags=~@failing -f progress
...................................................................... 70
...................................................................... 140
...................................................................... 210
...................................................................... 280
.................................................
69 scenarios (69 passed)
329 steps (329 passed)
4m7.07s (55.01Mb)

On Travis CI (with :cached):

247.12s$ docker exec clientsite_php bin/behat -c tests/behat.yml --tags=~@failing -f progress
...................................................................... 70
...................................................................... 140
...................................................................... 210
...................................................................... 280
.................................................
69 scenarios (69 passed)
329 steps (329 passed)
4m6.71s (55.01Mb)

Nice work, Docker team! 👏

carn1x commented 7 years ago

Is it expected that both :cached and :delegated can be combined or will they be mutually exclusive?

dsheets commented 7 years ago

@ToonSpinISAAC both :cached and (when it lands) :delegated perform two-way "syncing". The text you cite is saying that, with :cached, the container may read stale data if it has changed on the host and the invalidation event hasn't propagated yet. With :cached, the container will write-through and no new write-write conflicts can occur (POSIX still allows multiple writers). Think of :cached as "read caching".

With :delegated, if the container writes to a file that write will win even if an intermediate write has occurred on the host. Container writes can be delayed indefinitely but are guaranteed to persist after the container has successfully exited. flush and similar functionality will also guarantee persistence. Think of :delegated as "read-write caching". Even under :delegated, synchronization happens in both directions and updates may occur rapidly (but don't have to).

Additionally, you may overlap :cached and :delegated and :cached semantics will override :delegated semantics. See https://docs.docker.com/docker-for-mac/osxfs-caching/#delegated guarantee 5. If you are using :delegated for source code but your container does not write to your code files (this seems unlikely but maybe it auto-formats or something?), there is nothing to worry about. :delegated is currently the same as :cached but will provide write caching in the future.

@carn1x :cached and :delegated (and :default and :consistent) form a partial order (see https://docs.docker.com/docker-for-mac/osxfs-caching/#semantics). They can't be combined but they do degrade to each other. This allows multiple containers with different requirements to share the same bind mount directories safely.
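The "degrade to each other" behaviour can be pictured with a small model. This is a sketch under two assumptions: that the three named modes form a simple chain from weakest to strongest guarantees, and that `default` behaves like `consistent`. The linked documentation is the authority on the actual ordering. The names `ORDER` and `effective_mode` are hypothetical.

```python
# Weakest guarantees first; each mode keeps the guarantees of the one before it.
ORDER = ["delegated", "cached", "consistent"]


def normalize(mode):
    """Treat a missing flag or 'default' as 'consistent' (an assumption)."""
    return "consistent" if mode in (None, "default") else mode


def effective_mode(*modes):
    """Return the mode that satisfies every container sharing a directory."""
    if not modes:
        raise ValueError("at least one mode is required")
    # The strictest requested mode wins, so no container gets weaker
    # guarantees than it asked for.
    return max((normalize(m) for m in modes), key=ORDER.index)


if __name__ == "__main__":
    print(effective_mode("cached", "delegated"))
    print(effective_mode("delegated", "default"))
```

Under this model, overlapping :cached and :delegated mounts resolve to :cached, which matches the statement above that :cached semantics override :delegated semantics.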

lmakarov commented 7 years ago

From within a container is there a way to tell which flag was applied for a volume?

I'm getting the same output from mount regardless of the flag used. Is there another way to check?

$ mount | grep osxfs
osxfs on /var/www/project type fuse.osxfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other,max_read=1048576)
Version 17.06.0-rc1-ce-mac13 (18169)
Channel: edge
2425473dc2

dsheets commented 7 years ago

@lmakarov /Applications/Docker.app/Contents/MacOS/com.docker.osxfs state should show you the host directories that are mounted into containers and their mount options. For instance, after I run docker run --rm -it -v ~:/host:cached alpine ash, I then see:

$ /Applications/Docker.app/Contents/MacOS/com.docker.osxfs state
Exported directories:
 - /Users to /Users (nodes_/Users table size: 62)
 - /Volumes to /Volumes (nodes_/Volumes table size: 0)
 - /tmp to /tmp (nodes_/private/tmp table size: 9)
 - /private to /private (nodes_/private table size: 0)
Container-mounted directories:
 - /Users/dsheets into b8f7765665782501bc1a099f1898911b7eb393b08930be638545a55fd06e420e (state=cached)
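If the state needs to be checked from a script, the "Container-mounted directories" section could be parsed along these lines. The line format is inferred solely from the single sample above and may differ between versions. The function name `container_mounts` is hypothetical.

```python
import re

# Matches lines such as:
#  - /Users/dsheets into b8f7...420e (state=cached)
MOUNT_LINE = re.compile(
    r"^\s*-\s+(?P<host>.+?) into (?P<container>[0-9a-f]+) \(state=(?P<state>\w+)\)\s*$"
)


def container_mounts(state_output):
    """Return (host_dir, container_id, state) tuples from `osxfs state` text."""
    mounts = []
    in_section = False
    for line in state_output.splitlines():
        if line.startswith("Container-mounted directories:"):
            in_section = True
            continue
        if in_section:
            match = MOUNT_LINE.match(line)
            if match:
                mounts.append(
                    (match.group("host"), match.group("container"), match.group("state"))
                )
    return mounts


SAMPLE = """Exported directories:
 - /Users to /Users (nodes_/Users table size: 62)
 - /tmp to /tmp (nodes_/private/tmp table size: 9)
Container-mounted directories:
 - /Users/dsheets into b8f7765665782501bc1a099f1898911b7eb393b08930be638545a55fd06e420e (state=cached)
"""

if __name__ == "__main__":
    for host, container, state in container_mounts(SAMPLE):
        print(host, container[:12], state)
```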

ToonSpinISAAC commented 7 years ago

Thanks for taking the time to clarify and explain @dsheets!

gondalez commented 7 years ago

Is there any indication from the high sierra dev beta of how this (and docker mac in general) will work under APFS?

Sorry if this is off-topic but it's a question I keep wanting to ask every time I see a new message here :)

WillSquire commented 7 years ago

Unable to get @DanielSchwiperich's example working. Receiving the error:

invalid spec: .:/var/www/project:cached: unknown option: cached

It doesn't like version being set to 2 either. Perhaps I'm missing something? Currently running 17.06.0-rc2-ce-mac14

27Bslash6 commented 7 years ago

@WillSquire What version of docker-compose are you using?

WillSquire commented 7 years ago

@27Bslash6 Version 1.14.0-rc2. Believe this was installed automatically for me though, as per the docs: https://docs.docker.com/compose/install/

I'm running macOS

Schnitzel commented 7 years ago

So if I understand this issue correctly, :delegated is currently (in 17.04 and 17.05) exactly the same as :cached? Unfortunately https://docs.docker.com/docker-for-mac/osxfs-caching/ suggests that :delegated and :cached already have different implementations. Could we get some clarification on what exactly is correct?

yallop commented 7 years ago

@Schnitzel: Yes, in 17.04, 17.05 and 17.06, :delegated behaves the same as :cached.

The documentation is written in terms of the guarantees associated with each flag. :cached has all the guarantees of :delegated, plus some additional ones. (And :consistent has all the guarantees of :cached, plus some additional ones.) So the documentation allows Docker to perform more optimizations with :delegated than with :cached, but doesn't require it to do so.

Here's another way to think of it: switching on :cached or :delegated is a way of granting Docker permission to perform certain optimizations. Docker will never perform those optimizations without permission, and it won't always perform those optimizations even when you give it permission. But if you always give Docker permission to perform the optimizations you want then it'll optimize as well as it can within those constraints.

For a user the best approach is to grant the permissions that fit with your circumstances, and in return Docker will give you the best performance available at the time for those permissions. So if :delegated is the right setting for your application then it's reasonable for you to switch it on now, so that you'll immediately see better performance when Docker releases a more aggressive implementation of :delegated.

Schnitzel commented 7 years ago

@yallop Alright, thanks for that thorough explanation, makes sense now!

btw, here is a performance comparison of Docker Machine vs Docker-for-Mac with the new flags: https://stories.amazee.io/docker-on-mac-performance-docker-machine-vs-docker-for-mac-4c64c0afdf99 (I'm updating it now to not differentiate between delegated and cached; the blog post was written while I was still under the impression that they were already different).

Will definitely redo the testing as soon as 17.06 is out! Do you already know what we can expect to land in 17.06 in terms of these improvements?

Thanks a lot for your effort in all of that. Super excited about the performance improvements :)

aegis123 commented 7 years ago

Does anyone have an example for a Symfony application using :cached and :delegated? Do you just mount the whole project dir, or mount app/cache, vendor, src, etc. separately to get the best performance?

I would think something like this would give the best performance?

volumes:
     - ./src:/builds/application/src:cached
     - ./app:/builds/application/app:cached
     - ./app/cache:/builds/application/app/cache:delegated
     - ./app/logs:/builds/application/app/logs:delegated
     - ./web:/builds/application/web:cached
     - ./vendor:/builds/application/vendor:delegated
     - ./node_modules:/builds/application/node_modules:delegated

WillSquire commented 7 years ago

Now on version 17.06.0-ce-rc5 for Docker and version 1.14.0 for Docker Compose, but still receiving "unknown option" for cached and delegated. To give some background, Docker is being explored as a replacement for our current workflow (Vagrant + Ansible) due to the speed claims, so we might not be as 'au fait' with possible caveats. Any help would be appreciated. Thanks

Edit: Have now created a separate issue for this.

mattacular commented 7 years ago

It's not clear whether these features made it into the 17.06 CE stable release today or not. Since they're not mentioned in the blog post or changelog I'm guessing not but can anyone confirm? Thank you

joesteele commented 7 years ago

They did. I'm using the new :delegated flag on today's stable release.

carn1x commented 7 years ago

The delegated flag has been in as a placeholder for a long time without any change in functionality, in case you're assuming that it would give an error.

Schnitzel commented 7 years ago

@mattacular in https://github.com/docker/for-mac/issues/1592#issuecomment-310062642 @yallop wrote that in 17.06 :delegated is the same as :cached

My own tests show that overall Docker got faster, but Docker-for-Mac is still slower than a Docker-Machine with an NFS share (at least for Drupal)

genei09 commented 7 years ago

I'm only seeing a slight improvement with running jest for unit tests, or even just a simple git status. I ran these several times and infrequently the delegated mount was actually slower than rw mount.

command: time jest -o

mode        run0 (real)   run1 (real)   run2 (real)
rw          0m52.869s     0m41.248s     0m47.749s
delegated   0m47.663s     0m34.802s     0m37.152s
no volume   0m16.811s     0m10.429s     0m10.200s

command: time git status

mode        real
rw          0m0.853s
delegated   0m0.370s
no volume   0m0.016s

I've tried reinstalling docker4mac and rebooting the machine. Is there something else I'm missing or is this the performance increase I should expect?

carn1x commented 7 years ago

@genei09 It depends on what the jest unit tests are doing. I can only assume they are doing a lot of writing to the volume if they are seeing such a performance deficit. As mentioned above, :delegated is currently the same as :cached, and :cached only provides performance benefits for read operations. Write operations are still very slow. I believe git status even performs writes (such as applying a git lock maybe?).

markfoodyburton commented 7 years ago

I've just got the 17.06 'update', and my performance is significantly worse than on 17.04 (using 'delegated'). I can only conclude that the 'delegated' flag didn't make it into the release? (Which is a pity, as I saw a significant speed improvement when using it on 17.04.)

smartygus commented 7 years ago

@markfoodyburton as written 10 comments back, and repeated 3 comments back, in 17.04, 17.05, and 17.06, :delegated was only ever going to behave the same as :cached. So in terms of 'making it into the release', I'm not sure what you are referring to. The flag simply allows Docker to make certain optimisations (which aren't yet developed/ready in the case of :delegated), but does not guarantee any.

As for your specific performance issue, I can't comment, but based on the available information I would expect the performance to be similar. If you're unable to resolve your specific issue, then I would suggest you open a new issue.

markfoodyburton commented 7 years ago

Have done, thanks.

genei09 commented 7 years ago

@carn1x the jest unit tests perform no write operation. Unless there are some writes being hidden from inotify.

210 file Opens, 1 file Access

git status has 2 Opens and 1 Create on the lock file

kudos commented 7 years ago

I read most of this issue. Why isn't the huge caveat around delegated in the documentation? There's a whole 8 point list of lies/wishes around it, as far as I can tell.

yallop commented 7 years ago

The 8 point list in the documentation is a specification for delegated, written in terms of guarantees about data integrity, not promises about performance. Specifications are often written in this way -- for example, C has a keyword restrict that allows a compiler to perform more aggressive optimizations, under the assumption that a particular object is not aliased; ignoring restrict altogether for optimization purposes is an entirely legitimate implementation.

The documentation, the blog post, and this issue all say that write caching is under development, not released. You can find the documentation at https://github.com/docker/docker.github.io/ if you'd like to propose improvements.

genei09 commented 6 years ago

Is it fair to say that when a file is opened for writing the performance will still be sub-optimal as compared to strictly opened for read?

that would explain the jest test performance

geerlingguy commented 6 years ago

Is it fair to say that when a file is opened for writing the performance will still be sub-optimal as compared to strictly opened for read?

Yes; at this time, delegated basically does the same thing as cached. Which means reads are much better optimized, but writes are just the same as earlier. If you have a write-heavy workflow (e.g. in my case, running something like composer install on a PHP project with hundreds of dependencies), then write-heavy operations will be very slow on mounted volumes.

genei09 commented 6 years ago

Yes; at this time, delegated basically does the same thing as cached. Which means reads are much better optimized, but writes are just the same as earlier. If you have a write-heavy workflow (e.g. in my case, running something like composer install on a PHP project with hundreds of dependencies), then write-heavy operations will be very slow on mounted volumes.

I'm aware of the delegated and cached being functionally the same. My question is around a file which is OPENED but not actually MODIFIED.

gotgenes commented 6 years ago

@geerlingguy I believe @genei09 is asking this:

Suppose we have a file opened in read-only mode and we measure how long it takes to read the first 100 bytes from that file 100 times. We record this time as readonly_time.

Next, suppose we have this same file opened in read-write mode, and we measure how long it takes to read the first 100 bytes from that file 100 times. We record this time as readwrite_time.

(Note that in both scenarios we perform no write operations.)

The question is, can we reasonably expect that both scenarios would complete in the same amount of time, i.e., are readonly_time and readwrite_time equal?
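The two measurements could be taken with something like the sketch below. It only tells you about the shared file system if it is run inside a container against a file on the bind mount. After the first read, the kernel may serve later reads from its own cache, so the numbers need careful interpretation. The function names are hypothetical.

```python
import os
import tempfile
import time


def time_reads(path, mode, reads=100, size=100):
    """Open `path` once in `mode` and time `reads` reads of its first `size` bytes."""
    # buffering=0 disables Python's own buffer so each read() is a system call.
    with open(path, mode, buffering=0) as f:
        start = time.perf_counter()
        for _ in range(reads):
            f.seek(0)
            f.read(size)
        return time.perf_counter() - start


def compare(path):
    """Return (readonly_time, readwrite_time); no write is ever issued."""
    readonly_time = time_reads(path, "rb")
    readwrite_time = time_reads(path, "r+b")
    return readonly_time, readwrite_time


if __name__ == "__main__":
    # Scratch file for demonstration; point compare() at a bind-mounted file instead.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 4096)
    try:
        ro, rw = compare(tmp.name)
        print(f"readonly_time={ro:.6f}s readwrite_time={rw:.6f}s")
    finally:
        os.remove(tmp.name)
```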

coyotwill commented 6 years ago

I'm working on a fairly complex project with a dozen containers running on my Mac at the same time. My CPU usage was sky-high until I started using :cached for all my source code volumes. This was a huge improvement for me: I went from 150% CPU usage by the hyperkit process to a tiny 10%.

I figured the main culprits were all the various watch processes monitoring file changes for live-reload and code hot swapping.

Anyway, since this is a huge improvement for most use cases, I would love to see a global option in Docker to set all volume mounts to :cached by default. People turning that option ON could still use :consistent for some volumes if needed.

mtibben commented 6 years ago

Yeah, I really would like to see a configurable default so that delegated or cached is applied to all volume mounts. Adding these tags in every script or config where docker mounts a volume is proving to be painful

aegis123 commented 6 years ago

@mtibben delegated or cached can't be applied by default since they should either speed up reading container files on the host or the container reading files from the host that are mounted via a volume. I don't think we can apply both at the same time.