EugenMayer / docker-sync

Run your application at full speed while syncing your code for development, finally empowering you to utilize docker for development under OSX/Windows/*Linux
GNU General Public License v3.0

New Strategy: Native_osx #316

Closed EugenMayer closed 7 years ago

EugenMayer commented 7 years ago

Consider that we do the following on OSX:

native mount (osxfs) [host sync strategy/watcher] <-> sync container eugenmayer/unison (unison) <-> named volume mount <-> app

This way we could remove the unison/unox pain under OSX, the pain of the native dependencies, and also their flaws, while still maintaining native speed in the app.

The point here is that unison needs an fswatch under OSX, which ends up being unox, which in turn uses macfsevents / watchdog. Those two tend to use a good chunk of CPU on huge file trees and have also been reported to sometimes miss or misorder events (race conditions). Also, unison/unox are the only native dependencies on the OS; eliminating them would mean that you just need the docker-sync gem, running on the system ruby - that's it.

Besides that, we would get a very reliable way of syncing two-way from OSX, since unison under Linux just uses inotify and is very reliable.

EugenMayer commented 7 years ago

@ignatiusreza implemented the first version of this in the feature/osxfs-sync branch.


So what we do is

host 'src folder' <-osxfs-> app-sync-container '/host_sync' <-unison sync-> app-sync-container '/app-sync' <-volume-name-mount-> app-container

You see that the app-container just mounts a named volume, which is simply a normal folder in the app-sync-container, mounted at /app-sync. This folder is not shared with the host directly, but synced via unison - and that does the trick. This way the performance is native, as it was with the old unison container.
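As an illustration only (not the exact boilerplate), the volume wiring could be sketched roughly like this in docker-compose terms; the service names, image tag and paths here are assumptions, and in reality docker-sync starts the sync container and runs unison itself:

    version: "2"
    services:
      app:
        image: nginx:alpine
        volumes:
          # the app only mounts the named volume - no osxfs involved here
          - app-sync:/var/www
      app-sync:
        image: eugenmayer/unison
        volumes:
          # host folder shared into the sync container via osxfs
          - ./src:/host_sync
          # the named volume the app sees, kept in sync with /host_sync by unison
          - app-sync:/app-sync
    volumes:
      app-sync: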

I cannot use user-created volumes, since those do not seem to work under OSX.

You can try this with the boilerplate in the native_osx folder. To try this branch:

git clone https://github.com/EugenMayer/docker-sync
cd docker-sync
git checkout osxfs-sync
./deploy-locally.sh

Now let's try it with the boilerplate:

git clone https://github.com/EugenMayer/docker-sync-boilerplate
cd docker-sync-boilerplate/native_osx
docker-sync-stack start

Now you can change files on both sides and it works :)

EugenMayer commented 7 years ago

@mickaelperrin any chance you could try this in particular - it originates from an idea you once had. How does it perform for you? @masterful what does your coworker say about this once he has used it in his project?

EugenMayer commented 7 years ago

I made some new benchmarks and published them here: https://github.com/EugenMayer/docker-sync/wiki/Performance-Tests-2017

Those already look pretty promising (besides being shocked how slow d4m is out of the box compared to VMware Fusion).

The key point with native_osx is that the CPU usage during the sync/watch is very low and the sync is 100% accurate. There is a delay (~1s) until files get synced - they still get synced, but the "watching" happens on Linux, thus via inotify and thus 100% robust.
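For reference, inside the container this boils down to running unison in watch mode, roughly like the sketch below (the exact flags the image uses are an assumption, not the real command):

# assumed shape of the in-container sync, not the exact command the image runs
unison /host_sync /app-sync -repeat watch -auto -batch -prefer /host_sync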

This way we remove all OSX dependencies and all setup hassle, and beyond that get much better performance (CPU usage) and reliability. Sounds like a big win for now.

EugenMayer commented 7 years ago

I published 0.4.0-beta1, so you can now set it up easily:

gem install docker-sync --version=0.4.0-beta1
git clone https://github.com/EugenMayer/docker-sync-boilerplate
cd docker-sync-boilerplate/native_osx
docker-sync-stack start

and revert back to stable if you wish:

gem install docker-sync

Article for the setup

General description of native_osx, including its concept, pros and cons

mickaelperrin commented 7 years ago

@EugenMayer That's interesting. To be honest, I haven't checked the performance and reliability of "magic-sync" anymore, mainly for 3 reasons:

So now I use the eugenmayer/unison image as part of my docker-compose development override file. I wrote a simple bash script that does port auto-discovery, synchronises the configuration, and sets the ulimit on the host and in the container.
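A hedged sketch of what such an override could look like (the port number, paths and names below are illustrative assumptions, not the actual file):

    # docker-compose.override.yml - illustrative shape only
    version: "2"
    services:
      unison:
        image: eugenmayer/unison
        ports:
          # publish the unison socket on a random host port, discovered by the host-side script
          - "5000"
        volumes:
          - app-code:/data
      app:
        volumes:
          # the app reads the code from the shared named volume instead of an osxfs mount
          - app-code:/var/www
    volumes:
      app-code: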

So far I haven't noticed any sync issue and it's really fast. Of course, if you have to sync a huge codebase from your computer to the server, it can take some time initially. But in my use case that's acceptable, and it should only be needed once.

So I can't promise that I will check this new sync strategy until I need to work locally again.

EugenMayer commented 7 years ago

@mickaelperrin fair enough, so you really went forward with the remote part. I find that one really interesting. I am thinking about integrating it as a first-class citizen into docker-sync.

If you log in with your docker CLI using the remote credentials, so that docker info shows the remote server, you will be able to run your stuff out of the box thanks to the new auto IP discovery. You will not need any customisation. If you need more info, just ping me on Gitter - maybe that even simplifies your case. In addition, watchdog was introduced with 0.3 and so far it seems to be very reliable, more so than macfsevents - so you might be even happier nowadays.
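For the remote case, a hedged example of pointing the local docker CLI at a remote engine (host name and cert path are placeholders):

# placeholders - point the local docker CLI at a remote engine
export DOCKER_HOST=tcp://remote.example.com:2376
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=~/.docker/remote-certs
docker info   # should now report the remote server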

If you do not use local development anymore, there is no need to look at this strategy - the trigger for asking was that you actually know a lot about the things I did and could give some valuable feedback.

I am running the new stack with our biggest project right now: about 90k files, 3 syncs, with 7.6% CPU usage by Docker for Mac. That is really impressive.

I am really planning to make this the new default strategy if things continue to go as well as they are right now.

masterful commented 7 years ago

@EugenMayer - my coworker did some informal bench testing on the Docker for Mac Edge release, and changed the volumes to be:

    volumes:
      - /local/path:/docker/path:cached # "cached" mode is the difference, here

His tests involved making changes to medium-sized files and seeing how long it took for them to propagate to the container. All of these were sub-second, and it was thus deemed "fast enough." That's not to say that you couldn't improve these times with docker-sync - but the impression I got was that we could get better multi-platform support with a single docker-compose file if we removed docker-sync from the picture... Since our office is multi-platform, it may make sense in our use case to do so once cached mode is supported in the latest stable Docker for Mac.
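A rough way to reproduce that kind of propagation check (container name and paths are placeholders):

# placeholders: adjust the container name and paths to your setup
touch /local/path/trigger
time docker exec my_app sh -c 'until [ -e /docker/path/trigger ]; do sleep 0.1; done'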

EugenMayer commented 7 years ago

@masterful is that actually an entirely new topic? Or let's say: this would actually mean docker-sync is no longer needed, which I doubt, but it would still be great. With native_osx, changes also take a moment (sub-second) to appear, so no real difference there - but what about read/write performance?

Could you let him run

time dd if=/dev/zero of=/var/www/test.dat bs=1024 count=100000 

where /var/www should be his osxfs-based mount - what speed does he get? Can you enter the results here: https://github.com/EugenMayer/docker-sync/wiki/Performance-Tests-2017 ?

@masterful besides, :cached would by no means be cross-platform - that's the point. You would need different docker-compose-dev* files for each platform, which is exactly what we are avoiding with @ignatiusreza. With the latest 0.3.6 release, the unison strategy becomes "native" under Linux, mounting the folder directly with no sync. This way, the same docker-sync.yml will work under OSX and under Linux.
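For reference, the idea is that one docker-sync.yml along the lines of the minimal sketch below works on both platforms (the sync name and src path are placeholders):

    # docker-sync.yml - minimal sketch, sync name and src are placeholders
    version: "2"
    syncs:
      app-sync:
        src: './src'
        sync_strategy: 'native_osx' # assumption: under Linux the folder is mounted directly instead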

masterful commented 7 years ago

:cached wouldn't do anything on Linux, no - but it wasn't meant to, either; on Linux it's ignored (as long as you're using a recent release of docker). I'll see if I can get my coworker to run the test in cached mode - but I believe I saw some tests in the docs for the Docker PR.

EugenMayer commented 7 years ago

@masterful yes there are, and with "preliminary cached semantics, hot: 7.6s (3.38×)" that means it's still over 3x slower than native - and that's still bad. But yeah, having those benchmarks in line would help everybody for transparency.

EugenMayer commented 7 years ago

@masterful I did an edge benchmark on an i7 4.2 GHz desktop CPU with 32 GB RAM using this docker-compose.yml: https://gist.github.com/EugenMayer/0178d5938802100a44627a8679b780d1

test_1  | 100000+0 records in
test_1  | 100000+0 records out
test_1  | real  0m 17.65s
test_1  | user  0m 0.11s
test_1  | sys   0m 1.42s

17.65s - that's close to "horrible"; that's about 55 times slower than the native_osx solution. Did I get something wrong? Why would somebody consider this? Used https://goo.gl/FlXeUT and 17.05.0-ce-rc1-mac8.

Can someone else reproduce the same results?

masterful commented 7 years ago

Probably didn't get anything wrong - my hunch is that the performance is still not the best for that use case (delegated will likely be a better fit), just better than it was before - possibly at a point where it's usable for some folks (a slow initial start-up and then decent speeds for small groups of file changes). I can respond with more info if my coworker gets around to running the test.

EugenMayer commented 7 years ago

@masterful I updated the performance tests just now: https://github.com/EugenMayer/docker-sync/wiki/Performance-Tests-2017 - and looking at that, the new :cached mode will not change anything. A factor of about 50-60 times slower is not acceptable - it's not even a lot faster; we are talking about 1s faster than without cache.

Before you consider :edge, rather look at Fusion: it's 12s compared to 17s (still horrible, but better).

Using delegated sounds like quite a risk, since the container wins on collisions, and that's not what you want when you write code from the host.
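For comparison, delegated is the same kind of mount flag as :cached (sketch only; paths are placeholders):

    volumes:
      - /local/path:/docker/path:delegated # container view is authoritative - risky for host-side edits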

At least for now it sounds reasonable to keep focusing on docker-sync + native_osx. For you, to be able to use it cross-platform, use what @ignatiusreza has written: it makes docker-sync usable the same way under Linux as it works under OSX - it just removes anything like unison and directly mounts the host volume, as you would expect under Linux (since there it has no impact).

EugenMayer commented 7 years ago

Finished the implementation.