brikis98 / docker-osx-dev

A productive development environment with Docker on OS X
http://www.ybrikman.com/writing/2015/05/19/docker-osx-dev/
MIT License

do initial sync using tar #157

Closed by ComaVN 8 years ago

ComaVN commented 8 years ago

This speeds up the initial sync, particularly for projects with a large number of files.

Fixes issue #104.

Because the performance improvement is so large, even with moderately sized projects, I haven't made it optional yet, so no --tar flag is implemented. If that's a problem I can always make it optional.
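To illustrate the general idea, here is a minimal sketch of copying a tree as a single tar stream rather than file by file. It is not the code from this PR: `tar_copy` is a hypothetical helper, and it copies between two local directories so it can be run anywhere. The remote variant in the comment assumes an SSH-reachable host, which this thread does not spell out.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the PR's implementation): copy a directory tree
# through one tar stream instead of transferring files individually.
tar_copy() {
  local src="$1" dest="$2"
  mkdir -p "$dest"
  # -C changes directory first, so archive member paths are relative to $src.
  tar -C "$src" -cf - . | tar -C "$dest" -xf -
  # A remote variant would pipe into ssh instead (host and path are assumptions):
  #   tar -C "$src" -cf - . | ssh "$host" "tar -C '$dest' -xf -"
}

# Small demo on throwaway directories.
demo_src="$(mktemp -d)"
demo_dest="$(mktemp -d)"
mkdir -p "$demo_src/app/lib"
echo "hello" > "$demo_src/app/lib/file.txt"
tar_copy "$demo_src" "$demo_dest"
ls -R "$demo_dest"
```

The point of the single stream is that per-file overhead is paid once per archive rather than once per file, which is where a tree with thousands of small files tends to lose time.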

Performance test: medium-sized repo (75.2 MB, 5238 files, 771 directories, 38 symlinks)

with tar:

% /usr/bin/time ~/Projects/docker-osx-dev/src/docker-osx-dev sync-only
(...)
2015-12-17 20:58:04 [INFO] Initial sync done
        7.55 real         1.66 user         0.90 sys

without tar:

% /usr/bin/time docker-osx-dev sync-only
(...)
2015-12-17 21:01:00 [INFO] Initial sync done
       19.21 real        10.27 user         9.76 sys
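For reference, the ratios implied by the two timings above can be worked out with a small helper. `ratio` is a hypothetical function written for this illustration; the numbers are the ones reported by `/usr/bin/time` in this comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: divide two timings and print the result to two decimals.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Wall-clock ("real") time: without tar vs. with tar.
ratio 19.21 7.55        # prints 2.54

# CPU time (user + sys): 10.27 + 9.76 = 20.03 vs. 1.66 + 0.90 = 2.56.
ratio 20.03 2.56        # prints 7.82
```

So on this repo the initial sync finishes in roughly 40% of the wall-clock time, and the drop in CPU time is larger still.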

I do see some files that are still rsynced after the tar step, specifically symlinks and some generated files. I suspect this is due to differences in how modification times and file permissions are handled. This doesn't seem to be a problem: rsync fixes them, and the resulting tree on the boot2docker-vm is exactly the same.

brikis98 commented 8 years ago

Thank you for the PR. This is an awesome change! I left a bunch of comments throughout. The only other thing to add is that the initial_sync function has gotten pretty large, so I'd propose creating two smaller functions:

Then the initial_sync function just calls the two functions above, passing them whatever params they need. This might mean you have to loop over paths_to_sync twice, but that will make no real difference in performance and a big difference in readability.
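The shape of that refactoring could look roughly like the sketch below. The two helper names are hypothetical placeholders, not the names proposed in this review, and their bodies only log what a real implementation would do.

```shell
#!/usr/bin/env bash
# Illustrative structure only. bulk_copy_paths and reconcile_paths are
# hypothetical names; the bodies just print instead of transferring anything.

bulk_copy_paths() {
  local path
  for path in "$@"; do
    echo "tar: $path"      # a real version would stream $path through tar
  done
}

reconcile_paths() {
  local path
  for path in "$@"; do
    echo "rsync: $path"    # a real version would run rsync on $path
  done
}

initial_sync() {
  local paths_to_sync=("$@")
  # Two passes over the same short list; the loop cost is negligible
  # next to the transfers themselves.
  bulk_copy_paths "${paths_to_sync[@]}"
  reconcile_paths "${paths_to_sync[@]}"
}

initial_sync /tmp/project-a /tmp/project-b
```

Each helper can then be read and tested on its own, while initial_sync stays a two-line summary of the overall flow.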

ComaVN commented 8 years ago

Cleaned up the code a bit per your suggestion to split the initial_sync function.

brikis98 commented 8 years ago

Fantastic, thank you!