src-d / borges

borges collects and stores Git repositories.
https://docs.sourced.tech/borges/
GNU General Public License v3.0

Single root fast path #378

Closed jfontan closed 5 years ago

jfontan commented 5 years ago

When a repository has only one root and is the first to write to that root, the push step can be skipped and the packfile sent as is. This makes the process considerably faster, at the cost of some disk space.
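The condition above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `useFastPath` and `rootState` are hypothetical names, and the check for "nothing has been written to this root yet" is assumed to be available as a simple lookup.

```go
package main

import "fmt"

// rootState is a hypothetical lookup: it reports whether a rooted
// repository already exists for the given root commit hash.
type rootState func(root string) bool

// useFastPath reports whether the fast path applies: the repository has
// exactly one root, and nothing has been written to that root yet.
func useFastPath(roots []string, exists rootState) bool {
	if len(roots) != 1 {
		return false
	}
	return !exists(roots[0])
}

func main() {
	// Pretend one root has already been archived.
	stored := map[string]bool{"aaa111": true}
	exists := func(root string) bool { return stored[root] }

	fmt.Println(useFastPath([]string{"bbb222"}, exists))           // single new root
	fmt.Println(useFastPath([]string{"aaa111"}, exists))           // root already written
	fmt.Println(useFastPath([]string{"bbb222", "ccc333"}, exists)) // more than one root
}
```

Only the first case qualifies; the other two fall back to the regular push.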

The fast path bypasses RootedTransactioner when downloading and uploading siva files. Copier is added to Archiver so that the fast path can use it directly.
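A simplified sketch of the idea of giving Archiver its own Copier. The interface, the field name, the `uploadFastPath` method and the `.siva` naming below are assumptions made for illustration and do not mirror the real borges types or signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// Copier is a hypothetical, simplified stand-in for the component that
// moves siva files between local and remote storage.
type Copier interface {
	CopyToRemote(local, remote string) error
}

// Archiver holds the Copier directly so the fast path can upload a siva
// file without going through a transactioner. The field name is an assumption.
type Archiver struct {
	Copier Copier
}

// uploadFastPath uploads the locally built siva file for a root as is.
func (a *Archiver) uploadFastPath(local, root string) error {
	if a.Copier == nil {
		return errors.New("archiver has no copier configured")
	}
	return a.Copier.CopyToRemote(local, root+".siva")
}

// memCopier records copies in memory; it exists only for this example.
type memCopier struct{ copied map[string]string }

func (m *memCopier) CopyToRemote(local, remote string) error {
	m.copied[remote] = local
	return nil
}

func main() {
	c := &memCopier{copied: map[string]string{}}
	a := &Archiver{Copier: c}
	if err := a.uploadFastPath("/tmp/local.siva", "abc123"); err != nil {
		fmt.Println("upload failed:", err)
		return
	}
	fmt.Println(c.copied)
}
```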

Also moved the logging of time spent copying siva files out of pushChangesToRootedRepository to tidy it up a bit.

a: current version, b: fast path

| repo | time a | time b | size a | size b |
| --- | --- | --- | --- | --- |
| cangallo | 881.5ms | 347.6ms (-60%) | 0.1MiB | 0.1MiB (0%) |
| octoprint-tft | 9s | 1s (-88%) | 2.8MiB | 3MiB (+7%) |
| upsilon | 2m20s | 14.6s (-89%) | 96.2MiB | 98MiB (+2%) |
| numpy | 7m55s | 7m26s (-6%) | 95MiB | 95MiB (0%) |
| tensorflow | 38m32s | 37m10s (-3%) | 706MiB | 706MiB (0%) |
| bismuth | 17m1s | 3m48s (-77%) | 491MiB | 497MiB (+1%) |

Note: the repositories "numpy" and "tensorflow" have more than one root, so they use the same code path as before. They are included here for completeness.
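The percentages are relative changes from a to b. A minimal sketch of that computation, using the cangallo wall times from the raw output; `percentChange` and `wtimeChange` are illustrative helpers and not part of the regression tool.

```go
package main

import (
	"fmt"
	"time"
)

// percentChange returns the relative change from a to b, in percent.
// Negative values mean b is smaller (faster) than a.
func percentChange(a, b float64) float64 {
	return (b - a) / a * 100
}

// wtimeChange parses two durations written in Go's duration syntax
// (for example "2m20.5s") and returns their relative change in percent.
func wtimeChange(a, b string) (float64, error) {
	da, err := time.ParseDuration(a)
	if err != nil {
		return 0, err
	}
	db, err := time.ParseDuration(b)
	if err != nil {
		return 0, err
	}
	return percentChange(float64(da), float64(db)), nil
}

func main() {
	// Wtime values reported for cangallo in the raw output.
	change, err := wtimeChange("881.486544ms", "347.626016ms")
	if err != nil {
		panic(err)
	}
	fmt.Printf("cangallo wall time change: %.2f%%\n", change)
}
```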

Fixes #377

Raw output of regression:

#### Comparing latest - local:HEAD ####
## Repo cangallo ##
Memory: 24236032 -> 19181568 (-20.855163089403415), true
Wtime: 881.486544ms -> 347.626016ms (-60.56366165017716), true
Stime: 250ms -> 30ms (-88), true
Utime: 1.07s -> 170ms (-84.11214953271028), true
FileSize: 0.11093711853027344 -> 0.11100196838378906 (0.05845640699413717), true
## Repo octoprint-tft ##
Memory: 62439424 -> 32030720 (-48.7011283127788), true
Wtime: 9.02956085s -> 1.014957052s (-88.75961889110033), true
Stime: 780ms -> 60ms (-92.3076923076923), true
Utime: 9.78s -> 960ms (-90.1840490797546), true
FileSize: 2.824666976928711 -> 3.0296154022216797 (7.255666843806531), true
## Repo upsilon ##
Memory: 709877760 -> 238895104 (-66.34700825111072), true
Wtime: 2m20.496095764s -> 14.631545255s (-89.58579939503977), true
Stime: 13.31s -> 1.93s (-85.49962434259955), true
Utime: 2m34.51s -> 13.57s (-91.21739693223739), true
FileSize: 96.22913646697998 -> 98.00282859802246 (1.8431965578856715), true
## Repo numpy ##
Memory: 1207513088 -> 1165721600 (-3.460955282001879), true
Wtime: 7m55.740268897s -> 7m26.875567723s (-6.067323508460316), true
Stime: 1m11.01s -> 1m1.45s (-13.462892550345021), true
Utime: 10m3.09s -> 9m30.9s (-5.33751181415709), true
FileSize: 95.48885822296143 -> 95.49099731445312 (0.0022401477319003577), true
## Repo tensorflow ##
Memory: 5028749312 -> 4931715072 (-1.9295899234517262), true
Wtime: 38m32.777768384s -> 37m10.59889938s (-3.55325401892896), true
Stime: 4m42.72s -> 4m6.65s (-12.758205998868139), true
Utime: 50m46.15s -> 49m29.17s (-2.5271244029348523), true
FileSize: 706.3063344955444 -> 706.3145418167114 (0.0011620058841542164), true
## Repo bismuth ##
Memory: 2306678784 -> 1018535936 (-55.84404976258715), true
Wtime: 17m1.713103943s -> 3m48.809022005s (-77.60535505300076), true
Stime: 44s -> 9.4s (-78.63636363636364), true
Utime: 17m40.02s -> 3m50.74s (-78.23248617950604), true
FileSize: 491.3974657058716 -> 497.367374420166 (1.214883903749669), true

jfontan commented 5 years ago

I'm fixing the coverage so that the broken FS is also tested with the push code.