pachadotdev / analogsea

Digital Ocean R client
https://pacha.dev/analogsea/
Apache License 2.0

droplet_download and upload should use ssh and tar #87

Closed wch closed 9 years ago

wch commented 9 years ago

On my slow cafe connection, downloading 130 files totaling 878KB via scp takes 69 seconds. Using ssh and sending a .tgz file over a pipe takes 3.7 seconds.

$ time scp -r -o BatchMode=yes -o StrictHostKeyChecking=no \
  -o UserKnownHostsFile=/tmp/adsfasdf/hosts \
  analogsea@104.131.186.100:/srv/r-check/results/676da4ba4b59 ./
R6-Ex.Rout                                          100% 7531     7.4KB/s   00:00    
R6-Ex.pdf                                           100% 3611     3.5KB/s   00:01    
Rdlatex.log                                         100%   22KB  22.1KB/s   00:00    
...
...

real    1m9.288s
user    0m0.015s
sys     0m0.019s
$ time ssh -o BatchMode=yes -o StrictHostKeyChecking=no \
  -o UserKnownHostsFile=/tmp/adsfasdf/hosts  \
  analogsea@104.131.186.100 "cd /srv/r-check/results/ && tar cz 676da4ba4b59"  \
  | tar xz

real    0m3.730s
user    0m0.021s
sys     0m0.027s

I believe scp is so slow because it transfers each file separately. For droplet_download and droplet_upload, when doing multiple files, it will be much more efficient to use ssh and tar.

sckott commented 9 years ago

Sounds good! Looks like the changes are in fff8dffbcc254d0b5d9598edf076072ab95108cd

wch commented 9 years ago

I did it for downloads, but I think uploads will be a little trickier. The logic for renaming and overwriting files and directories is implemented in R for downloads, but for uploads it would need to be implemented in the shell on the upload side.
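For reference, the upload direction of the same ssh-and-tar idea could look roughly like the sketch below. It is only an illustration, not code from analogsea: `upload_dir` and the `REMOTE_SH` variable are hypothetical names, and the rename/overwrite handling mentioned above is deliberately left out (tar simply overwrites files that already exist under the destination).

```shell
#!/bin/sh
# Hypothetical helper, not part of analogsea: upload a local directory by
# streaming a gzipped tar archive to a shell on the receiving side.
#
# REMOTE_SH is an assumption of this sketch: the command that runs a script
# on the receiving side, e.g. "ssh analogsea@<droplet-ip>". For a local dry
# run it can be set to "sh -c".
upload_dir() {
  src=$1    # local directory to send
  dest=$2   # parent directory on the receiving side (no single quotes in it)
  parent=$(dirname "$src")
  name=$(basename "$src")

  # Pack locally, unpack on the receiving side: one stream for all files
  # instead of one transfer per file. "-f -" makes tar use stdout/stdin
  # explicitly rather than relying on the platform default.
  tar -czf - -C "$parent" "$name" |
    $REMOTE_SH "mkdir -p '$dest' && tar -xzf - -C '$dest'"
}
```

With `REMOTE_SH="ssh analogsea@<droplet-ip>"`, calling `upload_dir ./676da4ba4b59 /srv/r-check/results` would recreate the directory under the remote path. Any renaming or refusal to overwrite would have to be added to the remote command string, which is the tricky part described above.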

Also, I have my RStudio configured to strip trailing whitespace so there were a lot of changes that were just whitespace. What do you think about enabling that option in the .Rproj file?

sckott commented 9 years ago

Ah, good point. Added the strip-trailing-whitespace option to the .Rproj file.

wch commented 9 years ago

Great, thanks!

sckott commented 9 years ago

@wch Do you think this is done? Or more to do?

wch commented 9 years ago

I never implemented this for uploads, but I don't mind if you close the issue, since my use case was satisfied -- I just needed faster downloads.

sckott commented 9 years ago

Okay, closing this for now. We can revisit it for uploads later if needed.