pangeo-data / jupyter-earth

Jupyter meets the Earth: combining research use cases in geosciences with technical developments within the Jupyter and Pangeo ecosystems.
https://jupytearth.org
Creative Commons Zero v1.0 Universal

Support scp and rsync with a similar model to how ssh and sftp work #83

Open fperez opened 2 years ago

fperez commented 2 years ago

I've been using the hub for more extensive development, and the need to conveniently synchronize directories keeps coming up. The ssh and sftp support is fantastic, but from my perspective it doesn't quite have the immediate convenience of scp for simple transfers and rsync for more complex synchronization. Nothing beats scp foo.tgz hub.jupytearth.org: for simply sending a big file over, perhaps followed immediately by ssh hub.jupytearth.org to go in and unpack/move it around as needed... Sftp is great for many things, and I'm glad to know that protocol-wise scp is adopting sftp underneath, but I still think the day-to-day value of scp itself is very high...

And for complex directory synchronization (a very common need when working across machines and hubs), sftp is simply not the right tool, and I don't know of anything that matches rsync for combined simplicity and performance.
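For concreteness, the two workflows described above might look like the following sketch. It is illustrative only: the hostname comes from this issue, while the `run` helper, the `DRY_RUN` switch, and the `project/` directory are hypothetical names introduced here. It does not assume that scp or rsync currently work against the hub, so by default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: expresses the desired workflow, not a confirmed hub feature.
HUB="hub.jupytearth.org"     # hostname taken from the issue text
DRY_RUN="${DRY_RUN:-1}"      # hypothetical switch: 1 = print, 0 = execute

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# One-off transfer: copy a single archive into the remote home directory.
run scp foo.tgz "$HUB:"

# Directory synchronization. -a preserves permissions and timestamps,
# -z compresses data in transit, --delete removes remote files that no
# longer exist locally. "project/" is a placeholder directory name.
run rsync -az --delete project/ "$HUB:project/"
```

Setting `DRY_RUN=0` would execute the commands for real, which only makes sense once the hub accepts scp and rsync connections.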

I've been trying quite a bit to make do with the alternatives to see if it was really a matter of adapting my habits, but I'm convinced that robust scp/rsync usage is really a need for at least some very real, legitimate use cases.

I know that the ultimate implementation space is in jupyterhub-ssh and @yuvipanda has already commented in yuvipanda/jupyterhub-ssh#55 opened by @consideRatio, so I'm happy to work there. Just flagging this one as a top-level point for deployment/discussion of that solution in our hub.

BTW - I think it's worth thinking more about generic file transfer ideas as indicated by Erik, and in that context we'll probably want to consider also tools like Globus, which are very relevant in science. But for now, a more pedestrian toolset with these old-school basics (scp and rsync) will already make a big difference (in addition to the sftp support, which I'm grateful for!).