facebook / wdt

Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.
https://www.facebook.com/WdtOpenSource

command line improvements #144

Open ldemailly opened 7 years ago

ldemailly commented 7 years ago

Dumping some feedback collected internally.

Option for quieter output:

Not a justification, but I’ll explain the issue: the library part of wdt runs in services, where it is important to have somewhat verbose logging in case something goes wrong, so we don’t need to try to “repro” what happened if it’s rare/unexpected. I am thinking maybe we can do like buck etc. and, on the command line, direct the glog output to a file by default, which most users could just ignore and only attach to problem reports.
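As a rough illustration of the "verbose to a file, quiet on the console" idea, here is a minimal sketch using Python's standard `logging` module rather than glog. The logger name `transfer_tool` and the function `configure_logging` are hypothetical and are not part of wdt.

```python
import logging
import sys


def configure_logging(log_path, stream=None):
    """Send all records to a file and only warnings or worse to the console.

    Hypothetical helper for illustration; wdt itself uses glog, not this module.
    """
    logger = logging.getLogger("transfer_tool")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records away from the root logger

    # Verbose log: everything goes to the file, ready to attach to a report.
    file_handler = logging.FileHandler(log_path)
    file_handler.setLevel(logging.DEBUG)
    logger.addHandler(file_handler)

    # Quiet console: only warnings and errors reach the user.
    console = logging.StreamHandler(stream if stream is not None else sys.stderr)
    console.setLevel(logging.WARNING)
    logger.addHandler(console)
    return logger
```

With this split, a user sees output only when something goes wrong, while the full trace stays available in the file for a problem report.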

Potential Url changes

It wouldn’t be a url if the key value pairs weren’t separated by &, but yes, it’s a gotcha, which dovetails into your next point about stdin. It’s by design that we want to read and write the url on stdin: so it doesn’t get mangled by the shell, so it’s safe (not showing in ps or /proc …), and so it’s transmitted securely (via ssh). When using thrift at fb it’s also not a problem, as it’s just a blob. We could maybe add a shell escaping option or use a different format (not the url) for the command line.
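To show what a "shell escaping option" might print, here is a small sketch using Python's standard `shlex.quote`. The URL and its parameter names are made up for illustration and do not reflect the real wdt URL layout.

```python
import shlex


def shell_safe(url):
    """Return the url quoted so a POSIX shell passes it through as one word."""
    return shlex.quote(url)


# Hypothetical url; the '&' would otherwise background the command in a shell.
example_url = "wdt://host.example?id=123&ports=4"
quoted = shell_safe(example_url)
print(quoted)

# Round trip: splitting the quoted form the way a shell would gives back
# exactly one argument, identical to the original url.
assert shlex.split(quoted) == [example_url]
```

Unquoted, the shell would treat everything after the `&` as a separate command, which is the gotcha described above.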

get the URL: wdt recv

use the URL: wdt send URL file1 file2 ...

It actually does work to do both on stdin: the manifest from a file or stdin, the url from stdin or the command line. Adding an option to get the manifest from the command line could be done, but one twist is that the manifest is actually “filename optionalsizetosend optionalmode” in its full spec, and mapping that onto the command line would be harder. You can do (wdt --fork ; echo file1 file2) | wdt --manifest - - which means read both the url and the manifest from stdin; the --fork on the receiver side makes it fork after outputting the url, thus freeing up the process to emit the manifest.
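To make the three-field manifest entry concrete, here is a minimal parsing sketch. The comment above does not state the delimiter or number formats, so this assumes tab-separated fields, a decimal size, and an octal mode. `ManifestEntry` and `parse_manifest_line` are hypothetical names, not wdt APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManifestEntry:
    filename: str
    size_to_send: Optional[int] = None  # None means "not specified"
    mode: Optional[int] = None          # None means "not specified"


def parse_manifest_line(line, sep="\t"):
    """Parse 'filename [size] [mode]' (assumed tab-separated, mode in octal)."""
    fields = line.rstrip("\r\n").split(sep)
    if not fields[0] or len(fields) > 3:
        raise ValueError(f"bad manifest line: {line!r}")
    size = int(fields[1]) if len(fields) > 1 and fields[1] else None
    mode = int(fields[2], 8) if len(fields) > 2 and fields[2] else None
    return ManifestEntry(fields[0], size, mode)
```

The two optional fields are what make a plain `file1 file2 ...` argument list a lossy stand-in for the manifest: a command-line form would need extra syntax to attach a size or mode to each name.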

get the URL: wdt send file1 file2 ...

use the url: wdt recv URL

Right now the url has to be picked by the side acting as the tcp server, which is the receiver. We do have some plans for a “wdt service” that would run on permanent ports on all boxes, plus a client that would just instruct which files to send; the services would talk to each other using thrift for meta and exchange files. This exists in some form within stargate.

You can change the process command line to hide the sensitive part from ps if you want. I get that security is a concern but having to echo is rough from a usability standpoint.

You can add --overwrite if you are ok with overwriting existing files; it’s “safe” by default. Maybe we should mention that option when erroring out.