Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.
Log messages are way too verbose by default. Ideally there would be no messages except the progress bar, URL, etc. As it stands, it's super hard to understand what's going on or what's wrong.
Not a justification, but I’ll explain the issue:
The library part of wdt runs in services, where it is important to have somewhat verbose logging in case something goes wrong, so we don’t need to try to “repro” what happened if it’s rare/unexpected.
I am thinking maybe we can do like buck etc. and direct the glog output to a file by default on the command line, which most users could just ignore and only attach to problem reports.
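The "log to a file by default" idea can be tried out from the shell today. This is only a minimal sketch: `run_quiet` and `noisy_tool` are hypothetical names, not part of wdt, and it assumes the verbose log lines go to stderr while progress goes to stdout.

```shell
#!/bin/sh
# run_quiet is a hypothetical helper, not part of wdt.
# It runs a command with stderr captured in a log file and mentions
# the log only if the command fails.
run_quiet() {
    log=$1
    shift
    if "$@" 2>"$log"; then
        return 0
    else
        status=$?
        printf 'failed (exit %d); please attach %s to the problem report\n' "$status" "$log"
        return "$status"
    fi
}

# Stand-in for a noisy tool: progress on stdout, verbose detail on stderr.
noisy_tool() {
    echo "I0101 verbose detail" >&2
    echo "progress: 100%"
}

log_file=$(mktemp)
run_quiet "$log_file" noisy_tool
```

On success the user sees only the progress line; the verbose detail is still on disk if a problem report needs it.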
Potential URL changes
Using ampersands in the URL means I have to quote things. It seems needless; can you use a different separator, one that isn't a shell metacharacter?
It wouldn’t be a URL if the key-value pairs weren’t separated by &, but yes, it’s a gotcha, which dovetails into your next point about stdin. It’s by design that we want to read and write the URL on stdin: so it doesn’t get mangled by the shell, so it’s safe (not showing in ps or /proc …), and so it’s transmitted securely (via ssh).
When using thrift at fb it’s also not a problem, as it’s just a blob.
We could maybe add a shell escaping option or use a different format (not the URL) for the command line.
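To make the quoting gotcha concrete, here is a small sketch that does not call wdt at all. The URL is made up (host and key names are placeholders, not real wdt output) and `show_arg` is a hypothetical stand-in for a program that receives the URL.

```shell
#!/bin/sh
# Made-up URL for illustration; host and key=value pairs are placeholders.
url='wdt://host.example?key1=a&key2=b'

# Hypothetical stand-in for a program that takes the URL as its argument.
show_arg() {
    printf 'got: %s\n' "$1"
}

# Quoted on the command line: the URL arrives intact as one argument.
# Typed unquoted at a prompt, the shell would treat & as "run in the
# background" and split the command there.
show_arg "$url"

# Read from stdin instead: no quoting needed, nothing on the command line.
printf '%s\n' "$url" | {
    IFS= read -r from_stdin
    show_arg "$from_stdin"
}
```

Both calls see the same string; the stdin route simply never exposes the URL to shell parsing or to ps.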
Expecting URLs on stdin is awkward, especially when you want a manifest file option. I would suggest finding a way to allow URLs on the command line, and files too. Something like...
get the URL: wdt recv
use the URL: wdt send URL file1 file2 ...
It actually does work to do both on stdin: the manifest can come from a file or stdin, and the URL from stdin or the command line. Adding an option to get the manifest from the command line could be done, but one twist is that the manifest is actually “filename optionalsizetosend optionalmode” in its full spec, and mapping that onto the command line would be harder.
You can do (wdt --fork ; echo file1 file2) | wdt --manifest - -
which means read both the URL and the manifest from stdin; the --fork on the receiver side makes it fork after outputting the URL, thus freeing up the process to emit the manifest.
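The shape of that pipeline can be seen with stand-ins instead of real wdt processes. In this sketch `fake_recv` and `fake_send` are hypothetical functions, the URL is made up, and it assumes one manifest entry per line.

```shell
#!/bin/sh
# Hypothetical stand-in for the receiver: prints a made-up URL on stdout.
fake_recv() {
    printf '%s\n' 'wdt://host.example?key1=a&key2=b'
}

# Hypothetical stand-in for the sender: the first stdin line is the URL,
# every following line is treated as a manifest entry.
fake_send() {
    IFS= read -r url
    printf 'url: %s\n' "$url"
    while IFS= read -r entry; do
        printf 'manifest entry: %s\n' "$entry"
    done
}

# Same shape as the wdt pipeline: the subshell's stdout is one stream,
# URL first, then the manifest lines.
( fake_recv ; printf '%s\n' file1 file2 ) | fake_send
```

The point is only the ordering: whatever the first command in the subshell prints is consumed as the URL, and everything after it is read as the manifest.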
Being able to specify the URL on either the sender or the receiver would be nice (maybe it can do that already?), so it would then be something like
get the URL: wdt send file1 file2 ...
use the URL: wdt recv URL
Right now the URL has to be picked by the side acting as the TCP server, which is the receiver. We do have some plans for a “wdt service” that would run on permanent ports on all boxes, with a client that would just instruct which files to send; the services would talk to each other using thrift for metadata and exchange the files. This exists in some form within stargate.
You can change the process command line to hide the sensitive part from ps if you want. I get that security is a concern but having to echo is rough from a usability standpoint.
It seems like the tool fails if the destination has already been copied?
You can add --overwrite if you are OK with overwriting existing files; it’s “safe” by default.
Maybe we should mention that option when erroring out.
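As a rough illustration of what "mention that option when erroring out" could look like, here is a sketch. `check_dest` is a hypothetical helper and the message wording is invented; it is not what wdt prints.

```shell
#!/bin/sh
# check_dest is a hypothetical helper; the message wording is made up.
# Usage: check_dest PATH yes|no   (second argument: overwrite allowed?)
check_dest() {
    dest=$1
    overwrite=$2
    if [ -e "$dest" ] && [ "$overwrite" != yes ]; then
        printf 'error: %s already exists; add --overwrite to replace it\n' "$dest" >&2
        return 1
    fi
    return 0
}

existing=$(mktemp)
check_dest "$existing" no || echo "transfer refused"
check_dest "$existing" yes && echo "transfer allowed"
rm -f "$existing"
```

The default stays safe, and the error itself tells the user which option changes the behavior.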
Dumping some feedback collected internally.