facebook / wdt

Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.
https://www.facebook.com/WdtOpenSource

wdt to accept input from stdin #196

Open scherepanov opened 5 years ago

scherepanov commented 5 years ago

Thanks for a very nice and useful tool. The wdt command-line tool lacks the ability to accept data from stdin, and I suspect that the wdt library itself cannot accept input from stdin either. I have a case where I need to avoid writing a very large intermediate data set to disk. It would be perfect if I could pipe data to wdt's stdin. (I produce data fast enough to saturate the network link.) One solution seems to be to link against the wdt library and use its API. I have not looked at the API, but I am pretty sure I would be able to feed the wdt library memory buffers and have them sent to a remote file. On the other hand, it would be a much less involved approach if the wdt command-line tool could accept data from stdin. Please let me know whether adding stdin support to wdt is feasible, and whether a PR for it would be accepted.

ldemailly commented 5 years ago

If you look at the examples, you will see that wdt already uses stdin/stdout for exchanging the secret key between source and destination.

If the data you want to transfer is already in memory, you can indeed use the library to make a data source. But remember that most of the benefit of wdt comes from splitting files and sending chunks in parallel, which you can't do if you read sequentially from a single source.
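To illustrate the point about splitting files, here is a minimal Python sketch of how a file of known size can be cut into independent byte ranges. It is not wdt's actual code: the names `split_into_chunks` and `read_range` and the chunk size are assumptions made for illustration. Because the size is known up front and the file is seekable, every range can be computed at once and read by a different thread.

```python
def split_into_chunks(file_size, chunk_size):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def read_range(path, offset, length):
    """Read one chunk independently of the others; requires a seekable file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

A pipe has neither a known size nor the ability to seek, so the ranges cannot be planned in advance this way; any chunking has to happen as the data arrives.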

scherepanov commented 5 years ago

Thanks for the quick answer! Yes, I saw that the session key is exchanged between the receiver and sender wdt processes over stdin.

That does not really help, as I want to supply data to the stdin of wdt on the sender. The receiver should write to a file (though it would not hurt if the receiver could also write to stdout).

"You cannot send chunks in parallel if reading from stdin": that would be a good interview question, how to read data from stdin, chunk it, and send the chunks in parallel. I doubt you would be unable to answer it. I have coded a couple of utilities that do exactly this. The speed of a Linux pipe is close to 2500 MB/sec; you can measure it with
dd if=/dev/zero bs=1M count=1000 | dd of=/dev/null bs=1M count=1000
That speed is quite sufficient to feed wdt for data transfer.
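One way the "read from stdin, chunk, send in parallel" idea could look is sketched below in Python. This is only an illustration under stated assumptions, not the utilities mentioned above and not the wdt API: `send_chunk` is a hypothetical callback standing in for whatever actually transmits a chunk, and `max_in_flight` is an assumed knob that bounds memory use. The stream is read sequentially, each chunk is tagged with its byte offset, and worker threads send chunks concurrently so the receiver can write each one at the right position.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def read_chunks(stream, chunk_size):
    """Yield (offset, data) pairs read sequentially from a non-seekable stream."""
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:  # empty read means end of stream
            break
        yield offset, data
        offset += len(data)


def send_parallel(stream, send_chunk, chunk_size=1 << 20, workers=4, max_in_flight=8):
    """Read the stream in order, but hand chunks to worker threads for sending.

    send_chunk(offset, data) is a hypothetical callback. At most max_in_flight
    chunks are buffered, so a fast producer cannot exhaust memory.
    """
    slots = threading.BoundedSemaphore(max_in_flight)
    errors = []

    def task(offset, data):
        try:
            send_chunk(offset, data)
        except Exception as exc:  # remember the failure and report it at the end
            errors.append(exc)
        finally:
            slots.release()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for offset, data in read_chunks(stream, chunk_size):
            slots.acquire()  # blocks the reader while too many chunks are pending
            pool.submit(task, offset, data)
    # Leaving the "with" block waits for all submitted chunks to finish.
    if errors:
        raise errors[0]
```

With a real sender, the call might look like `send_parallel(sys.stdin.buffer, my_send)`, where `my_send` is again a placeholder. The read itself stays sequential; only the sending is parallel, which is enough when the pipe is faster than the network link.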

On Linux, the ability to read from stdin and write to stdout is a high-performance option and should not be ignored.

And it looks like you answered my first question: yes, it is possible to use the wdt API to send chunks of memory in parallel.

What about the next question: if there were a PR adding stdin/stdout support to wdt, would it be accepted?