facebook / wdt

Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.
https://www.facebook.com/WdtOpenSource

Seems slower than alternatives #154

Closed dimitry-ishenko closed 7 years ago

dimitry-ishenko commented 7 years ago

Setup: 2 boxen (localhost and remotehost) on the same 10/100 switch. Transferring a single 4.1G file (large_file).

Test case 1:

    admin@localhost [~/tmp/xfer] time scp large_file remotehost:~/tmp/xfer
    real 1m2.114s
    user 0m22.907s
    sys 0m5.430s

Test case 2:

    admin@remotehost [~/tmp/xfer] socat -u tcp-listen:6969 create:large_file
    admin@localhost [~/tmp/xfer] time socat -u file:large_file tcp:remotehost:6969
    real 1m2.400s
    user 0m0.610s
    sys 0m3.857s

Test case 3:

    admin@remotehost [~/tmp/xfer] wdt -overwrite
    admin@localhost [~/tmp/xfer] time wdt -connection_url 'wdt://remotehost?enc=2:...&id=...&ports=...&recpv=27'
    real 1m25.114s
    user 0m4.010s
    sys 0m2.577s

wdt seems to be the slowest of the 3. Am I doing something wrong?

ldemailly commented 7 years ago

Can you attach the full logs from both sides, run with the -enable_perf_stat_collection flag?

Maybe your destination is on disk (or is otherwise slow) and can't handle the parallelization. You can try -option_type=disk in that case.

Calculating by hand (wdt would output the rate at the end, if you attach that output), it seems your fastest run gets around 67 Mbytes/sec, which is indeed likely to be disk bandwidth. The disk option should fix that (but it'd be much better to get a flash drive or use /dev/shm).
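
The hand calculation above can be reproduced with a short script. This is only a sketch: it assumes "4.1G" means 4.1 * 1000 MB (the thread does not say whether GB or GiB is meant), and the helper names `parse_time` and `throughput_mb_per_s` are made up for illustration.

```python
def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '1m2.114s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def throughput_mb_per_s(size_mb: float, wall: str) -> float:
    """Average transfer rate over the wall-clock ('real') time."""
    return size_mb / parse_time(wall)


# Assumption: "4.1G" is read as 4.1 * 1000 MB.
SIZE_MB = 4.1 * 1000

# 'real' times copied from the three test cases in the thread.
wall_times = {"scp": "1m2.114s", "socat": "1m2.400s", "wdt": "1m25.114s"}

for tool, wall in wall_times.items():
    print(f"{tool}: {throughput_mb_per_s(SIZE_MB, wall):.1f} MB/s")
```

Under that size assumption, scp and socat come out at roughly 66 MB/s and wdt at roughly 48 MB/s, in line with the "around 67 Mbytes/sec" estimate.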

PS: note that even in your output, wdt uses about 6.5 CPU seconds while scp uses 28, so it's less taxing on the system. It must be slower because of some bottleneck that the logs and the options above will show.
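
The CPU figures quoted here are the sum of the `user` and `sys` lines from each `time` output. A minimal sketch of that sum, using the numbers posted in the test cases (the helper names are hypothetical):

```python
def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '0m22.907s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def cpu_seconds(user: str, sys_: str) -> float:
    """Total CPU time: user-space time plus kernel (sys) time."""
    return parse_time(user) + parse_time(sys_)


# (user, sys) pairs copied from the three test cases in the thread.
runs = {
    "scp": ("0m22.907s", "0m5.430s"),
    "socat": ("0m0.610s", "0m3.857s"),
    "wdt": ("0m4.010s", "0m2.577s"),
}

for tool, (user, sys_) in runs.items():
    print(f"{tool}: {cpu_seconds(user, sys_):.3f} CPU seconds")
```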

dimitry-ishenko commented 7 years ago

@ldemailly here are the logs with -enable_perf_stat_collection enabled.

Re-run of the original transfer: localhost.log.txt remotehost.log.txt

Same with -option_type=disk: localhost-disk.log.txt remotehost-disk.log.txt

NB: localhost is the sender and remotehost is the receiver

You are correct: both source and destination drives are HDDs. I will try using a flash drive as the destination next to see if it makes a difference.

ldemailly commented 7 years ago

I think the issue is that scp/socat are "cheating", in the sense that when they end, the buffers are still being written to the disk. You can confirm that by running "iostat -w 1" on the receiving machine: you should see writes still going on (disk busyness) after they return, while wdt forces periodic syncing by default.
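
The effect described here (a writer returning while data is still in OS buffers) can be illustrated with a small, generic experiment. This is not wdt's implementation, just a sketch of buffered versus synced writes; `timed_write` is a hypothetical helper, and the actual timings depend entirely on the machine and disk.

```python
import os
import tempfile
import time


def timed_write(path: str, data: bytes, sync: bool) -> float:
    """Write data to path and return the elapsed seconds.

    With sync=False the call may return while the data is still only in
    the OS page cache; with sync=True, os.fsync asks the OS to push it to
    the storage device before the timer stops.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        if sync:
            f.flush()
            os.fsync(f.fileno())
    return time.perf_counter() - start


payload = os.urandom(8 * 1024 * 1024)  # 8 MiB of test data

with tempfile.TemporaryDirectory() as tmp:
    buffered = timed_write(os.path.join(tmp, "buffered.bin"), payload, sync=False)
    synced = timed_write(os.path.join(tmp, "synced.bin"), payload, sync=True)

print(f"buffered write: {buffered:.4f}s")
print(f"synced write:   {synced:.4f}s")
```

On a spinning disk the synced write will typically take noticeably longer, which is the kind of gap that can make a tool that syncs look slower than one that does not.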

if you want the same behavior try adding -num_port=1 -disk_sync_interval_mb=-1

dimitry-ishenko commented 7 years ago

@ldemailly you are right. Once I added the -disk_sync_interval_mb=-1 option, I got a comparable time.

Thank you for clarifying. This issue can be closed now.