filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Restarting the data transfer increases bytes transferred with no network traffic #5225

Closed xinaxu closed 3 years ago

xinaxu commented 3 years ago

Describe the bug First, there have been a few bug reports about data transfers getting stuck, so I'm not repeating that here. The issue my miner experienced occurs when I restart the data transfer using lotus-miner data-transfers restart XXX. The number of bytes transferred increases, but there is no network activity.

To Reproduce Steps to reproduce the behavior: Assume there is a data transfer channel for storage deals that has transferred 321MiB and then got stuck.

  1. Run lotus-miner data-transfers restart XXX, where XXX is the channel ID listed by lotus-miner data-transfers list.
  2. Run watch lotus-miner data-transfers list. The bytes transferred for the restarted channel keep increasing until they reach 642MiB, then the transfer gets stuck again.
  3. Meanwhile, monitoring network usage with nload shows no obvious traffic increase while the bytes transferred keep increasing.
  4. Repeat and observe that the data transfer channel starts again and gets stuck at 321MiB * N. The value can keep increasing far beyond the piece size.

Expected behavior Either the data transfer should resume with real network traffic, or the bytes transferred should not increase.

Screenshots [image]

Version (run lotus version): lotus version 1.3.0+git.19d457ae5.dirty

Additional context https://filecoinproject.slack.com/archives/C01AZP8BKRQ/p1608309045337100?thread_ts=1608251272.292100&cid=C01AZP8BKRQ @dineshshenoy

dirkmc commented 3 years ago

Thank you for the bug report.

I think the underlying cause is that on restart we are double-counting sent / received blocks in the data transfer layer. I've opened an issue here: https://github.com/ipfs/go-graphsync/issues/141

dineshshenoy commented 3 years ago

Closing since this is being tracked in graphsync#141