filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Stalled retrievals stop new incoming storage deals from transferring #7539

Closed rjan90 closed 2 years ago

rjan90 commented 2 years ago

Checklist

Lotus component

Lotus Version

Daemon:  1.13.0-rc2+mainnet+git.a1a7ea66c+api1.4.0
Local: lotus version 1.13.0-rc3+mainnet+git.c23cff45f

Describe the Bug

If the storage provider receives retrievals that stall, for example because of a stream reset, later incoming storage deals appear not to start transferring. I have experienced this behavior twice, so I thought it was worth a ticket.

This was the output of lotus-miner data-transfers list at the time:

Sending Channels

ID                   Status           Sending To   Root Cid     Initiated?  Transferred  Voucher                                   
1634579593870108898  Ongoing          ...m5rYc8VD  ...3zpbzada  N           31.94MiB     ...lIncrease":1048576,"UnsealPrice":"0"}  
  Message: stream reset
1634580045873407428  Ongoing          ...m5rYc8VD  ...3zpbzada  N           362.5MiB     ...lIncrease":1048576,"UnsealPrice":"0"}  
  Message: stream reset
1634580530481520217  Ongoing          ...m5rYc8VD  ...kpmoyiim  N           2.658MiB     ...lIncrease":1048576,"UnsealPrice":"0"}  
  Message: stream reset
1634597923690491082  Ongoing          ...m5rYc8VD  ...5ozef35m  N           21.99GiB     ...lIncrease":1048576,"UnsealPrice":"0"}  
  Message: stream reset

Receiving Channels

ID                   Status           Receiving From  Root Cid     Initiated?  Transferred  Voucher                                   
1631206380224510302  Ongoing          ...iTLZXDjt     ...6v3dmo6m  N           0B           ...m2pq2qxf25n3jmpo4bmu5wjudnsy5up7yy"}}  
1634604531202059314  Ongoing          ...bc8Hs8Nd     ...ahxxky3a  N           0B           ...bujb74qtmegatn3cf77ptei46kgpalfnbi"}}  
  Message: graphsync request cancelled
1634622720975052977  Ongoing          ...MTqHYBjw     ...qnpM3JX1  N           0B           ...nlufzhrvyrhbyudklwxtkrkpmtygmkkmni"}}  
  Message: graphsync request cancelled

All the receiving storage deals arrived after the retrievals had stalled. The first time I experienced this, a full clean-up of all the transfers (cancelling the retrievals and the receiving channels) and a restart of the markets node made incoming storage deals transfer again.
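
For anyone scripting the clean-up described above, here is a minimal sketch that pulls the IDs of stalled channels out of the listing text. It assumes the output format shown in this report; the function name `stalled_channel_ids` and the `STALL_MESSAGES` tuple are hypothetical helpers, not part of Lotus. What to do with the IDs (for example, passing them to the data-transfers cancel subcommand) is left to the operator, and the exact arguments should be checked against the help output of the installed version.

```python
import re

# Messages seen in the listing in this report; extend as needed (assumption).
STALL_MESSAGES = ("stream reset", "graphsync request cancelled")

# A channel row starts with a numeric transfer ID followed by a status word.
ROW_RE = re.compile(r"^(\d+)\s+(\S+)\s")


def stalled_channel_ids(listing: str) -> list:
    """Return IDs of channels whose 'Message:' line reports a stall.

    Channels without a 'Message:' line are not flagged, even if they
    show 0B transferred.
    """
    stalled = []
    current_id = None
    for line in listing.splitlines():
        text = line.strip()
        row = ROW_RE.match(text)
        if row:
            # Remember the most recent channel row; a message may follow it.
            current_id = row.group(1)
            continue
        if text.startswith("Message:") and current_id is not None:
            message = text[len("Message:"):].strip()
            if message in STALL_MESSAGES:
                stalled.append(current_id)
            current_id = None
    return stalled


# Shortened sample in the same layout as the listing in this report.
SAMPLE = """\
Sending Channels

ID                   Status   Sending To   Root Cid     Initiated?  Transferred  Voucher
1634579593870108898  Ongoing  ...m5rYc8VD  ...3zpbzada  N           31.94MiB     ...
  Message: stream reset

Receiving Channels

ID                   Status   Receiving From  Root Cid     Initiated?  Transferred  Voucher
1631206380224510302  Ongoing  ...iTLZXDjt     ...6v3dmo6m  N           0B           ...
1634604531202059314  Ongoing  ...bc8Hs8Nd     ...ahxxky3a  N           0B           ...
  Message: graphsync request cancelled
"""

if __name__ == "__main__":
    for transfer_id in stalled_channel_ids(SAMPLE):
        print(transfer_id)
```

The sketch matches on the message text rather than on the status column, because every channel in the listing above still reports Ongoing even when it has stalled.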

Logging Information

graphsync_allocator
relay
conngater
paych
markets-rtvl-reval
retrieval
retrievaladapter
diversityFilter
chain
cli
providerstates
test-logger
cborrrpc
panic-reporter
ffi-wrapper
addrutil
routedhost
storedask
autonat
nat
swarm2
tarutil
fsjournal
wallet
retrieval-discovery
cliutil
storagemarket_impl
storageadapter
mocknet
filestore
reuseport-transport
sub
modules
ulimit
bs:peermgr
bs:sess
retrievalmarket_impl
importmgr
miner
piecestore
graphsync_network
peerstore/ds
blockservice
fsutil
bitswap_network
dt_graphsync
client
auth
alerting
statetree
storagemarket_network
ping
cmds/http
storagemrkt
discovery
table
cmds
rand
pathresolv
gen
markets-rtvl
system
builder
repo
providers
quic-transport
connmgr
boguskey
amt
blockstore
paramfetch
blankhost
peerstore
main
partialfile
actors
routing/record
bitswap
chunk
badgerbs
messagesigner
disputer
dagstore/upgrader
metrics
bs:sprmgr
data-transfer
stream-upgrader
dht.pb
gs_request_executor
autorelay
node
data_transfer_network
pubsub
build
ipns
sectoraccessor
fil-consensus
messagepool
watchdog
storageminer
tracing
types
fsm
dt-impl
fullnode
peermgr
p2p-config
unixfs
gs-traversal
p2pnode
stores
evtsm
badger
dht/RtRefreshManager
backupds
beacon
statemgr
retrieval_network
gs-asyncloader
gs-unverifiedbs
dht
events
market_adapter
graphsync
hello
net/conngater
rpc
chainstore
sectors
dt-chanmon
tcp-tpt
vm
incrt
wallet-ledger
splitstore
eventlog
markets
ffiwrapper
rpcenc
engine
dagstore
net/identify
drand
advmgr
genesis
mplex
payment-channel-settler
lock
chainxchg
retrieval-fsm
basichost
preseal

Repro Steps

  1. Receive retrieval deals that error out or stop transferring
  2. Receive a storage deal
  3. The storage deal won't start transferring ...

rjan90 commented 2 years ago

I have not seen this issue on the release candidates for the upcoming 1.13.2. Stalled retrievals no longer seem to block new incoming storage deals on release candidate 2 and up!