filecoin-project / lassie

A minimal universal retrieval client library for IPFS and Filecoin

Retrieval deadlock under load in @magik6k's RIBS retrieval #344

Closed hannahhoward closed 12 months ago

hannahhoward commented 1 year ago

@magik6k is having problems where retrieval is locking up in his usage of Lassie within RIBS.

He's finding that retrievals lock up after a few blocks. When he turns ConcurrentSpRetrievals up to 100, it runs for longer but locks up again after 100 or so blocks. His goroutine dump, attached below, suggests a lockup in two places in parallelpeerretriever.go: one in the call to PriorityWaitQueue.wait and the other in retrievalShared.sendEvent. This could be related to #343. Either way, I suspect it's causing the ConcurrentSpRetrievals limit to get hit.

magikdump.txt

hannahhoward commented 1 year ago

Slack discussion: https://filecoinproject.slack.com/archives/CP50PPW2X/p1688392651130039

rvagg commented 1 year ago

Likely fix: https://github.com/ipfs/go-graphsync/pull/428

rvagg commented 12 months ago

Closing due to no further reports of problems and no additional information to guide further investigation.