filecoin-project / boost

Boost is a tool for Filecoin storage providers to manage data storage and retrievals on Filecoin.

[Support Ticket]: bitswap stops responding after sector is tried but not found #1811

Closed bobdubois closed 10 months ago

bobdubois commented 10 months ago

Boost component

Boost Version

boostd version 2.1.0-rc1+mainnet+git.0cd9d5d.dirty

Describe the problem

Bitswap stops working after a request hits a sector that cannot be found. There are no other log entries: retrievals work right up to that point, then stop entirely. The service keeps running without other errors, but retrievals time out. Sector 4011 is NOT available on the system, so it shouldn't even be requested. Restarting booster-bitswap resolves the issue, but that is not a workable fix.

Logging Information

booster-bitswap log:
2023-11-08T03:04:13.691+0100    INFO    remote-blockstore   remoteblockstore/remoteblockstore.go:68 Get failed  {"cid": "bafkreidrspvkbkkwugvvf4peo3fx6d3u2ac4taiz3lrqwr52alql5mpr7a", "error": "1 error occurred:\n\t* getting piece reader: 1 error occurred:\n\t* getting reader over sector 4011: sector is not unsealed\n\n\n\n"}
2023-11-08T03:04:13.691+0100    ERROR   engine  decision/blockstoremanager.go:121   blockstore.Get(bafkreidrspvkbkkwugvvf4peo3fx6d3u2ac4taiz3lrqwr52alql5mpr7a) error: 1 error occurred:
    * getting piece reader: 1 error occurred:
    * getting reader over sector 4011: sector is not unsealed
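
For triage, it can help to see which sectors and CIDs keep failing and how often. The following is a minimal sketch, not part of Boost, that assumes the booster-bitswap log keeps the "Get failed" line format quoted above. The function name `count_unsealed_failures` and the `SAMPLE` constant are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Patterns derived from the "Get failed" lines quoted in this ticket.
# They are assumptions about the log format, not a stable Boost interface.
UNSEALED_RE = re.compile(r"getting reader over sector (\d+): sector is not unsealed")
CID_RE = re.compile(r'"cid":\s*"([^"]+)"')

# One log line copied from the ticket, used as sample input.
SAMPLE = (
    r'2023-11-08T03:04:13.691+0100    INFO    remote-blockstore   '
    r'remoteblockstore/remoteblockstore.go:68 Get failed  '
    r'{"cid": "bafkreidrspvkbkkwugvvf4peo3fx6d3u2ac4taiz3lrqwr52alql5mpr7a", '
    r'"error": "1 error occurred:\n\t* getting piece reader: 1 error occurred:'
    r'\n\t* getting reader over sector 4011: sector is not unsealed\n\n\n\n"}'
)


def count_unsealed_failures(lines):
    """Count failed Get calls per (sector number, CID) pair.

    Only lines containing "Get failed" are considered, so the matching
    ERROR line from the decision engine is not counted twice.
    """
    counts = Counter()
    for line in lines:
        if "Get failed" not in line:
            continue
        sector = UNSEALED_RE.search(line)
        cid = CID_RE.search(line)
        if sector and cid:
            counts[(int(sector.group(1)), cid.group(1))] += 1
    return counts
```

Passing an open log file (any iterable of lines) to `count_unsealed_failures` returns a `Counter`, so `most_common()` lists the sector and CID pairs that are retried most often.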

Repo Steps

  1. Run '...'
  2. Do '...'
  3. See error '...' ...

bobdubois commented 10 months ago

And it's not just one sector:

2023-11-08T08:54:02.571+0100 INFO remote-blockstore remoteblockstore/remoteblockstore.go:68 Get failed {"cid": "bafkreigxgnwwhkv3osgskxtjpcuqzvp53pry5gcndnzr45pef3kwgzudwm", "error": "1 error occurred:\n\t getting piece reader: 1 error occurred:\n\t getting reader over sector 4010: sector is not unsealed\n\n\n\n"}
2023-11-08T08:54:02.571+0100 ERROR engine decision/blockstoremanager.go:121 blockstore.Get(bafkreigxgnwwhkv3osgskxtjpcuqzvp53pry5gcndnzr45pef3kwgzudwm) error: 1 error occurred:

It retries the sector multiple times; bitswap doesn't crash on the first occurrence of this log entry.

bobdubois commented 10 months ago

Boost announces to IPNI before handing off to the sealing pipeline. What if the sector never gets sealed (as in this case) and IPNI is not made aware? Is that why this happens? Even so, bitswap shouldn't choke on it.

bobdubois commented 10 months ago

Graphsync retrievals have stopped working as well. I found no related entries in boostd.log, though.

ischasny commented 10 months ago

Seems to have been resolved by updating to RC3.