filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Splitstore fails early compactions on receipts not found #10260

Open ZenGround0 opened 1 year ago

ZenGround0 commented 1 year ago

Checklist

Lotus component

Lotus Version

idk

Describe the Bug

Thanks to @ribasushi, we have some evidence of compaction erroring on startup because receipts are not added to the snapshot. It appears that this happens in messages mode, but it may be discard mode (need to examine the config below).

This is a bug that we can fix. However, this problem resolves itself after ~4 finalities, once the receipts are computed, so it's probably not the root cause of discard not discarding.

Logging Information

Compaction error logs

2023-02-04T21:14:37.473Z    ERROR   splitstore  splitstore/splitstore_compact.go:536    COMPACTION ERROR: error marking: error walking block (cid: bafy2bzaceczpzccm2faelihcxwtyhc5zjz6xrim22irzur7j5yrs7krd6fohu): error walking message receipts (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): error scanning linked block (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): ipld: could not find bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq
2023-02-05T00:02:56.098Z    ERROR   splitstore  splitstore/splitstore_compact.go:536    COMPACTION ERROR: error marking: error walking block (cid: bafy2bzacecwes64aagmt5eyerscpbicvshfzfqrribh2sx6mw5za5bd24mx6u): error walking message receipts (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): error scanning linked block (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): ipld: could not find bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq
2023-02-05T03:36:34.031Z    ERROR   splitstore  splitstore/splitstore_compact.go:536    COMPACTION ERROR: error marking: error walking block (cid: bafy2bzaceczpzccm2faelihcxwtyhc5zjz6xrim22irzur7j5yrs7krd6fohu): error walking message receipts (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): error scanning linked block (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): ipld: could not find bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq
2023-02-05T08:21:33.125Z    ERROR   splitstore  splitstore/splitstore_compact.go:536    COMPACTION ERROR: error marking: error walking block (cid: bafy2bzaceb73otita5kkb33tc7afwbhkqjf34b7criguoxjlmc5xguufjz2be): error walking message receipts (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): error scanning linked block (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): ipld: could not find bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq
2023-02-05T14:09:33.845Z    ERROR   splitstore  splitstore/splitstore_compact.go:536    COMPACTION ERROR: error marking: error walking block (cid: bafy2bzacecwes64aagmt5eyerscpbicvshfzfqrribh2sx6mw5za5bd24mx6u): error walking message receipts (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): error scanning linked block (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): ipld: could not find bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq
2023-02-05T21:33:47.433Z    ERROR   splitstore  splitstore/splitstore_compact.go:536    COMPACTION ERROR: error marking: error walking block (cid: bafy2bzaceb73otita5kkb33tc7afwbhkqjf34b7criguoxjlmc5xguufjz2be): error walking message receipts (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): error scanning linked block (cid: bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq): ipld: could not find bafy2bzaced6qwy3qvlykm23olkomk7sqvpj3vqboueewavqz2oaryi5qj3pbq
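
All six errors above report the same missing receipts CID, while the block CID being walked varies. A minimal sketch for tallying this from a larger log follows. It assumes the error text keeps the `ipld: could not find <cid>` suffix seen above; `countMissing` is a hypothetical helper, not part of Lotus.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

// missingCID matches the trailing "ipld: could not find <cid>" part of a
// splitstore compaction error line (format assumed from the logs above).
var missingCID = regexp.MustCompile(`ipld: could not find (\S+)`)

// countMissing tallies how often each CID is reported as missing.
// Lines that do not contain the pattern are ignored.
func countMissing(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		if m := missingCID.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Read log lines from stdin, one per line.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	for cid, n := range countMissing(lines) {
		fmt.Printf("%s missing %d time(s)\n", cid, n)
	}
}
```

Piping the node log into this program gives one line per distinct missing CID, which makes it easy to see whether a single receipts root is responsible for every failed compaction.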

node configured:

LOTUS_CHAIN_BADGERSTORE_DISABLE_FSYNC=1 \
LOTUS_CHAINSTORE_ENABLESPLITSTORE=1 \
LOTUS_CHAINSTORE_SPLITSTORE_COLDSTORETYPE=discard \
LOTUS_CHAINSTORE_SPLITSTORE_MARKSETTYPE=badger \
LOTUS_CHAINSTORE_SPLITSTORE_HOTSTOREFULLGCFREQUENCY=1 \
LOTUS_CHAINSTORE_SPLITSTORE_COLDSTOREFULLGCFREQUENCY=0 \


Repo Steps

Start the node with splitstore enabled using the above config and watch the logs for 2 days.

ZenGround0 commented 1 year ago

I'm starting to suspect that this is an issue with not entering warmup properly, since I think this is exactly the point of warmup. Will need to dig further.