kadena-io / chainweb-node

Chainweb: A Proof-of-Work Parallel-Chain Architecture for Massive Throughput
https://docs.kadena.io/basics/whitepapers/overview
BSD 3-Clause "New" or "Revised" License

Compaction: Slowed to never ending or halted on "Inserting compacted rows into table SYS:Pacts" on chain 0 #1989

Closed: trendzetter closed this issue 2 months ago

trendzetter commented 2 months ago

I started compaction with the following options:

compact --from /db/ --to /mnt/sata1/compact --log-dir /mnt/sata1/logs 
2024-08-16T16:44:58.905Z [Debug] [] Latest Common BlockHeight: 5045607
2024-08-16T16:44:58.905Z [Debug] [] Earliest Common BlockHeight: 852054
2024-08-16T16:44:59.079Z [Debug] [] Compaction target blockheight is: 5044607
2024-08-16T16:44:59.079Z [Debug] [] targetBlockHeight: 5044607

Judging by the log file, the first few hours saw swift progress through the tables, and the RocksDB compaction seems to have completed. SQLite, however, has now been stuck for a week on this message: 2024-08-16T18:14:12.141Z [Info] [chain=0|component=compaction] Inserting compacted rows into table SYS:Pacts (full logs attached below)

The new SQLite database for chain 0 is still growing, but very slowly; it might only be picking up new data from the running node. Compaction has not moved on to the SQLite database of the next chain.

I tried stopping the node for a while, which did not help. Another possible fix, which I can't try right away, is to use an SSD for the destination; it's currently on an HDD.

$ du -ac --max-depth=2 compact/
26584824        compact/0/sqlite
38985404        compact/0/rocksDb
65570232        compact/0
65570236        compact/
65570236        totaal

chain-0.log rocksDb.log

Are there any options you would suggest adding that might improve my results?
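To tell a stalled compaction from a merely slow one, the size of the output directory can be sampled over time. The following is a minimal Python sketch, not part of chainweb-node. The function names (growth_rate, dir_size, watch) are hypothetical, and the path in the usage comment is assembled from the --to directory and the du output above.

```python
import os
import time
from typing import List, Tuple

# One sample is (unix timestamp in seconds, size in bytes).
Sample = Tuple[float, int]


def growth_rate(samples: List[Sample]) -> float:
    """Average growth in bytes per second between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be strictly increasing")
    return (s1 - s0) / (t1 - t0)


def dir_size(path: str) -> int:
    """Total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A temporary file may vanish between listing and stat.
                pass
    return total


def watch(path: str, interval_s: float = 60.0, rounds: int = 10) -> None:
    """Print the directory size and its growth rate once per interval."""
    samples: List[Sample] = []
    for _ in range(rounds):
        samples.append((time.time(), dir_size(path)))
        if len(samples) >= 2:
            rate = growth_rate(samples[-2:])
            print(f"{samples[-1][1]} bytes, {rate:.1f} B/s")
        time.sleep(interval_s)


# Hypothetical usage, with the path taken from this report:
# watch("/mnt/sata1/compact/0/sqlite")
```

A rate that stays near zero over many intervals suggests a real stall, while a small but steady positive rate suggests slow progress. The sketch cannot distinguish compaction output from any other writes into the same directory.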

chessai commented 2 months ago

You might want to try performing compaction on a machine with better IOPS. I'm not sure whether this is a cloud machine or a local one, but SYS:Pacts in particular really needs high IOPS. I definitely don't recommend performing it on a regular HDD.

chessai commented 2 months ago

Note that we will have documentation coming out soon detailing the requirements for compaction.

raduciobanu22 commented 2 months ago

@chessai I'm currently running it on a 4-CPU, 8 GB RAM DigitalOcean cloud machine, writing directly to an SSD. RocksDB finished in about 1 hour; the SQLite chain 0 DB is at 2.4 GB after roughly 11 hours. Does that sound normal?

chessai commented 2 months ago

Yeah, that sounds normal. SYS:Pacts is really a doozy. I recommend scaling up the IOPS temporarily for the compaction job. On cloud machines, using an SSD isn't enough by itself; the volume also has to be configured to deliver higher speed. If that doesn't work, I recommend running compaction on consumer hardware rather than on cloud machines, and then replicating the result to wherever you need it. On my local dev machine, the entire thing finishes in under an hour. On the cloud machines I tested (similar specs to yours), it took ~18-20 hours.
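For a rough comparison of candidate destinations before starting a long run, the rate of small synced writes can be measured directly. This is a minimal Python sketch under stated assumptions: the function name fsync_write_rate is hypothetical, the 4 KiB block size is an arbitrary choice, and sequential synced appends are only a crude proxy for the I/O pattern of the compaction tool. It is not a substitute for a proper benchmark.

```python
import os
import tempfile
import time


def fsync_write_rate(directory: str, writes: int = 200,
                     block_size: int = 4096) -> float:
    """Return the number of synced writes per second achieved in directory.

    Each write appends block_size random bytes to a temporary file and
    is followed by fsync, so the result reflects storage latency rather
    than page-cache speed.
    """
    payload = os.urandom(block_size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the write to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)  # clean up the temporary file
    # Guard against a zero interval on very coarse clocks.
    return writes / max(elapsed, 1e-9)


# Hypothetical usage: compare two candidate destination directories.
# print(fsync_write_rate("/mnt/sata1"))
# print(fsync_write_rate("/path/to/ssd"))
```

Running it against both the hard disk and the SSD mount gives a quick relative figure; a large gap between the two is consistent with the advice above.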

raduciobanu22 commented 2 months ago

Running it locally is a bit of a challenge due to the size of the DB. Will update on the progress. Thanks for the reply!

raduciobanu22 commented 2 months ago

@chessai Good news, compaction is done! A different question: after restarting the node, I am seeing logs like these:

chainweb-1  | 2024-08-28T04:43:29.834Z [Error] [chainwebVersion=mainnet01|peerId=IDjkjc|port=1789|host=*****|chain=17|type=ChainwebApp] pact-service failed: {"tag":"FullHistoryRequired","contents":{"_fullHistoryRequiredEarliestBlockHeight":5071966,"_fullHistoryRequiredGenesisHeight":852054}}. Restarting ...
C"_fullHistoryRequiredGenesisHeight":0}},"message":"Your node has been configured to require the full Pact history; however, the full history is not available. Perhaps you have compacted your Pact state?"}

Is there a config that needs to be adjusted?

chessai commented 2 months ago

You need the --no-full-historic-pact-state command-line flag, or fullHistoricPactState: false in the config YAML.
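To spot this condition automatically in node logs, the JSON payload after "pact-service failed: " can be parsed and its tag checked. This is a minimal Python sketch; the function names are hypothetical, and the only payload shape it relies on is the one visible in the log line quoted in this thread.

```python
import json
from typing import Any, Optional

MARKER = "pact-service failed: "


def parse_pact_failure(line: str) -> Optional[Any]:
    """Return the JSON payload that follows MARKER, or None if absent."""
    start = line.find(MARKER)
    if start == -1:
        return None
    try:
        # raw_decode stops at the end of the first JSON value, so the
        # trailing ". Restarting ..." text is ignored.
        payload, _end = json.JSONDecoder().raw_decode(
            line[start + len(MARKER):])
    except json.JSONDecodeError:
        return None
    return payload


def is_full_history_required(line: str) -> bool:
    """True if the line reports a FullHistoryRequired failure."""
    payload = parse_pact_failure(line)
    return isinstance(payload, dict) and \
        payload.get("tag") == "FullHistoryRequired"


# Example with the payload from the log line in this thread:
line = ('[Error] pact-service failed: {"tag":"FullHistoryRequired",'
        '"contents":{"_fullHistoryRequiredEarliestBlockHeight":5071966,'
        '"_fullHistoryRequiredGenesisHeight":852054}}. Restarting ...')
print(is_full_history_required(line))
```

A match on a compacted database indicates that the node is still configured to require full history, which is the setting the flag above turns off.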

raduciobanu22 commented 2 months ago

Thanks! Restarted and it seems fine, just needs to catch up now.

trendzetter commented 2 months ago

OK, I finally managed to free up enough SSD space to try again. At the time of writing, I have a compacted DB, produced with the default options, of 74 GB.

chessai commented 2 months ago

@trendzetter @raduciobanu22 Out of curiosity, which release distribution did you use to get the compact tool: Docker or the Ubuntu binary?

raduciobanu22 commented 2 months ago

I used the Ubuntu binary because the compact tool is not available in the Docker image.

chessai commented 2 months ago

Did you docker pull? It should be in the latest image.

raduciobanu22 commented 2 months ago

It's not in 2.25.1

trendzetter commented 2 months ago

I'm using the Ubuntu 22.04 binaries.