EOSIO / eos

An open source smart contract platform
https://developers.eos.io/manuals/eos
MIT License
database dirty flag set (likely due to unclean shutdown): replay required #4742

Closed hyqsfaf closed 5 years ago

hyqsfaf commented 6 years ago

I don't want to delete the data and resynchronize. How can I fix this?

taokayan commented 6 years ago

please use --hard-replay

lcgogo commented 6 years ago

You can replay the chain with "--replay-blockchain --hard-replay-blockchain" when starting nodeos, as described in the comments in https://github.com/EOSIO/eos/issues/4002. However, a replay of the whole chain currently takes about 3 hours (6,500,000+ blocks), so be patient.

You should stop the Docker container with -t 300, because the default timeout of 10 seconds is not enough for the eos container to shut down, and that causes the dirty flag. You can review some comments in https://github.com/EOSIO/eos/issues/4462

BTW: On the second docker start, be careful to start WITHOUT "--replay-blockchain --hard-replay-blockchain".

BTW2: Updating to eos 1.1.0 will speed up sync.
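
A minimal sketch of the graceful stop described above. The container name `nodeos` is an assumption; substitute whatever `docker ps` shows for your setup. The `-t` value is the number of seconds Docker waits after the stop signal before it force-kills the container.

```shell
# Stop a nodeos container, giving it time to flush state before Docker kills it.
# The default container name "nodeos" is hypothetical; pass your own as $1.
stop_nodeos_container() {
    container="${1:-nodeos}"   # assumed container name
    timeout="${2:-300}"        # seconds to wait before Docker force-kills
    docker stop -t "$timeout" "$container"
}

# Usage:
#   stop_nodeos_container            # docker stop -t 300 nodeos
#   stop_nodeos_container my-eos 600 # longer grace period for a larger state
```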

ridewindx commented 6 years ago

Replaying with the "--replay-blockchain" or "--hard-replay-blockchain" option takes too long.

Should we add an option that lets nodeos back up shared_memory.bin periodically, for example once per hour? That way, only the state added since the last backup would need to be synced, which would take a short time.
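
Until such an option exists, a manual backup can be sketched as below. This is not a nodeos feature, just a plain copy of the state directory. It assumes nodeos has been stopped cleanly first, since copying shared_memory.bin while it is being written may capture an inconsistent file. The paths are placeholders, and a usable backup would likely need the blocks directory from the same moment as well.

```shell
# Copy the nodeos state directory into a timestamped backup directory.
# Assumes nodeos is NOT running; paths passed in are placeholders.
backup_state() {
    data_dir="$1"      # nodeos data directory containing state/
    backup_root="$2"   # directory that collects the backups
    stamp="$(date +%Y%m%d-%H%M%S)"
    dest="$backup_root/state-$stamp"
    if [ ! -f "$data_dir/state/shared_memory.bin" ]; then
        echo "no state/shared_memory.bin under $data_dir" >&2
        return 1
    fi
    # Print the backup location on success so callers can record it.
    mkdir -p "$backup_root" && cp -a "$data_dir/state" "$dest" && echo "$dest"
}

# Usage: backup_state /path/to/data-dir /path/to/backups
```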

jgiszczak commented 6 years ago

Replaying takes considerably less time with v1.1. You can expect 1 million blocks in 20 minutes, for a total sync time of barely an hour to mainnet.

bitrocks commented 6 years ago

Same problem. My machine has around 40 GB of memory, and this is the synced data:

> du -sh *
4.0K    README.md
12K     Wallet
5.8G    blocks
4.0K    cleos.sh
8.0K    config.ini
4.0K    genesis.json
4.0K    nodeos.pid
36K     scripts
4.0K    start.sh
32G     state
4.0K    stderr.txt
0       stdout.txt
4.0K    stop.sh
------
sum:    38G

I have the history plugin and filter-on = * set in my configuration, but the usual memory usage is about 400 MB. So before replaying, I want to check: does this error mean I need to scale up my machine?

aelbuni commented 6 years ago

@ridewindx I totally agree with adding an automatic backup option. --replay-blockchain and --hard-replay-blockchain definitely take hours to sync 1 million blocks on my test server (an 8 GB DigitalOcean Droplet). We should speed up resyncing, at least for constrained analytics nodes that may face unclean exits.

vimalbera92 commented 6 years ago

Agreed. There should be some mechanism to restore the blockchain state from the point where the node crashed. Or there could be a regular backup cycle that does this job every 1000 blocks or so.

Just giving my 2 cents.

wsdfz commented 6 years ago

Agreed. nodeos on my server keeps stopping and leaving the database dirty. Each replay takes many hours.

csquan commented 6 years ago

--hard-replay or --replay-blockchain does not work. How can I solve this?

csquan commented 6 years ago

After many failed attempts, I do not know what to do.

csquan commented 6 years ago

Can anyone answer this?

csquan commented 6 years ago

With the --replay option, it deletes the state; with --hard-replay, it reconstructs. But neither of these helps. Is this a bug?

csquan commented 6 years ago

none?

jgiszczak commented 6 years ago

@csquan What are the symptoms of the failure? Please paste the error logs.

aelbuni commented 6 years ago

@csquan Make sure your configured state DB size can accommodate the entire blockchain.

In the config.ini file you pass to nodeos, increase the size of your state DB (in MB) to a sufficient value:

For instance: chain-state-db-size-mb = 10490 (~=10 GB) # This is not sufficient for the current size of the blockchain
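
A small sketch for checking that setting before a long replay. It only reads `chain-state-db-size-mb` from a config.ini and compares it with a size you supply. The function names are made up for illustration, and the "needed" value is your own estimate, not something nodeos reports.

```shell
# Print the configured chain-state-db-size-mb (last uncommented assignment).
configured_state_mb() {
    sed -n 's/^[[:space:]]*chain-state-db-size-mb[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Compare the configured size with a caller-supplied requirement in MB.
check_state_mb() {
    config="$1"
    needed_mb="$2"   # your own estimate of the required state size
    have_mb="$(configured_state_mb "$config")"
    if [ -z "$have_mb" ]; then
        echo "chain-state-db-size-mb is not set in $config" >&2
        return 2
    fi
    if [ "$have_mb" -lt "$needed_mb" ]; then
        echo "too small: $have_mb MB configured, $needed_mb MB needed"
        return 1
    fi
    echo "ok: $have_mb MB configured"
}

# Usage: check_state_mb /path/to/config.ini 65536
```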

csquan commented 6 years ago

@aelbuni chain-state-db-size-mb in my config.ini is now set to about 64 GB. Previously it was set to 8 GB, and the state grew above 8 GB, so nodeos aborted, which was the unclean shutdown. After that I changed it to 64 GB.

csquan commented 6 years ago

When I add the --replay options, it just gets killed.

csquan commented 6 years ago

Then, with no options, the log shows "existing block log, attempting to replay 375171 blocks", and after several hours it just gets killed.

mawenpeng commented 6 years ago

This happens now and again. It interrupts nodeos, which in turn interrupts DApps. Simply replaying the blockchain is not a good option, since it takes a long time. We need a solution that makes nodeos robust.

wsdfz commented 6 years ago

My server ran out of memory (OOM) again. It worked fine for several days. After checking the memory logs, it appears that free memory dropped from 190M to 50M within 15 minutes (memory is only logged every 15 minutes, hence the interval). It looks like some code path eats up all the memory in a short time.

[1002343.852391] lowmem_reserve[]: 0 0 13036 13036
[1002343.852394] Node 0 Normal free:55248kB min:55472kB low:69340kB high:83208kB active_anon:13033408kB inactive_anon:228kB active_file:1780kB inactive_file:6696kB unevictable:0kB isolated(anon):0kB isolated(file):148kB present:13631488kB managed:13352544kB mlocked:0kB dirty:0kB writeback:196kB mapped:236kB shmem:468kB slab_reclaimable:75780kB slab_unreclaimable:32784kB kernel_stack:1664kB pagetables:35044kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:13946 all_unreclaimable? yes
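
To confirm whether nodeos was killed by the kernel OOM killer rather than crashing on its own, the kernel log can be filtered for the relevant messages. This is a rough sketch: the exact wording of these messages varies between kernel versions, so the pattern below is an assumption that covers the common forms.

```shell
# Keep only kernel log lines that look like OOM-killer activity.
# Reads log text on stdin; the pattern is a best-effort guess at common wording.
oom_events() {
    grep -iE 'out of memory|oom-killer|killed process'
}

# Usage:
#   dmesg | oom_events
#   journalctl -k | oom_events   # on systemd-based systems
```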

mawenpeng commented 6 years ago

Most of the time, the replay fails after hours, and I have to clean up the data directory and restart from scratch.

brandedux commented 6 years ago

Albeit a truly unacceptable solution: --delete-all-blocks. It's fast, haha.

Before replaying and/or deleting, make sure you check your ports and kill all related processes. I have found this usually enables me to restart nodeos.
Find the associated nodeos PID:
lsof -i
Replace PID with the PID number associated with nodeos:
kill PID
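
Since an abrupt kill is itself a likely cause of the dirty flag, a gentler variant is to send SIGTERM and wait for the process to exit. The sketch below assumes nodeos shuts down cleanly on SIGTERM when given enough time; the 300 second default simply mirrors the docker -t 300 advice earlier in this thread.

```shell
# Send SIGTERM to a process and wait until it exits or the timeout passes.
stop_gracefully() {
    pid="$1"
    timeout="${2:-300}"   # seconds to wait before giving up
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "pid $pid still running after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Usage: stop_gracefully "$(pgrep -x nodeos)"
```

A non-zero return means the process did not exit in time; at that point a forced kill will probably require a replay afterwards.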

beykery commented 6 years ago

@vimalbera92 Agreed.

oslivan commented 6 years ago

After 11 hours, the replay has still not finished.

halsaphi commented 5 years ago

Closed, as this refers to an old version of the code. Please refer to the latest code and documentation: https://github.com/EOSIO https://developers.eos.io/ If the problem persists with the latest version, please raise a new issue.

Yucheng123 commented 5 years ago

After 3 days, the hard replay has still not finished.

oslivan commented 5 years ago

You can see which block is currently being replayed in the log. It took me 4 to 5 days to replay it.

Yucheng123 commented 5 years ago

Since version 1.4, eos no longer needs a replay; you can use a snapshot instead. See https://github.com/EOSIO/eos/pull/5956
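
A rough sketch of starting from a snapshot. It assumes the `--snapshot` option of nodeos 1.4+, that snapshot files end in `.bin`, and that they sit in a directory you pass in (a placeholder here). The helper names are made up, and the function only prints the command so you can review it first. As far as I know, nodeos expects no existing state directory when loading a snapshot, so treat that as an assumption to verify.

```shell
# Print the most recently modified snapshot file in a directory.
latest_snapshot() {
    ls -t "$1"/*.bin 2>/dev/null | head -n 1
}

# Print (not run) a nodeos command that starts from the newest snapshot.
start_from_snapshot_cmd() {
    snap="$(latest_snapshot "$1")"
    if [ -z "$snap" ]; then
        echo "no snapshot found in $1" >&2
        return 1
    fi
    echo "nodeos --snapshot $snap"
}

# Usage: start_from_snapshot_cmd /path/to/snapshots
```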

matthewdarwin commented 5 years ago

If you want the full history, then you still need to replay.

matthewdarwin commented 5 years ago

I recommend taking regular (daily) backups in case the state gets corrupted.

mhinc14 commented 5 years ago

For anyone following the tutorial on developers.eos.io, the above solution worked for me. I ran the following:

`nodeos -e -p eosio --plugin eosio::producer_plugin --plugin eosio::chain_api_plugin --plugin eosio::http_plugin --plugin eosio::history_plugin --plugin eosio::history_api_plugin --access-control-allow-origin='*' --contracts-console --http-validate-host=false --verbose-http-errors --replay-blockchain --hard-replay-blockchain >> nodeos.log 2>&1 &`

mosjin commented 5 years ago

@mhinc14 Following your idea, I used the command below, and it works. My box runs CentOS 7.6 with recent eos code from GitHub.

nodeos -e -p eosio --plugin eosio::producer_plugin --plugin eosio::chain_api_plugin --plugin eosio::http_plugin --plugin eosio::history_plugin --plugin eosio::history_api_plugin --access-control-allow-origin='*' --contracts-console --http-validate-host=false

rahulbhangre commented 5 years ago

I want to start the whole process over. How can I delete everything and start again?

matthewdarwin commented 5 years ago

Add "--delete-all-blocks" to your command.