DB::flush_count overflow

erasmospunk commented 7 years ago

I hit a case where DB::flush_count went over 65535 and a full database resync was needed.

Normally, flush_count would take about 455 days to overflow on Bitcoin when indexing a fully synced node. That does not hold if the node is still syncing and ElectrumX manages to catch up with it: ElectrumX then flushes to disk after every block and quickly exhausts the 16-bit space of the flush_count variable.

Also, several coins have shorter block intervals and so hit the limit in as little as a couple of months.
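
The 455-day figure follows from one flush per 10-minute block against a 16-bit counter. A minimal sketch of that arithmetic is below; the function name and the sample block intervals are illustrative only and are not taken from ElectrumX.

```python
FLUSH_COUNT_LIMIT = 65_535  # largest value an unsigned 16-bit counter can hold
SECONDS_PER_DAY = 86_400


def days_until_overflow(block_interval_seconds: float, blocks_per_flush: int = 1) -> float:
    """Days until the counter is exhausted, flushing once every `blocks_per_flush` blocks."""
    seconds = FLUSH_COUNT_LIMIT * blocks_per_flush * block_interval_seconds
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical block intervals, in seconds, for comparison.
    for label, interval in [("10-minute blocks", 600), ("2.5-minute blocks", 150), ("1-minute blocks", 60)]:
        print(f"{label}: about {days_until_overflow(interval):.0f} days")
```

When ElectrumX is catching up and flushing after every block, the wall-clock block interval no longer applies, so the counter can run out far sooner than this estimate.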

schildbach commented 4 years ago

I'm now hit by this issue too (ElectrumX 1.14.0 running inside Docker). Is there now an easy way to fix this?

Answering my own question: I ran docker-compose run electrumx /bin/sh, then executed /electrumx/electrumx_compact_history inside the container. That appears to have fixed the database problem.

pandaatrail commented 4 years ago

@schildbach as others have said earlier in this thread, you need to schedule this to run "often".

How often depends on the network you are using (testnet, mainnet...).

In our enterprise setup, testnet needs a history compaction every 2 weeks, but we run it every week to stay consistent with mainnet, which needs compacting every week to keep running without a crash.

This interval has already changed and may change again. The bigger the blockchain, the more often you need to run it? That's what was said earlier in this thread. Let's hope we won't have to do this every day :p

Note that you can directly use docker-compose exec electrumx electrumx_compact_history without the need to enter the service manually. This will help with scripting and cron.

benma commented 4 years ago

Afaik you have to stop the server before compacting. It's not nice to have to schedule downtime regularly, and compacting can take quite a long time.

Imho this should really be fixed, as it is a hassle to deal with and many people run into this by surprise.
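
A minimal sketch of the stop, compact, restart cycle described in this thread, suitable for calling from cron. It assumes a docker-compose service named electrumx and that the image lets you run electrumx_compact_history as the command; both come from the comments here and may not match your setup. The function names are hypothetical.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

SERVICE = "electrumx"  # compose service name used in this thread; adjust as needed


def compaction_commands(service: str = SERVICE) -> List[List[str]]:
    """Return the three commands: stop the server, compact in a one-off container, start again."""
    return [
        ["docker-compose", "stop", service],
        ["docker-compose", "run", "--rm", service, "electrumx_compact_history"],
        ["docker-compose", "start", service],
    ]


def run_compaction(runner: Optional[Callable[[Sequence[str]], object]] = None,
                   service: str = SERVICE) -> None:
    """Run the cycle; the server is started again even if compaction fails."""
    run = runner or (lambda argv: subprocess.run(argv, check=True))
    stop, compact, start = compaction_commands(service)
    run(stop)
    try:
        run(compact)
    finally:
        run(start)


if __name__ == "__main__":
    run_compaction()
```

The server is down for the whole compaction, so this does not remove the scheduled downtime; it only makes it scriptable.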

nyetwurk commented 4 years ago

Why?

Because if you have to do anything inside the container while the process is stopped, you have to do it in a side car launched with the exact same parameters, env, and volumes.

If you stop PID 1 in a docker container, the container dies.

Yet another reason I really dislike docker.

benma commented 4 years ago

@nyetwurk just use the same docker run ... but with a different entrypoint or cmd. You have to supply the same env/params with or without docker, so really there is no difference.

nyetwurk commented 4 years ago

> @nyetwurk just use the same docker run ... but with a different entrypoint or cmd. You have to supply the same env/params with or without docker, so really there is no difference.

For a complex container started with a deployment tool like ansible, that becomes a total mess. Docker still inexplicably lacks a reliable way to convert docker inspect output into a docker run command line, though there are third-party hack tools that do a passable job on occasion.

fujicoin commented 4 years ago

If you want to get rid of regular database compaction tasks, use the electrs server. Electrs implements automatic database compaction. I have been running it for a month and it has been stable.

OracolXor commented 4 years ago

Hello,

I am still doing manual compaction. Is there a tutorial on how to implement the change? Or can you help?

Thanks

Adrian

fujicoin commented 4 years ago

@OracolXor

Electrs github: https://github.com/romanz/electrs https://github.com/romanz/electrs/blob/master/doc/usage.md

If you use the Docker-based installation, it compiles reliably, though it is a little troublesome. A manual installation may fail to compile depending on the OS version; Ubuntu 18.04 compiles without problems. If that doesn't work, you can copy the binaries compiled with Docker out of the container.

vlddm commented 4 years ago

History compaction failed with the same error I got from the electrumx daemon, using 1.15.0. Is there any way to fix this without syncing from scratch?

INFO:root:Starting history compaction...
INFO:electrumx.server.db.DB:switching current directory to /data
INFO:electrumx.server.db.DB:using leveldb for DB backend
INFO:electrumx.server.db.DB:opened UTXO DB (for sync: True)
INFO:electrumx.server.db.DB:UTXO DB version: 8
INFO:electrumx.server.db.DB:coin: BitcoinSegwit
INFO:electrumx.server.db.DB:network: mainnet
INFO:electrumx.server.db.DB:height: 648,180
INFO:electrumx.server.db.DB:tip: 0000000000000000000821bfdee3af8ac2ba17183e7b0d7f39c527b43477e8c3
INFO:electrumx.server.db.DB:tx count: 568,016,657
INFO:electrumx.server.db.DB:flushing DB cache at 1,200 MB
INFO:electrumx.server.history.History:history DB version: 1
INFO:electrumx.server.history.History:flush count: 65,535
Traceback (most recent call last):
  File "/home/electrumx/.local/bin/electrumx_compact_history", line 73, in main
    loop.run_until_complete(compact_history())
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/home/electrumx/.local/bin/electrumx_compact_history", line 51, in compact_history
    await db.open_for_compacting()
  File "/home/electrumx/.local/lib/python3.7/site-packages/electrumx/server/db.py", line 150, in open_for_compacting
    await self._open_dbs(True, True)
  File "/home/electrumx/.local/lib/python3.7/site-packages/electrumx/server/db.py", line 147, in _open_dbs
    await self._read_tx_counts()
  File "/home/electrumx/.local/lib/python3.7/site-packages/electrumx/server/db.py", line 115, in _read_tx_counts
    assert len(tx_counts) == size
AssertionError
CRITICAL:root:History compaction terminated abnormally

SomberNight commented 4 years ago

@vlddm I think that AssertionError suggests that your DB got corrupted. Unfortunately you probably need to sync from scratch.

sancoder commented 4 years ago

I'm running electrumx in Docker, and after upgrading to 1.15 I ran into this bug.

What I did to work around it:

1. While the container was running, opened a shell in it: sudo docker exec -it <container-id> sh (run sudo docker ps to get the container id).
2. Edited the /electrumx/electrumx_server file: vi /electrumx/electrumx_server
3. Added 3 lines of code just before the line if __name__ == ...

    import os, time
    # wait here for as long as the marker file exists
    while os.path.exists('/data/stop4service'):
        time.sleep(10)

4. Stopped the container.
5. Created a file named 'stop4service' on the host, in the folder mapped to the container's /data.
6. Started the container.
7. Opened a shell in the container again: sudo docker exec -it <container-id> sh
8. With the container running but the server doing nothing, ran the compaction: python3 /electrumx/electrumx_compact_history
9. Once compaction finished, removed the 'stop4service' file, and the server resumed.

If we're gonna live with this compact_history stuff, it'd be great to have such a service pause merged into the electrumx_server script.
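
A minimal sketch of how such a pause could be packaged as a helper rather than three inline lines. The function name, the injectable sleep parameter, and the returned poll count are hypothetical additions made for testability; only the marker path and the 10-second poll come from the steps above.

```python
import os
import time
from typing import Callable

PAUSE_FILE = "/data/stop4service"  # same marker file as in the workaround


def wait_while_paused(path: str = PAUSE_FILE,
                      poll_seconds: float = 10.0,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Block for as long as `path` exists; return how many polls found it present."""
    polls = 0
    while os.path.exists(path):
        polls += 1
        sleep(poll_seconds)
    return polls


if __name__ == "__main__":
    waited = wait_while_paused()
    print(f"pause file absent after {waited} poll(s); the server could start here")
```

Calling this before the server's main entry point keeps the container's PID 1 alive while the databases stay closed, which is what makes running the compaction in the same container possible.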

SomberNight commented 3 years ago

I am considering making some changes to the db schema, and as part of that have been looking at this issue.

Currently the history db schema is:

# Key: address_hashX + flush_id
# Value: sorted "list" of tx_nums in history of hashX

I'm thinking it could be changed to:

# Key: address_hashX + tx_num
# Value: <null>

This would completely do away with flush_id and compaction; although I am not sure how it would affect performance. I quite like it that items would be constant size.
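
A minimal sketch of how lookups could work under the proposed key layout, using a sorted in-memory list as a stand-in for an ordered key-value store. The 11-byte hashX and 5-byte tx_num lengths are assumptions chosen to match the 16 bytes per tx figure quoted in this comment, and the function names are hypothetical; this is not ElectrumX code.

```python
from bisect import bisect_left
from typing import List

HASHX_LEN = 11  # assumed length of address_hashX in bytes
TX_NUM_LEN = 5  # assumed length of an encoded tx_num in bytes


def history_key(hashx: bytes, tx_num: int) -> bytes:
    """Key = hashX + big-endian tx_num, so byte order matches numeric order."""
    if len(hashx) != HASHX_LEN:
        raise ValueError("unexpected hashX length")
    return hashx + tx_num.to_bytes(TX_NUM_LEN, "big")


def history_for(sorted_keys: List[bytes], hashx: bytes) -> List[int]:
    """Prefix scan: collect the tx_nums of every key that starts with `hashx`."""
    tx_nums = []
    for key in sorted_keys[bisect_left(sorted_keys, hashx):]:
        if not key.startswith(hashx):
            break
        tx_nums.append(int.from_bytes(key[HASHX_LEN:], "big"))
    return tx_nums


if __name__ == "__main__":
    a, b = b"a" * HASHX_LEN, b"b" * HASHX_LEN
    keys = sorted([history_key(a, 300), history_key(b, 7), history_key(a, 2)])
    print(history_for(keys, a))
```

Because every write is a new key with an empty value, no per-flush identifier is needed and there is nothing to merge later, which is why compaction would go away.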

Re db size on disk, I guess it would increase somewhat; it's hard to tell by how much, though, as it depends on address reuse and whether the db is compacted:

worst case for current schema is if all tx_nums are in separate items (without compaction): 18 bytes per tx
"average" (guess) case for current schema - two tx_nums for each address (fund-then-spend): 11.5 bytes per tx
best hypothetical case for current schema is if all txs in the chain touch the same address, compacted: 5 bytes per tx
with new schema, there would be no variance: 16 bytes per tx
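
The per-tx figures above can be reproduced with a short calculation. This is a sketch under assumed field sizes (11-byte hashX, 2-byte flush_id, 5-byte tx_num), which are inferred from the quoted numbers rather than stated in the thread; the function names are illustrative.

```python
HASHX_LEN = 11    # assumed bytes in address_hashX
FLUSH_ID_LEN = 2  # assumed bytes in flush_id (16-bit counter)
TX_NUM_LEN = 5    # assumed bytes per tx_num


def current_bytes_per_tx(tx_nums_per_item: int) -> float:
    """Current schema: one key (hashX + flush_id) shared by all tx_nums in the item."""
    key_overhead = (HASHX_LEN + FLUSH_ID_LEN) / tx_nums_per_item
    return key_overhead + TX_NUM_LEN


def proposed_bytes_per_tx() -> int:
    """Proposed schema: every tx gets its own key (hashX + tx_num) and an empty value."""
    return HASHX_LEN + TX_NUM_LEN


if __name__ == "__main__":
    print("current, 1 tx_num per item: ", current_bytes_per_tx(1))
    print("current, 2 tx_nums per item:", current_bytes_per_tx(2))
    print("current, 1,000,000 per item:", round(current_bytes_per_tx(1_000_000), 3))
    print("proposed:                   ", proposed_bytes_per_tx())
```

These counts ignore any overhead or compression inside the storage engine, so actual on-disk sizes will differ.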

What do you think @kyuupichan ?

To be clear, this is about the history db, so for current reference size (depends on when last compaction was though!), see:

$ du -h .
18G     ./meta
4.4G    ./utxo
33G     ./hist
55G     .

kyuupichan commented 3 years ago

Seems worth trying, let me know how it goes.

oven8Mitts commented 3 years ago

> I get this problem when I run the script like this (environment variables are exported from electrumx.conf): python3 compact_history.py
>
> Error:
>
> Traceback (most recent call last):
>   File "compact_history.py", line 33, in <module>
>     from server.env import Env
> ModuleNotFoundError: No module named 'server'
>
> The script contains both of these lines: from server.env import Env and from server.db import DB
>
> What am I doing wrong?

I also get this error, not sure what to do.

I'm running on Debian with systemd and LevelDB, piping in the environment with this command:

export $(cat /etc/electrumx.conf | xargs) && python3.7 /var/lib/electrumx/elect_compact.py

I also tried something similar in systemd with environment files, which resulted in the same error.

It seems something is missing to define server.env, or must I change server.env to something that matches my configuration?

The electrumx server otherwise operates normally, apart from the database compaction issues.

oven8Mitts commented 3 years ago

@useramuser

I was able to resolve this issue with the script mentioned in this post: https://github.com/spesmilo/electrumx/issues/88#issuecomment-752900904

It seems that the script in use above has since changed and is no longer applicable to our environment.

ronaldstoner commented 3 years ago

I encountered this same struct 'H' format 65536 bug with a testnet BitcoinSegwit setup. The compact_history script completed, but I saw the same error when restarting the service. EDIT: Running it a second time seems to have fixed all errors, and the binary is running.
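
For readers unfamiliar with the struct 'H' error: Python's struct module refuses to pack any value above 65535 into an unsigned 16-bit field. A minimal sketch follows; the big-endian byte order and the helper name are assumptions made for illustration, not confirmed details of ElectrumX.

```python
import struct

# Packer for one unsigned 16-bit integer; big-endian is an assumption here.
pack_uint16 = struct.Struct(">H").pack


def fits_in_uint16(value: int) -> bool:
    """Return True if `value` can be packed with the 'H' format."""
    try:
        pack_uint16(value)
    except struct.error:
        return False
    return True


if __name__ == "__main__":
    for count in (65_535, 65_536):
        print(count, "packs" if fits_in_uint16(count) else "raises struct.error")
```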

nyetwurk commented 3 years ago

May 15, 2017

nyetwurk commented 3 years ago

One possible terrible hack (since it is pretty clear this isn't going to be fixed): run compact on startup if the db exists. That way, if the server crashes, the compact script simply runs when it is started again.

nyetwurk commented 3 years ago

Again, why not build compaction into electrumx itself? So at least it can be done regularly (and possibly even automatically on startup) so it can continue to be run in docker.

kyuupichan commented 3 years ago

Do you have a patch?

nyetwurk commented 3 years ago

> Do you have a patch?

I would if https://github.com/spesmilo/electrumx/issues/88#issuecomment-752900904 worked

And you wouldn't like it much, because it would likely be a hack to the docker container that just runs it unconditionally on startup via a bash script.

kyuupichan / electrumx

DB::flush_count overflow #185