kyuupichan / electrumx

Alternative implementation of spesmilo/electrum-server
725 stars 733 forks

DB::flush_count overflow #185

Open erasmospunk opened 7 years ago

erasmospunk commented 7 years ago

I got a case where the DB::flush_count went over 65535 and a full database resync was needed.

Normally the flush_count would overflow after about 455 days in Bitcoin if we were indexing a fully synced node. However, this is not the case if the node is not fully synced and ElectrumX manages to catch up with the full node. In that scenario ElectrumX will quickly exhaust the 16-bit space of the flush_count variable, as it flushes to disk after each block.

Several coins also have shorter block intervals, and thus hit the limit in as little as a couple of months.
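The failure mode can be reproduced in isolation. As the tracebacks later in this thread show, the flush counter is serialized with Python's struct format `'>H'` (big-endian unsigned 16-bit), which only accepts values 0 through 65,535:

```python
import struct

def pack_flush_id(flush_count: int) -> bytes:
    # Same packing as in ElectrumX's flush_history: big-endian
    # unsigned 16-bit ('>H'), valid only for 0..65535.
    return struct.pack('>H', flush_count)

print(pack_flush_id(65535))   # b'\xff\xff' -- the last valid flush id

try:
    pack_flush_id(65536)      # one flush too many
except struct.error as exc:
    print(exc)                # e.g. 'H' format requires 0 <= number <= 65535
```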

kyuupichan commented 7 years ago

I was surprised it took this long for a complaint. I did some work on this over a month ago but it is incomplete; I will pick it up again soon.

petrkr commented 7 years ago

Is this the same problem?

May 27 17:07:49 bohol electrumx_server.py[1755]: ERROR:Controller:Traceback (most recent call last):
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/controller.py", line 157, in on_future_done
May 27 17:07:49 bohol electrumx_server.py[1755]: future.result()
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/block_processor.py", line 201, in main_loop
May 27 17:07:49 bohol electrumx_server.py[1755]: await task()
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/block_processor.py", line 216, in first_caught_up
May 27 17:07:49 bohol electrumx_server.py[1755]: await self.controller.run_in_executor(self.flush, True)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/controller.py", line 137, in run_in_executor
May 27 17:07:49 bohol electrumx_server.py[1755]: return await self.loop.run_in_executor(None, func, *args)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 55, in run
May 27 17:07:49 bohol electrumx_server.py[1755]: result = self.fn(*self.args, **self.kwargs)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/block_processor.py", line 366, in flush
May 27 17:07:49 bohol electrumx_server.py[1755]: self.flush_history(self.history)
May 27 17:07:49 bohol electrumx_server.py[1755]: File "/usr/local/lib/python3.6/site-packages/electrumx-1.0.10-py3.6.egg/server/db.py", line 437, in flush_history
May 27 17:07:49 bohol electrumx_server.py[1755]: flush_id = pack('>H', self.flush_count)
May 27 17:07:49 bohol electrumx_server.py[1755]: struct.error: 'H' format requires 0 <= number <= 65535
bauerj commented 7 years ago

Yes.

uruk-hai-21 commented 7 years ago

I'm not a programmer. In practical terms, what is the problem? Is there a risk of losing the wallet?

kyuupichan commented 7 years ago

ElectrumX is not a wallet. It's a public blockchain index. Your question doesn't make a lot of sense.

uruk-hai-21 commented 7 years ago

I'm sorry, I made a mistake! I confused ElectrumX with the "Electrum Bitcoin Wallet" (electrum.org).

kyuupichan commented 7 years ago

@erasmospunk - can you try the compact-hist script from 2f26e81? Once I have a couple of reports that it works, I'll have it run in the background of a running server.

erasmospunk commented 7 years ago

@kyuupichan I didn't keep the databases that had this issue, but I ran it on Dash, which is near the limit:

INFO:root:Starting history compaction...
INFO:DB:switching current directory to /path/to/dash/electrumxdb
INFO:DB:using rocksdb for DB backend
INFO:DB:opened DB for serving
INFO:DB:software version: ElectrumX 1.0.11
INFO:DB:DB version: 5
INFO:DB:coin: Dash
INFO:DB:network: mainnet
INFO:DB:height: 691,997
INFO:DB:tip: 000000000000128ef39c8088a08b53674c1fe26750a85ff64a8e51ef643590a1
INFO:DB:tx count: 3,032,454
INFO:DB:flush count: 63,405
INFO:DB:deleted 1,547 stale undo entries
INFO:DB:reorg limit is 200 blocks
INFO:DB:history compaction: wrote 568,616 rows (8.0 MB), removed 1,172,466 rows, largest: 4, 8.1% complete
INFO:DB:history compaction: wrote 584,118 rows (8.0 MB), removed 1,198,997 rows, largest: 4, 16.5% complete
INFO:DB:hashX 06dcc60fe4d28abe8abd33 is large: 97,931 entries across 8 rows
INFO:DB:history compaction: wrote 523,761 rows (8.0 MB), removed 1,093,185 rows, largest: 8, 23.9% complete
INFO:DB:history compaction: wrote 574,894 rows (8.0 MB), removed 1,176,844 rows, largest: 8, 32.1% complete
INFO:DB:hashX 988d8582fa6f328c94e95a is large: 63,793 entries across 6 rows
INFO:DB:hashX 5987058f7f58d339d76e5c is large: 97,607 entries across 8 rows
INFO:DB:history compaction: wrote 518,220 rows (8.0 MB), removed 1,079,053 rows, largest: 8, 39.5% complete
INFO:DB:history compaction: wrote 574,615 rows (8.0 MB), removed 1,180,103 rows, largest: 8, 47.7% complete
INFO:DB:history compaction: wrote 582,429 rows (8.0 MB), removed 1,188,457 rows, largest: 8, 56.0% complete
INFO:DB:history compaction: wrote 571,884 rows (8.0 MB), removed 1,172,848 rows, largest: 8, 64.2% complete
INFO:DB:hashX 527250e83e6e3a7452f5a6 is large: 51,791 entries across 5 rows
INFO:DB:hashX 6705914577688925e4a8b0 is large: 58,488 entries across 5 rows
INFO:DB:history compaction: wrote 520,474 rows (8.0 MB), removed 1,122,776 rows, largest: 8, 71.6% complete
INFO:DB:history compaction: wrote 574,083 rows (8.0 MB), removed 1,188,666 rows, largest: 8, 79.8% complete
INFO:DB:hashX ad744e1ecb43f49811decc is large: 89,738 entries across 8 rows
INFO:DB:history compaction: wrote 552,658 rows (8.0 MB), removed 1,143,296 rows, largest: 8, 87.7% complete
INFO:DB:history compaction: wrote 552,613 rows (8.0 MB), removed 1,150,632 rows, largest: 8, 95.5% complete
INFO:DB:history compaction: wrote 313,102 rows (4.3 MB), removed 641,720 rows, largest: 8, 100.0% complete
INFO:root:History compaction complete

The DB was 743 MB before compaction and 589 MB after.

I am also running DigiByte, which will hit the limit in about a week.

erasmospunk commented 7 years ago

@kyuupichan an update for Dash and DigiByte, which exceeded the 65,535 flush count: the compaction worked correctly.

kyuupichan commented 7 years ago

Thanks! I'll get something done so it isn't a separate script.

zilveer commented 6 years ago

@kyuupichan how do I run compact-hist? Do I run python3 compact_history.py in the DB directory?

erasmospunk commented 6 years ago

@zilveer yes, but remember to set the same environment variables as when you run electrumx normally.

fujicoin commented 6 years ago

Hi kyuupichan,

Running the compact_history script on a regular basis takes time and effort. Since Fujicoin's block interval is 1 minute, it needs to be executed every 1.5 months. Can you improve the program to run automatically? Or can you handle it in the ElectrumX code?

Thanks, motty

gruve-p commented 6 years ago

Same issue for Groestlcoin. We have to run the compact_history script every 1.5 months (block time is 1 min).

@erasmospunk I heard from Jackie that you are adding Groestlcoin to Coinomi. Since that will need electrumx, it might affect Coinomi as well.

zilveer commented 6 years ago

@kyuupichan is there any automated process for this now?

zilveer commented 6 years ago

@erasmospunk @kyuupichan how do I set the same environment variables as I have for ElectrumX? My environment variables are in /etc/electrumx.conf. I guess I need to run something more than just python3 /db/compact_history.py?

Any help is appreciated. Regards

kyuupichan commented 6 years ago

@zilveer no, you need to run the script manually

luggs-co commented 6 years ago

I just hit this on my Litecoin electrumx. Running compact_history.py resolved the issue.

INFO:root:Starting history compaction...
INFO:DB:switching current directory to /coins/ex-ltc
INFO:DB:using rocksdb for DB backend
INFO:DB:opened DB for serving
INFO:DB:software version: ElectrumX 1.2.1
INFO:DB:DB version: 6
INFO:DB:coin: Litecoin
INFO:DB:network: mainnet
INFO:DB:height: 1,361,447
INFO:DB:tip: 02460c78e710b3c6ab23befaaf655d676ede661cde441791598cb71312f352ca
INFO:DB:tx count: 20,614,241
INFO:DB:flush count: 65,535
INFO:DB:reorg limit is 800 blocks
INFO:DB:history compaction: wrote 482,955 rows (8.0 MB), removed 987,267 rows, largest: 3, 1.9% complete
INFO:DB:hashX a072489792368675f65305 is large: 56,801 entries across 5 rows
INFO:DB:history compaction: wrote 464,907 rows (8.0 MB), removed 932,056 rows, largest: 5, 3.8% complete
INFO:DB:history compaction: wrote 454,804 rows (8.0 MB), removed 937,715 rows, largest: 5, 5.6% complete
INFO:DB:history compaction: wrote 450,735 rows (8.0 MB), removed 929,326 rows, largest: 5, 7.4% complete
INFO:DB:hashX 2e3de1a540181006544315 is large: 70,983 entries across 6 rows
INFO:DB:hashX b1cd2adefcfb5489fe9e17 is large: 72,320 entries across 6 rows
INFO:DB:history compaction: wrote 456,680 rows (8.0 MB), removed 922,512 rows, largest: 6, 9.3% complete
INFO:DB:hashX 70a2ffae2ff6d4aab8811b is large: 53,440 entries across 5 rows
INFO:DB:history compaction: wrote 446,602 rows (8.0 MB), removed 903,667 rows, largest: 6, 11.1% complete
INFO:DB:history compaction: wrote 464,804 rows (8.0 MB), removed 925,790 rows, largest: 6, 12.9% complete
INFO:DB:history compaction: wrote 462,887 rows (8.0 MB), removed 931,290 rows, largest: 6, 14.8% complete
INFO:DB:hashX 14a49acbbfb25d8085e227 is large: 70,388 entries across 6 rows
INFO:DB:history compaction: wrote 473,612 rows (8.0 MB), removed 950,408 rows, largest: 6, 16.7% complete
INFO:DB:hashX 76682bce5f8d911cacf82d is large: 58,591 entries across 5 rows
INFO:DB:history compaction: wrote 471,765 rows (8.0 MB), removed 957,865 rows, largest: 6, 18.6% complete
INFO:DB:hashX 12c70f1c63c858b1600031 is large: 70,580 entries across 6 rows
INFO:DB:history compaction: wrote 465,127 rows (8.0 MB), removed 938,018 rows, largest: 6, 20.4% complete
INFO:DB:history compaction: wrote 471,832 rows (8.0 MB), removed 954,327 rows, largest: 6, 22.3% complete
INFO:DB:hashX 9e47242a86bd98676f0a3b is large: 54,213 entries across 5 rows
INFO:DB:hashX 612aed6e16ec8f7408ec3b is large: 58,731 entries across 5 rows
INFO:DB:hashX ab48cd40dc608fb60aad3c is large: 71,001 entries across 6 rows
INFO:DB:history compaction: wrote 453,687 rows (8.0 MB), removed 927,053 rows, largest: 6, 24.1% complete
INFO:DB:hashX 4345bbb80db07ad8f98041 is large: 96,875 entries across 8 rows
INFO:DB:history compaction: wrote 426,965 rows (8.0 MB), removed 859,309 rows, largest: 8, 25.9% complete
INFO:DB:history compaction: wrote 477,428 rows (8.0 MB), removed 971,735 rows, largest: 8, 27.8% complete
INFO:DB:history compaction: wrote 471,457 rows (8.0 MB), removed 946,487 rows, largest: 8, 29.7% complete
INFO:DB:hashX 70ae02791360ff114e194c is large: 55,089 entries across 5 rows
INFO:DB:hashX d4b2aaab343cda3f54f04d is large: 280,356 entries across 23 rows
INFO:DB:hashX f1d820315a973e4b43554f is large: 108,325 entries across 9 rows
INFO:DB:history compaction: wrote 385,106 rows (8.0 MB), removed 791,184 rows, largest: 23, 31.2% complete
INFO:DB:hashX 327bd393203f605ae97b51 is large: 122,202 entries across 10 rows
INFO:DB:hashX fe90f22034b04e8bf3c552 is large: 166,835 entries across 14 rows
INFO:DB:history compaction: wrote 407,680 rows (8.1 MB), removed 843,299 rows, largest: 23, 32.8% complete
INFO:DB:history compaction: wrote 483,171 rows (8.0 MB), removed 967,339 rows, largest: 23, 34.8% complete
INFO:DB:hashX aba8b9db4ce1cf087b4959 is large: 85,704 entries across 7 rows
INFO:DB:history compaction: wrote 439,544 rows (8.0 MB), removed 886,004 rows, largest: 23, 36.5% complete
INFO:DB:hashX f9ae44fd1e7bda3c656d5e is large: 63,673 entries across 6 rows
INFO:DB:hashX f4ee90d4c3b6669e315261 is large: 70,803 entries across 6 rows
INFO:DB:history compaction: wrote 422,720 rows (8.0 MB), removed 876,016 rows, largest: 23, 38.2% complete
INFO:DB:history compaction: wrote 498,671 rows (8.0 MB), removed 1,007,629 rows, largest: 23, 40.2% complete
INFO:DB:history compaction: wrote 463,089 rows (8.1 MB), removed 930,923 rows, largest: 23, 42.1% complete
INFO:DB:history compaction: wrote 486,407 rows (8.0 MB), removed 988,294 rows, largest: 23, 44.0% complete
INFO:DB:history compaction: wrote 486,905 rows (8.0 MB), removed 972,226 rows, largest: 23, 46.0% complete
INFO:DB:hashX 460e609124a1b7e6a9db76 is large: 98,251 entries across 8 rows
INFO:DB:history compaction: wrote 456,768 rows (8.0 MB), removed 935,789 rows, largest: 23, 47.8% complete
INFO:DB:history compaction: wrote 503,498 rows (8.0 MB), removed 1,011,865 rows, largest: 23, 49.9% complete
INFO:DB:history compaction: wrote 471,366 rows (8.0 MB), removed 952,125 rows, largest: 23, 51.7% complete
INFO:DB:hashX 0b53a2945e4ce710ed9284 is large: 55,680 entries across 5 rows
INFO:DB:hashX 9f3939d9123e4117a44b87 is large: 70,546 entries across 6 rows
INFO:DB:hashX 517d73a86d55f143c28487 is large: 70,879 entries across 6 rows
INFO:DB:history compaction: wrote 428,641 rows (8.0 MB), removed 856,422 rows, largest: 23, 53.5% complete
INFO:DB:history compaction: wrote 488,172 rows (8.0 MB), removed 986,213 rows, largest: 23, 55.4% complete
INFO:DB:history compaction: wrote 481,533 rows (8.0 MB), removed 973,207 rows, largest: 23, 57.4% complete
INFO:DB:history compaction: wrote 486,085 rows (8.0 MB), removed 992,084 rows, largest: 23, 59.3% complete
INFO:DB:history compaction: wrote 469,129 rows (8.0 MB), removed 944,652 rows, largest: 23, 61.2% complete
INFO:DB:history compaction: wrote 493,875 rows (8.0 MB), removed 1,004,869 rows, largest: 23, 63.2% complete
INFO:DB:history compaction: wrote 482,323 rows (8.0 MB), removed 979,647 rows, largest: 23, 65.1% complete
INFO:DB:history compaction: wrote 479,415 rows (8.0 MB), removed 972,942 rows, largest: 23, 67.0% complete
INFO:DB:history compaction: wrote 466,582 rows (8.0 MB), removed 928,960 rows, largest: 23, 68.9% complete
INFO:DB:history compaction: wrote 500,128 rows (8.0 MB), removed 1,007,263 rows, largest: 23, 70.9% complete
INFO:DB:history compaction: wrote 497,630 rows (8.0 MB), removed 999,766 rows, largest: 23, 72.9% complete
INFO:DB:history compaction: wrote 449,140 rows (8.0 MB), removed 910,120 rows, largest: 23, 74.7% complete
INFO:DB:history compaction: wrote 472,171 rows (8.0 MB), removed 952,386 rows, largest: 23, 76.6% complete
INFO:DB:hashX fdb2e553d5ac12c9f011c4 is large: 65,798 entries across 6 rows
INFO:DB:history compaction: wrote 474,040 rows (8.0 MB), removed 964,055 rows, largest: 23, 78.5% complete
INFO:DB:hashX bd32e44c16f1819dbffaca is large: 52,384 entries across 5 rows
INFO:DB:history compaction: wrote 459,667 rows (8.0 MB), removed 932,956 rows, largest: 23, 80.3% complete
INFO:DB:history compaction: wrote 472,210 rows (8.1 MB), removed 947,540 rows, largest: 23, 82.2% complete
INFO:DB:history compaction: wrote 502,901 rows (8.2 MB), removed 997,117 rows, largest: 23, 84.2% complete
INFO:DB:history compaction: wrote 478,412 rows (8.0 MB), removed 965,630 rows, largest: 23, 86.1% complete
INFO:DB:history compaction: wrote 450,143 rows (8.2 MB), removed 947,434 rows, largest: 23, 88.0% complete
INFO:DB:history compaction: wrote 470,232 rows (8.0 MB), removed 943,633 rows, largest: 23, 89.8% complete
INFO:DB:hashX 0af1de3cc97b0bbd0c6de7 is large: 98,155 entries across 8 rows
INFO:DB:history compaction: wrote 447,901 rows (8.0 MB), removed 907,270 rows, largest: 23, 91.6% complete
INFO:DB:hashX f0364777f30a9a9a3488ef is large: 70,782 entries across 6 rows
INFO:DB:history compaction: wrote 480,832 rows (8.2 MB), removed 982,766 rows, largest: 23, 93.6% complete
INFO:DB:hashX 990364d3177668d0d409f3 is large: 71,746 entries across 6 rows
INFO:DB:history compaction: wrote 462,264 rows (8.0 MB), removed 930,576 rows, largest: 23, 95.4% complete
INFO:DB:hashX de9d8433aecb5447d549f4 is large: 61,524 entries across 5 rows
INFO:DB:hashX d61462346b185bcc0f99f4 is large: 70,530 entries across 6 rows
INFO:DB:history compaction: wrote 454,200 rows (8.0 MB), removed 911,703 rows, largest: 23, 97.2% complete
INFO:DB:hashX 73a0cced83973163ac9dfd is large: 53,210 entries across 5 rows
INFO:DB:history compaction: wrote 488,369 rows (8.1 MB), removed 981,747 rows, largest: 23, 99.2% complete
INFO:DB:history compaction: wrote 197,921 rows (3.2 MB), removed 415,361 rows, largest: 23, 100.0% complete
INFO:root:History compaction complete

Is this safe to run while the server is operational, or does the service need to be stopped (which is what I did this time because of the error anyway)?

kyuupichan commented 6 years ago

It has to be stopped as only one process can open the DB at a time.

luggs-co commented 6 years ago

Thanks for the quick reply.

Now the question is how we can streamline this so an admin doesn't have to do maintenance on coins every month or so.

nijel commented 6 years ago

I was seeing the CRITICAL:root:ElectrumX server terminated abnormally error on Litecoin as well, and running compact_history.py resolved the issue for me. If it's unclear how to integrate this into the daemon, it would be great to have at least a mention in the docs...

kyuupichan commented 6 years ago

The answer is to do history differently. I'm working on that, but no ETA.

benma commented 6 years ago

Also ran into this for the first time and was quite confused. Good to hear you are working on this!

FMCorz commented 6 years ago

I encountered the same issue with Litecoin.

ERROR:root:task crashed: <Task finished coro=<Controller.serve() done, defined at /electrumx/electrumx/server/controller.py:79> exception=error("'H' format requires 0 <= number <= 65535")>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/util.py", line 118, in check_task
    task.result()
  File "/electrumx/electrumx/server/controller.py", line 129, in serve
    serve_externally_event.set()
  File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/curio.py", line 241, in __aexit__
    await self.join(wait=self._wait)
  File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/curio.py", line 214, in join
    raise task.exception()
  File "/electrumx/electrumx/server/block_processor.py", line 663, in fetch_and_process_blocks
    await self.flush(True)
  File "/electrumx/electrumx/server/block_processor.py", line 334, in flush
    await self.run_in_thread_with_lock(flush)
  File "/electrumx/electrumx/server/block_processor.py", line 190, in run_in_thread_with_lock
    return await asyncio.shield(run_in_thread_locked())
  File "/electrumx/electrumx/server/block_processor.py", line 189, in run_in_thread_locked
    return await run_in_thread(func, *args)
  File "/usr/local/lib/python3.7/site-packages/aiorpcX-0.8.2-py3.7.egg/aiorpcx/curio.py", line 61, in run_in_thread
    return await get_event_loop().run_in_executor(None, func, *args)
  File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/electrumx/electrumx/server/block_processor.py", line 333, in flush
    self.estimate_txs_remaining)
  File "/electrumx/electrumx/server/db.py", line 200, in flush_dbs
    self.flush_history()
  File "/electrumx/electrumx/server/db.py", line 269, in flush_history
    self.history.flush()
  File "/electrumx/electrumx/server/history.py", line 129, in flush
    flush_id = pack_be_uint16(self.flush_count)
struct.error: 'H' format requires 0 <= number <= 65535
WARNING:Controller:received SIGTERM signal, initiating shutdown

However, I didn't realise this and restarted the server. No errors were displayed when the server was started manually after that; I only realised something was wrong when my client couldn't connect. The logs read:

INFO:Prefetcher:catching up to daemon height 1,506,052 (2,525 blocks behind)
INFO:BlockProcessor:our height: 1,503,537 daemon: 1,506,052 UTXOs 0MB hist 0MB
INFO:BlockProcessor:processed 10 blocks in 0.1s
INFO:BlockProcessor:processed 500 blocks in 2.4s
INFO:BlockProcessor:processed 844 blocks in 2.3s
INFO:BlockProcessor:processed 469 blocks in 1.9s
INFO:BlockProcessor:processed 702 blocks in 2.3s
INFO:BlockProcessor:caught up to height 1506052
INFO:DB:flushed filesystem data in 0.00s
INFO:BlockProcessor:flushing to DB for a clean shutdown...
INFO:DB:flushed filesystem data in 0.00s

I would suggest displaying an error message on startup when the server cannot continue until the database is compacted. The only hint in the logs above was that flushing took 0.00s, but that wasn't helpful in identifying the cause.

Compaction was the solution; all I did was shut down the server and run the following command with the same environment as the server:

./compact_history.py
kyuupichan commented 6 years ago

The exception is being swallowed in the latter case; I will look into it.

joesixpack commented 5 years ago

I'm using rocksdb. How do I fix this to get it working?

[screenshot]

bauerj commented 5 years ago

No, you're using leveldb, as you can see on the screenshot.

If you want to use rocksdb, just set DB_ENGINE to rocksdb.

vetal20600 commented 5 years ago

how to apply same environment variables while running compact_history.py ?

@kyuupichan maybe you can help? my config is in external file: /home/electrumx/myconf.conf I tried to edit ~/.bash_profile but with no success

kyuupichan commented 5 years ago

@vetal20600 this isn't the place for basic Unix sysadmin stuff. To pass environment variables you can just precede them on the command line:

$ A=B command

sets envvar A to B and runs command, for example. Or use envdir from daemontools, as I mention in the docs.

grittibaenz commented 5 years ago

Wouldn't it be possible to run this function automatically instead of throwing an error message?

Manbearpixel commented 5 years ago

@kyuupichan Thanks for pointing me to this pre-existing issue! I was not sure what to search for. What kind of support/help do you think you need to have this automated or caught properly?

In the interim, would you instead suggest some sort of OS-level cron job to shut the server down, compact the history, and start it up again every other week?

kyuupichan commented 5 years ago

Week? This happens once a year at most on sane coins. If your coin has 1 block every 5 seconds then ....

Manbearpixel commented 5 years ago

The blockchain I am currently running this with generates a new block approx. every minute, and each block contains at minimum two transactions (one for generation and one for rewards). I'm not sure how I hit this so soon, but looking back in the comments here, other coins have a similar block time (and estimated compaction requirement), so this seems to be an issue that could be discussed here and a solution determined :)

I am not proficient in Python, but I wouldn't mind attempting to help where possible if you had some context as to what the flush count is about and what history compaction does.

benma commented 5 years ago

I run ltc testnet and have to compact every ~2 weeks.

In any case, the way it fails is sub-optimal. The server still runs, but does not sync, so clients don't even notice something is off.

Would be nice to fix the issue, or at least fail critically with a good error message to point out how to fix it.

The answer is to do history differently. I'm working on that, but no ETA.

Any progress? :)

kyuupichan commented 5 years ago

No. I doubt I will fix this except as part of a major rework; no guarantee on either. Given almost no one contributes anything to the codebase other than new crapcoins, don't hold your breath.

Manbearpixel commented 5 years ago

😅 Yeah, I see that. Like I said, I'd like to help and contribute, but having no experience with Python makes it a little harder to start. I also figure this market doesn't seem lucrative enough to justify investing more of your time.

All that said, why do you think this could only be fixed with a major rework? I mentioned I'd like to help where I can, but I would need some guidance on where to start. The codebase looks nice, but there doesn't seem to be much commentary on what the various methods do or their purpose. While I could spend hours or days looking at things and interpreting what you might be doing in a given scenario, more details to go on would make contributing back easier.

cluelessperson commented 5 years ago

Weird, this is happening to me here on testnet:

Dec 31 00:51:13 bitcoin electrumx_server[12131]: INFO:BlockProcessor:processed 20 blocks in 0.7s
Dec 31 00:51:14 bitcoin electrumx_server[12131]: INFO:BlockProcessor:processed 14 blocks in 0.3s
Dec 31 00:51:16 bitcoin electrumx_server[12131]: INFO:BlockProcessor:our height: 1,381,416 daemon: 1,450,220 UTXOs 11MB hist 346MB
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:DB:flushed filesystem data in 0.69s
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:BlockProcessor:flushing to DB for a clean shutdown...
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:DB:flushed filesystem data in 0.00s
Dec 31 00:51:17 bitcoin electrumx_server[12131]: INFO:SessionManager:closing down RPC listening servers
Dec 31 00:51:17 bitcoin electrumx_server[12131]: ERROR:root:task crashed: <Task finished coro=<Controller.serve() done, defined at /opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/controller.py:81> exception=error("'H' format requires 0 <= number <= 65535")>
Dec 31 00:51:17 bitcoin electrumx_server[12131]: Traceback (most recent call last):
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/util.py", line 118, in check_task
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     task.result()
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/controller.py", line 134, in serve
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     await group.spawn(wait_for_catchup())
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/curio.py", line 231, in __aexit__
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     await self.join()
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/curio.py", line 200, in join
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     raise task.exception()
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 662, in fetch_and_process_blocks
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     await self.flush(True)
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 335, in flush
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     await self.run_in_thread_with_lock(flush)
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 191, in run_in_thread_with_lock
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     return await asyncio.shield(run_in_thread_locked())
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 190, in run_in_thread_locked
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     return await run_in_thread(func, *args)
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/aiorpcx/curio.py", line 61, in run_in_thread
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     return await get_event_loop().run_in_executor(None, func, *args)
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     result = self.fn(*self.args, **self.kwargs)
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/block_processor.py", line 334, in flush
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     self.estimate_txs_remaining)
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/db.py", line 200, in flush_dbs
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     self.flush_history()
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/db.py", line 269, in flush_history
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     self.history.flush()
Dec 31 00:51:17 bitcoin electrumx_server[12131]:   File "/opt/electrumx_venv/lib/python3.7/site-packages/electrumx/server/history.py", line 129, in flush
Dec 31 00:51:17 bitcoin electrumx_server[12131]:     flush_id = pack_be_uint16(self.flush_count)
Dec 31 00:51:17 bitcoin electrumx_server[12131]: struct.error: 'H' format requires 0 <= number <= 65535
kyuupichan commented 5 years ago

That's not weird at all; it happens to any and all coins once they have enough blocks since the prior compaction.

whizz commented 5 years ago

Can confirm the script works as advertised on Bitcoin mainnet. If anyone is interested, this is a one-liner I used to set the proper conf values and run it:

export $(cat /etc/electrumx.conf | xargs) && python3.7 compact_history.py
Manbearpixel commented 5 years ago

@whizz Do you have that script set to run on a cronjob every so often?

whizz commented 5 years ago

Do you have that script set to run on a cronjob every so often?

@Manbearpixel Not really, as this was the first time it happened to me. In Bitcoin, it's bound to happen every 15 months or so, so you could probably put it on a 1-year schedule, but for these kinds of jobs that happen only very rarely, I'd rather set a reminder and run it manually, because a lot can happen in a year.

If you were to set it up in cron, you'd have to stop electrumx first and then start it again afterwards. This depends on your system of choice; in my case it would be something like:

systemctl stop electrumx && \
export $(cat /etc/electrumx.conf | xargs) && \
python3.7 compact_history.py && \
systemctl start electrumx
whizz commented 5 years ago

FYI: Given that #681 was merged, the script is now electrumx_compact_history, instead of compact_history.py. The script is installed alongside the main electrumx_server when you do the install.

cluelessperson commented 5 years ago

Here's how I run it.

su - <user> -s /bin/bash

export DB_DIRECTORY=/mnt/bitcoin/electrumx/testnet
export COIN=BitcoinSegwit
export NET=testnet

/opt/electrumx_venv/bin/python3.7 /opt/electrumx_compact_database.py
Talkless commented 4 years ago

I've just started getting the same error:

Nov 06 21:57:23 odroid-hc1 electrumx[26635]: ERROR:root:task crashed: <Task finished coro=<Controller.serve() done, defined at /opt/python3.6/lib/python3.6/site-packages/electrumX-1.13.0-py3.6.egg/electrumx/server/controller.py:81> exception=error("'H' format requires 0 <= number <= 65535",)>

I'll try that compacting stuff.

nyetwurk commented 4 years ago

This is much harder to do if you are unfortunate enough to be using docker.

bauerj commented 4 years ago

Why?

pandaatrail commented 4 years ago

Hi,

Could I get an exhaustive, yet simple and comprehensible explanation of @erasmospunk's sentence "However this is not the case if the node is not fully synced and ElectrumX manages to catch up the full node"?

I'm new to the blockchain world and need to deploy ElectrumX in a professional environment, so I really need to understand this limitation, which has happened twice on our side, so that I can explain it to the people I work with. If the explanation I give is clear, I'll know I understood correctly.

Here is what I understand:

This last part is not obvious for me, so if you can take some time to explain, I would be very grateful.

Anyway thanks a lot for you work on this project.

Have a good day,

kyuupichan commented 4 years ago

A flush count is stored in keys in the database in 2 bytes, to save space, so when the number of flushes to the DB reaches 2^16 it fails. The compact_history script compacts the history down to a smaller number of flushes.

Whilst it would be possible to increase the 2 bytes to 4, say, the DB really does need to be compacted on a regular basis anyway, so I have not done this.
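The explanation above can be sketched as a toy model (this is not the real layout in electrumx/server/history.py; the 11-byte hashX, the row values, and the merge-everything-into-one-row policy are simplified for illustration). Rows written by many separate flushes get merged and re-keyed with low flush ids, which frees up the 2-byte id space:

```python
import struct

def history_key(hashx: bytes, flush_id: int) -> bytes:
    # Each history row's key ends in a 2-byte big-endian flush id,
    # so at most 2**16 = 65536 distinct flush ids can ever be used.
    return hashx + struct.pack('>H', flush_id)

def compact(rows: dict) -> dict:
    # Toy compaction: merge the small rows a hashX accumulated across
    # many flushes into a single row keyed with flush id 0.
    merged = {}
    for key in sorted(rows):
        hashx = key[:-2]                      # strip the 2-byte flush id
        merged[hashx] = merged.get(hashx, b'') + rows[key]
    return {history_key(hashx, 0): hist for hashx, hist in merged.items()}

# Example: one script hash whose history was written across three flushes.
hashx = b'\x01' * 11
rows = {history_key(hashx, fid): bytes([fid]) for fid in (7, 8, 9)}
assert compact(rows) == {history_key(hashx, 0): b'\x07\x08\x09'}
```

After compaction the highest flush id in use is small again, which is why the real script resets the stored flush count rather than widening the field.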

pandaatrail commented 4 years ago

Thank you very much for your (very fast) answer.

We're thinking about going for 2 instances of ElectrumX server so that one is still available while the other has its history compacted. A load-balancer would be in front of these 2 instances.

This way we could use the script on a regular basis, say every week or so, and the things could go smoothly.

Thanks again.

useramuser commented 4 years ago

I get this problem when I run the script like this (environment variables are exported from electrumx.conf): python3 compact_history.py

Error:

Traceback (most recent call last):
  File "compact_history.py", line 33, in <module>
    from server.env import Env
ModuleNotFoundError: No module named 'server'

In the script there are both of these lines: from server.env import Env and from server.db import DB.

What am I doing wrong?