CityOfZion / neo-scan

Blockchain explorer for NEO
https://neoscan.io
MIT License

synchronization stalls #400

Open lazovskiy opened 5 years ago

lazovskiy commented 5 years ago

Hello. I've run into a synchronization issue that is most likely connected with the Postgres stored procedures.

The flush_blocks_queue function stops updating the blocks_meta table, causing the get_height API call to stall. The blocks_queue table also starts growing continuously, and the blocks table is no longer populated.

If I run SELECT flush_blocks_queue(); manually, I get: NOTICE: performed address_balances_queue flush: 0 updates, 0 prunes, even though blocks_queue has loads of entries.

I'm using neo-scan from the git head.

Any suggestions? Thanks!

adrienmo commented 5 years ago

Hello @lazovskiy, this is probably because one block is missing and flush_blocks_queue cannot perform the flush without that block's data; the database is probably corrupted. You can sort the blocks table by block index and check whether any value is missing there. If one is, you need to erase the db and resync.
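
Checking a few million rows by eye is impractical, so the check above can be sketched as a query. This is a minimal sketch, not neo-scan's own tooling: it assumes the table is named blocks and has an integer column named index (inferred from this thread, not verified against the schema), and it runs against an in-memory SQLite database only so the example is self-contained. The query text uses standard window-function syntax and should carry over to Postgres. Note that it only finds gaps between existing rows, not blocks missing before the first row.

```python
import sqlite3

# LEAD() pairs every block index with the next index that is present.
# Any pair that differs by more than 1 brackets a run of missing blocks.
# ASSUMPTION: table "blocks" with an integer column "index".
GAP_QUERY = """
SELECT "index" + 1 AS first_missing, next_index - 1 AS last_missing
FROM (
    SELECT "index", LEAD("index") OVER (ORDER BY "index") AS next_index
    FROM blocks
) AS t
WHERE next_index - "index" > 1
ORDER BY first_missing
"""


def find_gaps(conn):
    """Return a list of (first_missing, last_missing) index ranges."""
    return [tuple(row) for row in conn.execute(GAP_QUERY).fetchall()]


def demo():
    # Hypothetical data: blocks 3, 6 and 7 are absent.
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE blocks ("index" INTEGER PRIMARY KEY)')
    present = [0, 1, 2, 4, 5, 8]
    conn.executemany(
        'INSERT INTO blocks ("index") VALUES (?)', [(i,) for i in present]
    )
    return find_gaps(conn)


if __name__ == "__main__":
    print(demo())
```

An empty result would suggest the blocks table is contiguous, in which case a missing row in blocks is unlikely to be the cause of the stall.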

lazovskiy commented 5 years ago

Thank you for the reply. I've looked for errors from Postgres with no luck; everything appears to work as expected. I also tried upgrading Postgres from 9.6 to 11.2: this time synchronization went further but still stopped at some point:

neo_scan=# select * from blocks_meta;
 id |  index  | cumulative_sys_fee 
----+---------+--------------------
  1 | 3226849 |             267102
(1 row)
neo_scan=# select count(*) from blocks_queue;
 count  
--------
 319238
(1 row)

lazovskiy commented 5 years ago

Also, is it possible to roll back the database to some earlier point so that synchronization doesn't have to start from the very beginning? Or do any bootstrap dumps exist out there? @adrienmo

l-vitall commented 5 years ago

Guys, this is not the question. We cannot run neo-scan locally because it keeps getting stuck at some point, sometimes before it is even fully synchronized. We need to understand how to avoid these stalls. Deleting the database and starting the sync again is not a good approach for a live service with many blockchains (we need it for online transaction processing).

adrienmo commented 5 years ago

@lazovskiy there are no dumps for neoscan. It is also not possible to roll back at the moment.

@l-vitall I agree with you; however, I don't know how to reproduce the problem. On the neoscan servers (staging/production) I have never run into this issue. If you have more details on any error message around the missing block, they could give helpful hints.
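
One way to gather such details is to compare the height recorded in blocks_meta with what is waiting in blocks_queue. The sketch below is hedged on several points: it assumes blocks_queue has an integer column named index (only the blocks_meta columns are shown in this thread), and it assumes the flush needs block height + 1 to be queued before it can proceed, which is an inference from the explanation above rather than something confirmed in the stored procedure. It runs against in-memory SQLite for self-containment; with a Postgres driver the placeholder style would differ.

```python
import sqlite3


def queue_status(conn):
    """Compare the flushed height with the index range waiting in the queue."""
    # ASSUMPTION: blocks_meta."index" is the last flushed block height.
    (height,) = conn.execute('SELECT MAX("index") FROM blocks_meta').fetchone()
    # ASSUMPTION: blocks_queue has an integer "index" column.
    lowest, highest, count = conn.execute(
        'SELECT MIN("index"), MAX("index"), COUNT(*) FROM blocks_queue'
    ).fetchone()
    (has_next,) = conn.execute(
        'SELECT COUNT(*) FROM blocks_queue WHERE "index" = ?', (height + 1,)
    ).fetchone()
    return {
        "flushed_height": height,
        "next_expected": height + 1,
        "next_is_queued": has_next > 0,
        "queue_min": lowest,
        "queue_max": highest,
        "queue_len": count,
    }


def demo():
    # Hypothetical data: the height matches the thread, the queue rows do not.
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE blocks_meta (id INTEGER, "index" INTEGER)')
    conn.execute('CREATE TABLE blocks_queue ("index" INTEGER)')
    conn.execute("INSERT INTO blocks_meta VALUES (1, 3226849)")
    conn.executemany(
        "INSERT INTO blocks_queue VALUES (?)",
        [(3226851,), (3226852,), (3226853,)],
    )
    return queue_status(conn)


if __name__ == "__main__":
    print(demo())
```

If next_is_queued comes back false while the queue keeps growing, the index reported as next_expected is the block worth searching for in the sync logs.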