citp / BlockSci

A high-performance tool for blockchain science and exploration
https://citp.github.io/BlockSci/
GNU General Public License v3.0

Blocksci database generation seems to stall at 580379 #338

Closed joequant closed 4 years ago

joequant commented 5 years ago

Please provide a clear and concise description of the problem.

I've been attempting to generate the blocksci database from scratch, and it seems to stall at block 580379.

At that point, the only file blocksci appears to touch is bloom_pubkey_scriptStore.dat: its timestamp keeps changing, but nothing else happens.

Reproduction Steps

import blocksci

System Information

Using AMI: yes/no
BlockSci version: (please provide a commit id if you're on a development branch)
Blockchain: (e.g., Bitcoin, Bitcoin Cash, Litecoin)
Parser: Disk/RPC
Total memory: XX GB

maltemoeser commented 5 years ago

What do you mean by generating the database? Running the parser? Without any information about your system or what you're doing it's hard to tell what the issue is.

joequant commented 5 years ago

Hi. I've got a clean machine and I'm running the parser to generate the blocksci database from the bitcoin node. Whenever I run the parser, the generation stalls at block 580379. The parser seems to run perfectly until it hits block 580379 and then it stops. After it hits 580379, I've had the parser run for days without any additional activity. I run the parser in batches, parsing the blocks in 200k increments.

Looking at the files, once the system gets into this state, the timestamp of bloom_pubkey_scriptStore.dat keeps getting updated, but none of the other files change. The parser also appears to be busy reading files. This is running on a machine with 64 GB of RAM. Everything appears to run fine until it hits 580379, and then it seems to go into an infinite loop. Parsing the blocks in different increments does not appear to change things.

What I'm doing to debug the problem is that I've parsed the blocks up to 560000, and I can run the parser in a debugger to see exactly where it gets stuck at 580379. My guess is that it is hitting some resource limit.
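
The file-timestamp observation above can be checked systematically. The following is a minimal sketch, not part of BlockSci: it takes two snapshots of a directory and reports which files changed in between. The function names and the data directory path are hypothetical and only serve as illustration.

```python
import os
import time


def snapshot_files(directory):
    """Map every file path under `directory` to its (mtime_ns, size)."""
    snapshot = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                info = os.stat(path)
            except FileNotFoundError:
                continue  # file disappeared between listing and stat
            snapshot[path] = (info.st_mtime_ns, info.st_size)
    return snapshot


def changed_files(before, after):
    """Return sorted paths that are new or whose mtime/size differ."""
    return sorted(p for p, meta in after.items() if before.get(p) != meta)


def watch_once(directory, interval_seconds):
    """Snapshot, wait, snapshot again, and return the files that changed."""
    before = snapshot_files(directory)
    time.sleep(interval_seconds)
    return changed_files(before, snapshot_files(directory))
```

Calling something like watch_once("/path/to/blocksci-data", 60) while the parser is stalled (the path is a placeholder for the parser's output directory) would list the files written during that minute. If only bloom_pubkey_scriptStore.dat shows up repeatedly, that matches the behaviour described above.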


mplattner commented 5 years ago

Parsing further than 580379 works fine on my machine (also 64 GB RAM). Can you try a full re-parse, without parsing incrementally, using --max-block -6 instead?

If you can provide further information (e.g., a stack trace) at the point where the parser stalls, we can try to look into this issue further. If you do, please also let me know what revision (commit) your binary is based on.

It might be that the issue is caused by the Bitcoin node files, which are slightly different for every full sync. Thus, as a temporary workaround, it might also help to re-sync the Bitcoin data and do a full re-parse. The more relevant information you can provide, the easier it is for us to fix. Thanks.
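
For the stack trace requested above, one possible approach is to attach gdb to the stalled parser process and dump a backtrace of every thread. This is only a sketch under assumptions: gdb is installed, the user has permission to attach to the process, and the binary has debug symbols for readable output. The helper names are hypothetical; the gdb options used (-p, -batch, -ex) are standard.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that prints backtraces for all threads of `pid`."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid):
    """Attach gdb to the running process and return whatever it prints.

    Requires gdb on PATH and ptrace permission for the target process.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; still return its output
    )
    return result.stdout + result.stderr
```

Running capture_backtrace with the parser's process ID a few times, some seconds apart, shows whether the threads stay in the same functions, which would help tell a tight loop apart from a blocked read.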

joequant commented 5 years ago

Okay. This is part of a project that is temporarily on hold, but I'll let you know when the project I'm working on comes off hold.

maltemoeser commented 4 years ago

Feel free to reopen this if it becomes relevant again.