citp / BlockSci

A high-performance tool for blockchain science and exploration
https://citp.github.io/BlockSci/
GNU General Public License v3.0

block_parser update terminated calling std::bad_alloc #444

Open vumilan opened 3 years ago

vumilan commented 3 years ago


Hello, has anyone stumbled upon a similar problem?

block_parser conf.json update outputs:

Locking data directory.
100.0% done fetching block headers
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Reproduction Steps

Clean install of Ubuntu 18.04. Followed all the steps for installing BlockSci on Ubuntu 18.04 from the documentation, including raising the file limit to 64000 and setting the time to UTC.

System Information

Using AMI: no
BlockSci version: master
Blockchain: Bitcoin
Parser: tried both Disk/RPC
Total memory: 859 GB

kaykurokawa commented 3 years ago

Just thought I'd add my experience here as well. Not sure if it's the same issue, but the update procedure segfaults shortly after processing block 368451/658270 (after it fetches block headers), and this has happened several times. I found this in my kernel log:

Nov 29 22:01:40 blocknode kernel: [1981112.841197] blocksci_parser[22384]: segfault at 38 ip 000055a12ecf8c34 sp 00007fe9697f96d0 error 6 in blocksci_parser[55a12ec66000+5bf000]

I used Bitcoin Core version 0.20.1, 24 GB of memory, and Ubuntu 18.04.

It had no problem parsing the Litecoin blockchain with Litecoin version 0.18.1, so I'm going to downgrade Bitcoin Core to 0.18.1 and see what happens.
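For anyone who wants to dig into a crash like this, the kernel log line quoted above can be decoded mechanically. The following is a minimal sketch, assuming the usual x86-64 Linux layout of the "segfault at" message; the function name decode_segfault and the dictionary keys are made up for illustration. It extracts the faulting address, computes the instruction pointer's offset inside the mapped binary, and decodes the page-fault error bits.

```python
import re

# Assumed layout of the x86-64 Linux kernel message:
#   <name>[pid]: segfault at <addr> ip <ip> sp <sp> error <code> in <object>[<base>+<size>]
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one kernel 'segfault at' log line into a dict (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    return {
        "object": m["obj"],
        # Address the process tried to access when it faulted.
        "fault_address": int(m["addr"], 16),
        # Instruction pointer relative to the start of the mapping.
        "offset_in_object": int(m["ip"], 16) - int(m["base"], 16),
        # x86 page-fault error code bits.
        "page_present": bool(err & 0x1),  # 0 means the page was not mapped
        "write": bool(err & 0x2),         # 1 means a write access
        "user_mode": bool(err & 0x4),     # 1 means the fault came from user space
    }
```

For the line above this gives a fault address of 0x38, an offset of 0x92c34 inside blocksci_parser, and error bits meaning a user-mode write to an unmapped page. A fault address that close to zero would be consistent with writing through a null pointer, though that is only a guess without a debug build. If the binary was built with debug symbols, the offset can be passed to addr2line -e blocksci_parser 0x92c34 to find the source line.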

alex-btc commented 2 years ago

Hey @kaykurokawa, have you had success with the parser after downgrading Bitcoin Core to 0.18.1?

tyramisoux commented 7 months ago

Two years later and the same problem. I had to change some code to make it compile with gcc-12 and g++-12, but that seems to work, since it fails at the same block: "24.48% done, Block 368451/527155" followed by "Killed". No solutions? Guess I have to debug this :-( I could not even get the Python stuff to work with the latest tools. It compiles after updating pybind11 and removing some code, but "import module" fails. Is anybody using this tool in 2024? Is there an alternative?

alex-btc commented 7 months ago

I am using BlockSci right now. I was able to install it on Ubuntu 20.04, installing the exact library versions below and using export CC=/usr/bin/clang-7 and export CXX=/usr/bin/clang++-7 and otherwise following the normal install procedure.

cassandra-driver==3.25.0
requests==2.27.1
pandas==1.4.1
numpy==1.22.3
simplejson==3.17.6
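To check whether an existing environment already matches the versions listed above, something like the following sketch may help. It assumes Python 3.8 or newer (for importlib.metadata); the names PINNED, installed_version and find_mismatches are made up for illustration and are not part of BlockSci.

```python
from importlib import metadata

# Versions copied from the list above.
PINNED = {
    "cassandra-driver": "3.25.0",
    "requests": "2.27.1",
    "pandas": "1.4.1",
    "numpy": "1.22.3",
    "simplejson": "3.17.6",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: want {wanted}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at the listed version. The lookup argument exists only so the comparison can be exercised without touching the real environment.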

It is a shame to let such a good library die this way. I volunteer to maintain it for as long as I am able (Python code only).

If anyone here can code in C/C++, I think it is worth updating the code so that it recognizes the new Taproot addresses and transactions.

tyramisoux commented 7 months ago

Those crashes we encountered seem to be memory issues. After assigning more than 40 GB of RAM to the VM running the code, it runs smoothly. I don't know whether there are memory leaks or whether it really needs that much memory. The blockchain has of course grown in the meantime.
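A quick way to check a machine or VM against that figure before starting a long parse is to read MemTotal from /proc/meminfo. This is a minimal sketch for Linux only; the 40 GiB default is simply the number reported in this comment, not a documented BlockSci requirement, and the function names are made up for illustration.

```python
def mem_total_gib(meminfo_text):
    """Return total RAM in GiB from the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:  25165824 kB"; the unit is KiB.
            kib = int(line.split()[1])
            return kib / (1024 ** 2)
    raise ValueError("MemTotal not found")


def has_enough_memory(meminfo_text, required_gib=40.0):
    """True if total RAM meets the threshold (40 GiB is an assumption from this thread)."""
    return mem_total_gib(meminfo_text) >= required_gib


def check_host(path="/proc/meminfo", required_gib=40.0):
    """Read the real file and apply the check; only meaningful on Linux."""
    with open(path) as f:
        return has_enough_memory(f.read(), required_gib)
```

Calling check_host() on the 24 GB machine mentioned earlier in the thread would return False. Total RAM is only a rough indicator, since other processes and swap settings also affect whether the parser gets killed.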

tyramisoux commented 7 months ago

My results: the segfault at block 368451 did not recur once I assigned enough memory to the VM. Although I was able to compile the code on Ubuntu 23.10 (after a shitload of fixes for the latest compilers), I cannot recommend it, since the Python stuff did not work at all. The build produced a Python module named "_blocksciNone" where it should be "_blocksci.cpython-36m-x86_64-linux-gnu.so" (importing blocksci fails with "blocksci._blocksci not found"). In the end I installed Ubuntu 18.04 LTS, which ships GCC 7 (C and C++ compilers), and everything built smoothly right out of the box. Some people complain about compilation time, but with enough memory it all compiled in about 15 minutes without errors.

Of course blocksci_parser takes ages. After a power loss (quite usual here, every few days) it restarts from scratch. So I recommend parsing in smaller chunks of about 50000 blocks each (for the Bitcoin blockchain) using the "-xxxxx" parameter in the .cfg file, and backing up (snapshotting the VM) in between. Change the parameter and run "update" again to continue from the last parsed block. An SSD with enough space is also recommended; a hard drive is a major bottleneck.
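The chunk boundaries themselves are simple arithmetic. The sketch below only computes the sequence of block heights to stop at; it does not touch the config file or run the parser, and the function name chunk_targets is made up for illustration. Which config setting carries the height is left as written above.

```python
def chunk_targets(last_parsed, tip_height, chunk_size=50_000):
    """List the block heights to stop at, one per parser run.

    last_parsed: height already parsed (0 for a fresh start)
    tip_height:  height of the newest block to parse
    chunk_size:  blocks per run; 50000 is the figure suggested above
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    targets = []
    height = last_parsed
    while height < tip_height:
        # Advance by one chunk, but never past the chain tip.
        height = min(height + chunk_size, tip_height)
        targets.append(height)
    return targets
```

For a chain tip of 658270 and a fresh start, this yields 14 stops: 50000, 100000, and so on up to 650000, then 658270. After each stop, take the backup or VM snapshot before setting the next height and running "update" again.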

shihan11 commented 7 months ago

@tyramisoux What parameters should I change to start at the last parsed block? Looking forward to your reply, thank you.