citp / BlockSci

A high-performance tool for blockchain science and exploration
https://citp.github.io/BlockSci/
GNU General Public License v3.0

Incorrect balance #330

Closed: sudo-su-ekin closed this issue 5 years ago

sudo-su-ekin commented 5 years ago

I can't get the correct balance for an address. I ran blocksci_parser config/file update at block 599041, but the balance returned at that height is not correct.

Config file

{
    "chainConfig": {
        "coinName": "bitcoin",
        "dataDirectory": "/path/to/blocksci_data/bitcoin",
        "pubkeyPrefix": [
            0
        ],
        "scriptPrefix": [
            5
        ],
        "segwitActivationHeight": 481824,
        "segwitPrefix": "bc"
    },
    "parser": {
        "disk": {
            "blockMagic": 3652501241,
            "coinDirectory": "/path/to/.bitcoin",
            "hashFuncName": "doubleSha256"
        },
        "maxBlockNum": 0
    },
    "version": 5
}
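
As a side note on the config values, blockMagic is easier to recognise in hexadecimal. The following is a minimal sketch (variable names are illustrative only) showing how the decimal value above maps to the 4-byte magic that prefixes each block in Bitcoin mainnet block files:

```python
import struct

block_magic = 3652501241  # "blockMagic" from the config above

# The magic is stored on disk as a little-endian 32-bit integer.
magic_bytes = struct.pack("<I", block_magic)

print(hex(block_magic))   # 0xd9b4bef9
print(magic_bytes.hex())  # f9beb4d9
```

If the value in a config does not decode to f9beb4d9, the parser is presumably being pointed at a different network than Bitcoin mainnet.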

Reproduction Steps

import blocksci
chain = blocksci.Blockchain("/path/to/bitcoin_config")
address = chain.address_from_string("3EhLZarJUNSfV6TWMZY1Nh5mi3FMsdHa5U")
address.balance(599041)

Return value: 126013326213
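
For context, a balance "at a height" is usually understood as the sum of all outputs that were created at or before that block and not yet spent by it. The sketch below illustrates that rule only; the Output class, its field names, and the sample data are hypothetical and are not BlockSci's API or implementation:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Output:
    """Hypothetical stand-in for one output sent to an address."""
    value: int                   # amount in satoshi
    created_height: int          # block that contains the output
    spent_height: Optional[int]  # block that spends it, or None if unspent


def balance_at(outputs: Iterable[Output], height: int) -> int:
    """Sum outputs that exist at `height` and are not spent by then."""
    total = 0
    for out in outputs:
        created = out.created_height <= height
        still_unspent = out.spent_height is None or out.spent_height > height
        if created and still_unspent:
            total += out.value
    return total


# Made-up example data, not taken from the address in this issue.
history = [
    Output(value=50_000, created_height=100, spent_height=150),
    Output(value=20_000, created_height=120, spent_height=None),
    Output(value=7_000, created_height=300, spent_height=None),
]

print(balance_at(history, 130))  # 70000
print(balance_at(history, 200))  # 20000
print(balance_at(history, 300))  # 27000
```

Under this rule, a wrong result at a fixed height points at the parsed data (missing or stale spend information) rather than at the query itself.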

System Information

Using AMI: no, r4.2xlarge
BlockSci version: 0.6
Blockchain: Bitcoin
Parser: Disk
Total memory: 61 GB
OS: Ubuntu 18.04

maltemoeser commented 5 years ago

Running your example I get 15391, which matches what block explorers show (curiously, blockchain.info displays something different).
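
To put the two numbers side by side, here is a small sketch that converts them to BTC. It assumes the values returned by balance() are denominated in satoshi (1 BTC = 100,000,000 satoshi); the helper name to_btc is made up for this example:

```python
from decimal import Decimal

SATOSHI_PER_BTC = Decimal(100_000_000)


def to_btc(satoshi: int) -> Decimal:
    """Convert an integer satoshi amount to BTC without float rounding."""
    return Decimal(satoshi) / SATOSHI_PER_BTC


reported = 126013326213  # value from the original report
expected = 15391         # value from a correctly parsed chain

print(to_btc(reported))  # 1260.13326213
print(to_btc(expected))  # 0.00015391
```

The two results differ by roughly seven orders of magnitude, so this is not a rounding or unit issue.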

Is your open files limit correctly set?

sudo-su-ekin commented 5 years ago

Yes, the open file limits should be ok according to the instructions. I'm getting:

ubuntu@ip:~$ ulimit -Sn
> 64000
ubuntu@ip:~$ ulimit -Hn
> 64000

The full limits look like this:

ubuntu@ip:~$ ulimit -Sa
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 245503
> max locked memory       (kbytes, -l) 16384
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 64000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 245503
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited

ubuntu@ip:~$ ulimit -Ha
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 245503
> max locked memory       (kbytes, -l) 16384
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 64000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) unlimited
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 245503
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
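
The open-files limit can also be read from inside the Python process that loads BlockSci, which may differ from the limit of an interactive shell (for example when running under a notebook server or a service manager). A minimal sketch using only the standard library; the REQUIRED threshold simply reuses the 64000 shown above and is an assumption:

```python
import resource

# RLIMIT_NOFILE is the per-process cap on open file descriptors,
# the same value that `ulimit -Sn` / `ulimit -Hn` report in a shell.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")

REQUIRED = 64000  # assumed threshold, taken from the output above
if soft != resource.RLIM_INFINITY and soft < REQUIRED:
    print("soft open-files limit is below the expected value")
```

Running this in the same interpreter or notebook kernel as the failing query shows the limit that actually applies to that process.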

maltemoeser commented 5 years ago

Hm, can you re-parse and see if the balance is still incorrect?

sudo-su-ekin commented 5 years ago

It seems that re-parsing helped! Getting the correct result now.

On a different note, what is the purpose of the warmup.sh script? I'm not running it on reboot now and I'm still getting the correct result, so I was wondering whether it needs to be run at all.

maltemoeser commented 5 years ago

Great!

You can ignore the warmup script. It is used to automatically fix a performance issue (see below) in the AMI but not intended/necessary for manual use.

AWS instances suffer from a known performance issue when starting up from an existing AMI. To make startup instant, the machine does not load all of the data onto the disk when it boots. Instead, each piece of data is loaded only when it is accessed for the first time. As a result, BlockSci will temporarily run slowly right after the image has been launched. Within about 20 minutes after launch, the most crucial data files will have been loaded to disk from the network, and most queries should run at full speed, including all examples in the demo Notebook. After about 3.5 hours, all data will have been loaded to disk and all queries will reach full speed.

There is no need for user intervention to resolve this issue since the machine will do so automatically on launch.
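
Purely as an illustration of what "loaded when it is accessed for the first time" implies: reading every file once is enough to pull the data in. The sketch below is not the contents of warmup.sh (which are not shown in this thread); the function name and chunk size are made up for the example:

```python
import os
import sys


def touch_all_files(root: str, chunk_size: int = 1 << 20) -> int:
    """Read every file under `root` once and return the number of bytes read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python touch_files.py /path/to/data
    print(touch_all_files(sys.argv[1]), "bytes read")
```

On a machine that was not started from the AMI, as in this issue, the data is already local and a pass like this has no effect on correctness.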