gcarq / rusty-blockparser

Bitcoin Blockchain Parser written in Rust language
GNU General Public License v3.0

balances seem incomplete #88

Closed jrp27514 closed 1 year ago

jrp27514 commented 1 year ago

yesterday ran 'balances' from 0..784760 and out of curiosity added up totals, but can only account for 14.6M btc out of ~190+M mined. Not sure what to make of this. Would like to have more confidence in the rusty data though. Thanks for any thoughts

14610437.97744718

gcarq commented 1 year ago

I assume this is for bitcoin. Are you using the https://github.com/gcarq/rusty-blockparser/tree/use-bitcoin-lib branch? The master branch lacks the implementation of newer transaction types like SegWit and cannot handle BECH32 addresses.

jrp27514 commented 1 year ago

Yes, bitcoin is what I am playing with. I did a git clone in Jan and am running v0.8.2. The release page I see makes no mention of needing a branch, but I will give that a try. I assumed updates were all built into a release candidate when ready. Is there some forum where more recent info/updates can be found? Thanks

gcarq commented 1 year ago

Currently there is just a note in the README regarding this, but there will be a future release with a special bitcoin version to make the usage a bit clearer.

This is currently required to fully support bitcoin via rust-bitcoin while still supporting all the altcoins.

Let me know if you find any mismatches in the use-bitcoin-lib branch.

jrp27514 commented 1 year ago

Michael, millions of these logs using new branch after cargo build --release. Is there an easy way to suppress all these or did I make some err?

./rusty-blockparser-use-bitcoin-lib/target/release/rusty-blockparser -d /media/jrp/JRP_STUFF/bitcoin/blocks balances /media/jrp/JRP_STUFF/Temp/rustyout

[12:51:23] WARN - script: Unable to extract evaluated address: script is not a p2pkh, p2sh or witness program
[12:51:23] WARN - script: Unable to extract evaluated address: script is not a p2pkh, p2sh or witness program
...

gcarq commented 1 year ago

No error on your side. This warning just indicates that the transaction cannot be processed (e.g. if someone used OP_RETURN to store arbitrary data on the chain). I will push a new version in the next few days which should address this.
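To illustrate what kind of scripts fall outside "p2pkh, p2sh or witness program", here is a minimal Python sketch that labels a scriptPubKey by its byte pattern. The function name `classify_script` is hypothetical, and this is not how rusty-blockparser or rust-bitcoin implement the check; it only mirrors the standard script templates.

```python
def classify_script(script_hex: str) -> str:
    """Return a rough label for a scriptPubKey given as a hex string."""
    s = bytes.fromhex(script_hex)

    # P2PKH: OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG
    if len(s) == 25 and s[:3] == b"\x76\xa9\x14" and s[23:] == b"\x88\xac":
        return "p2pkh"

    # P2SH: OP_HASH160 <20-byte hash> OP_EQUAL
    if len(s) == 23 and s[:2] == b"\xa9\x14" and s[22] == 0x87:
        return "p2sh"

    # Witness program: a version opcode (OP_0 or OP_1..OP_16)
    # followed by a single push of 2 to 40 bytes.
    if (
        4 <= len(s) <= 42
        and (s[0] == 0x00 or 0x51 <= s[0] <= 0x60)
        and s[1] == len(s) - 2
    ):
        return "witness"

    # OP_RETURN (0x6a) marks a data-carrying output with no address.
    if len(s) >= 1 and s[0] == 0x6A:
        return "op_return"

    # Everything else, e.g. bare pubkey (P2PK) or bare multisig (P2MS).
    return "other"
```

Scripts labelled `op_return` or `other` here have no address in the p2pkh, p2sh or witness sense, which is the situation the warning describes.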

gcarq commented 1 year ago

I pushed some major changes to master branch which should address the incomplete balances and excessive warnings. Can you try it out and give some feedback? I will give it some more testing and publish a new version if there are no major showstoppers.

jrp27514 commented 1 year ago

I downloaded the master branch, cd'd into it and did a cargo build --release

./target/release/rusty-blockparser -d /media/jrp/JRP_STUFF/bitcoin/blocks unspentcsvdump /media/jrp/JRP_STUFF/Temp/rustyout

[12:02:37] INFO - main: Starting rusty-blockparser v0.9.0 ...
[12:02:37] INFO - index: Reading index from /media/jrp/JRP_STUFF/bitcoin/blocks/index ...
[12:02:38] INFO - index: Got longest chain with 785391 blocks ...
[12:02:38] INFO - blkfile: Reading files from /media/jrp/JRP_STUFF/bitcoin/blocks ...
[12:02:38] ERROR - rusty_blockparser: Cannot load blockchain from: '/media/jrp/JRP_STUFF/bitcoin/blocks'. I/O Error: Too many open files (os error 24)

the regular installed version /home/jrp/.cargo/bin/rusty-blockparser still runs fine

gcarq commented 1 year ago

Can you give me the output of ulimit -a to see the open file limits? I added an I/O optimization to keep .blk file readers open to avoid I/O overhead which causes this. Temporarily raising the file limit might fix this.
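As a rough way to see whether the open file limit is likely to be hit, the following Python sketch compares the number of block files in a directory with the soft limit of the current process. It assumes a Unix system (the `resource` module) and Bitcoin Core's `blk*.dat` file naming; the helper names are hypothetical and nothing here is part of rusty-blockparser.

```python
import resource
from pathlib import Path


def count_blk_files(blocks_dir):
    """Count files matching blk*.dat directly inside blocks_dir."""
    return sum(1 for _ in Path(blocks_dir).glob("blk*.dat"))


def soft_open_file_limit():
    """Return the soft RLIMIT_NOFILE of this process (what `ulimit -n` shows)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def may_exceed_limit(n_files, soft_limit):
    """True if keeping n_files open at once would reach the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return n_files >= soft_limit
```

For example, `may_exceed_limit(count_blk_files(path), soft_open_file_limit())` returning `True` suggests that a tool holding one reader per block file would run into "Too many open files" unless the limit is raised.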

jrp27514 commented 1 year ago

@.***:~/rusty-blockparser-master/rusty-blockparser-master$ ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 127510
max locked memory           (kbytes, -l) 4094912
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 127510
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

jrp27514 commented 1 year ago

yes for now I raised the /etc/security/limits and seems to be running

jrp27514 commented 1 year ago

Michael

something seems wrong w/ balances. Seems to be running but giving me zero size csv output file. Any reason that change to open file limit would mess that up?

jrp27514 commented 1 year ago

Could you also clarify the exact nature of balances vs unspent? The readme is not all that clear. In my mind balances implies all addresses with non zero balances, but unspent also sounds identical. Am not able to run balances at all at the moment (as I mentioned in prev email) but I did run the unspent using your new branch and wrote some py code to total them up:

unspent (using new master branch): 45,499,106 rows of data; balance total from column #4 of csv data: 8,086,732.23651070
balances (2 weeks ago using release ver): 31,220,186 rows of data

neither seems close to what I expected since 190M bitcoin has been mined

gcarq commented 1 year ago

> something seems wrong w/ balances. Seems to be running but giving me zero size csv output file. Any reason that change to open file limit would mess that up?

Did you let it finish, and can you post the console output? It's possible that something got messed up. I tested it with ./rusty-blockparser -e 100000 balances . and everything worked fine on my side.

> Could you also clarify the exact nature of balances vs unspent?

The unspentcsvdump and balances callbacks are very similar. Roughly speaking, an address can have more than one unspent transaction output (UTXO), and UTXOs are how bitcoin actually stores things: there is no direct concept of balances, only UTXOs.

All that the balances callback does is to aggregate the UTXOs and sum up the balance for each address.
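The aggregation described above can be sketched in a few lines of Python. This is only an illustration with made-up addresses and values in satoshis, not the actual callback code; `aggregate_balances` is a hypothetical name.

```python
from collections import defaultdict


def aggregate_balances(utxos):
    """Sum UTXO values per address.

    utxos: iterable of (address, value_in_satoshis) pairs.
    Returns a dict mapping address -> total balance in satoshis.
    """
    balances = defaultdict(int)
    for address, value_sat in utxos:
        balances[address] += value_sat
    return dict(balances)


# Made-up sample data: addr_A owns two UTXOs, addr_B owns one.
sample_utxos = [
    ("addr_A", 50_000),
    ("addr_B", 120_000),
    ("addr_A", 25_000),
]

balances = aggregate_balances(sample_utxos)
```

Because balances are just grouped UTXOs, the sum over all balances should equal the sum over all UTXO values for the same block range.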

> yesterday ran 'balances' from 0..784760 and out of curiosity added up totals, but can only account for 14.6M btc out of ~190+M mined. Not sure what to make of this. Would like to have more confidence in the rusty data though. Thanks for any thoughts

I gathered balances up to block 718021 with master branch and calculated the total balance which is 18926631.95789196 BTC:

>>> filename = 'balances-0-718021.csv'
>>> total_balance = 0
>>> with open(filename) as fp:
...     for (i, line) in enumerate(fp.readlines()):
...         if i == 0:
...             continue
...         address, balance = line.split(';')
...         total_balance += int(balance)
... 
>>> total_balance
1892663195789196
>>> total_balance / 100_000_000
18926631.95789196

This is pretty close to the number of mined bitcoins at block 718021 (https://explorer.btc.com/btc/block/718021) reported by bitcoin explorers: 19171331.657 BTC. The shortfall of ~244699 BTC might be due to some transactions not being parsed, e.g. P2MS, and invalid transactions like OP_RETURN. Not sure how you came up with ~190+M mined?
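As a sanity check on the order of magnitude, the nominal number of coins issued up to a given height can be estimated from the block subsidy schedule alone. The sketch below assumes the standard Bitcoin schedule (50 BTC initial subsidy, halving every 210,000 blocks) and ignores anything else, so explorer figures may differ from it; `nominal_supply_sat` is a hypothetical helper name.

```python
COIN = 100_000_000            # satoshis per BTC
HALVING_INTERVAL = 210_000    # blocks between subsidy halvings
INITIAL_SUBSIDY = 50 * COIN   # subsidy of the first era, in satoshis


def nominal_supply_sat(height):
    """Sum of block subsidies for blocks 0..height inclusive, in satoshis."""
    blocks_left = height + 1
    subsidy = INITIAL_SUBSIDY
    total = 0
    while blocks_left > 0 and subsidy > 0:
        # Number of blocks of this era that fall within the range.
        n = min(blocks_left, HALVING_INTERVAL)
        total += n * subsidy
        blocks_left -= n
        subsidy >>= 1  # integer halving, as the subsidy is in whole satoshis
    return total


def nominal_supply_btc(height):
    """Same as nominal_supply_sat, converted to BTC."""
    return nominal_supply_sat(height) / COIN
```

Under this schedule the total stays below 21 million BTC for any height, so a summed balance in the tens of millions is the expected order of magnitude.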

jrp27514 commented 1 year ago

the new master branch worked fine for me the 2nd time. Not sure why it had zero content in the csv first time. Also I got the same math for btc totals as you. Thanks for your help

gcarq commented 1 year ago

I'm glad that it worked out, btw the raised open file limit should no longer be necessary with the 0.9.0 release

jrp27514 commented 1 year ago

Michael, fyi https://crates.io/crates/rusty-blockparser indicates csvdump tx-out is a 6-column format, but there are only 5 cols and no pubkey info that I can find.

What happened to the pubkey data and where can I locate it?

Thanks
