nochowderforyou / clams

Clam Project
MIT License

Bootstrap import ignores >300MB worth of already downloaded blk0001.dat #313

Open TheButterZone opened 7 years ago

TheButterZone commented 7 years ago

I pulled my cable to ensure it wasn't downloading what I'd already downloaded, and I went from 349,000 or so blocks down to 0 after putting bootstrap.dat in the same directory. WTF?!

As people either a) download the bootstrap after getting frustrated at slow P2P speed with a partial blk0001.dat, or b) download & insert the bootstrap near the time of first run, shouldn't the client be able to merge the bootstrap into the partial blk0001.dat automatically for the a) people?

tryphe commented 7 years ago

That's how it works. You put it in the directory, and your database goes bye-bye, because it's not in any particular order (those files are just digested SSTs based on local LevelDB settings).

TheButterZone commented 7 years ago

So I've wasted all that time & data cap downloading from peers that ground to a near-halt. And instead of overwriting the blk0001.dat file, it's just growing from where I left off, >300 MB then to >500 MB now. Will the duplicate data be cleared, or be clearable, out of it?

ETA: Maybe that series of

ERROR: ProcessBlock() : already have block

means it is just filling in blk0001.dat with the missing blocks present in the bootstrap & not appending the ones that were already downloaded over P2P? The filesize didn't seem to start going up again until it went back to regular ProcessBlock, but I only caught it 5 seconds before the switchback.

ETA2: No, that doesn't seem right; it looked like roughly 1 KB per block before the bootstrap, and the file is around 572 MB now, with 243739 one of the latest heights to scroll by in debug.log just now.

Maybe there should be a prompt on first run, then: "There are x up-to-date peers. Would you like to download the blockchain from them, from the bootstrap torrent, or via direct download?"

dooglus commented 7 years ago

I've never seen any blocks get lost as a result of importing a bootstrap.dat file, and I've imported a lot of them.

It will scan the bootstrap.dat file looking for blocks you don't already have, and that will count up through all the blocks in the file. You'll see a bunch of "already have block" messages for the ones you already have. It won't re-import blocks you already have.
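The scan-and-skip behavior described here can be sketched roughly as follows. This is a toy illustration, not the actual CLAM/Bitcoin code: the record framing constant, the `hash()` stand-in for double-SHA256, and all names are assumptions made for the sketch.

```python
import struct

MAGIC = b"\xf9\xbe\xb4\xd9"  # network magic preceding each block record

def import_bootstrap(data: bytes, known_hashes: set) -> tuple:
    """Walk a bootstrap-style byte stream; return (imported, skipped) counts."""
    imported = skipped = 0
    pos = 0
    while pos + 8 <= len(data):
        assert data[pos:pos + 4] == MAGIC, "corrupt stream"
        (size,) = struct.unpack_from("<I", data, pos + 4)
        block = data[pos + 8:pos + 8 + size]
        block_hash = hash(block)       # stand-in for the real double-SHA256
        if block_hash in known_hashes:
            skipped += 1               # this is where "already have block" is logged
        else:
            known_hashes.add(block_hash)
            imported += 1
        pos += 8 + size
    return imported, skipped
```

The point of the sketch is that blocks already in the index are counted and skipped, never re-imported, which is why a wall of "already have block" messages is harmless.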

Recent versions of Bitcoin will copy the new blocks it finds in the bootstrap.dat file into the blkNNNNN.dat files before importing them into the database, so that may explain why you're seeing the blk*.dat file(s) grow more than you expect. I forget which version of Bitcoin the last release of the CLAM client was based on, but suspect that its behavior is similar.

I don't think there's any need for a warning. Importing the bootstrap.dat file doesn't delete anything, and doesn't use any bandwidth.

dooglus commented 7 years ago

Here's an example from the log of a Bitcoin bootstrap.dat I'm importing at the moment:

2017-08-29 19:10:51 Pre-allocating up to position 0x1000000 in blk00136.dat
2017-08-29 19:10:52 Pre-allocating up to position 0x2000000 in blk00136.dat
2017-08-29 19:10:52 Pre-allocating up to position 0x3000000 in blk00136.dat
2017-08-29 19:10:53 Pre-allocating up to position 0x4000000 in blk00136.dat
2017-08-29 19:10:53 Pre-allocating up to position 0x5000000 in blk00136.dat
2017-08-29 19:10:54 Pre-allocating up to position 0x6000000 in blk00136.dat
2017-08-29 19:10:54 Pre-allocating up to position 0x7000000 in blk00136.dat
2017-08-29 19:10:55 Pre-allocating up to position 0x8000000 in blk00136.dat
2017-08-29 19:10:55 Leaving block file 136: CBlockFileInfo(blocks=597, size=134161673, heights=298291...298887, time=2014-04-29...2014-05-03)
2017-08-29 19:10:58 Pre-allocating up to position 0x1000000 in blk00137.dat
2017-08-29 19:10:58 Pre-allocating up to position 0x2000000 in blk00137.dat
2017-08-29 19:10:59 Pre-allocating up to position 0x3000000 in blk00137.dat
2017-08-29 19:10:59 Pre-allocating up to position 0x4000000 in blk00137.dat
2017-08-29 19:11:00 Pre-allocating up to position 0x5000000 in blk00137.dat
2017-08-29 19:11:01 Pre-allocating up to position 0x6000000 in blk00137.dat
2017-08-29 19:11:01 Pre-allocating up to position 0x7000000 in blk00137.dat
2017-08-29 19:11:02 Pre-allocating up to position 0x8000000 in blk00137.dat
2017-08-29 19:11:02 Leaving block file 137: CBlockFileInfo(blocks=590, size=134195296, heights=298888...299477, time=2014-05-03...2014-05-07)
2017-08-29 19:11:06 Pre-allocating up to position 0x1000000 in blk00138.dat
2017-08-29 19:11:06 Pre-allocating up to position 0x2000000 in blk00138.dat
2017-08-29 19:11:07 Pre-allocating up to position 0x3000000 in blk00138.dat
2017-08-29 19:11:07 Pre-allocating up to position 0x4000000 in blk00138.dat
2017-08-29 19:11:08 Pre-allocating up to position 0x5000000 in blk00138.dat
2017-08-29 19:11:09 Pre-allocating up to position 0x6000000 in blk00138.dat
2017-08-29 19:11:09 Pre-allocating up to position 0x7000000 in blk00138.dat
2017-08-29 19:11:10 Pre-allocating up to position 0x8000000 in blk00138.dat
2017-08-29 19:11:10 Loaded 50000 blocks from external file in 647735ms
2017-08-29 19:11:17 UpdateTip: new best=000000000000003887df1f29024b06fc2200b55f8af8f35453d7be294df2d214 height=250000 version=0x00000002 log2_work=71.012098 tx=21491097 date='2013-08-03 12:36:23' progress=0.085733 cache=0.1MiB(718txo)
2017-08-29 19:11:21 UpdateTip: new best=000000000000001b3f536a81be90d5cbe8b79c2c1df53d1f91540cf5cb5a7c58 height=250001 version=0x00000002 log2_work=71.012195 tx=21491225 date='2013-08-03 12:47:32' progress=0.085733 cache=0.2MiB(1412txo)
2017-08-29 19:11:22 UpdateTip: new best=000000000000006b0b79274e9cfdfeaa89196a2281bc92493b1a1e74f2eac087 height=250002 version=0x00000002 log2_work=71.012292 tx=21491463 date='2013-08-03 12:48:37' progress=0.085734 cache=0.3MiB(1947txo)
2017-08-29 19:11:24 UpdateTip: new best=0000000000000054502d8fc7843719bd20d6094ea9a3ea8e4f4a7b9862fb45c2 height=250003 version=0x00000002 log2_work=71.01239 tx=21491794 date='2013-08-03 13:00:11' progress=0.085736 cache=0.4MiB(2659txo)
2017-08-29 19:11:25 UpdateTip: new best=000000000000001ad8cc4aafb8db55b0e4444fad216ae63f26cbfe9adb6031a9 height=250004 version=0x00000002 log2_work=71.012487 tx=21491958 date='2013-08-03 13:07:53' progress=0.085736 cache=0.4MiB(2933txo)

All the 'pre-allocating ...' messages happen as it copies the raw block data from bootstrap.dat into the various blk*.dat files, and only once it has finished doing that does it start doing the UpdateTip stuff.
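The pattern in the log above can be condensed into two small helpers: files grow in 16 MiB (0x1000000) pre-allocation steps and roll over to the next blkNNNNN.dat near the ~128 MiB cap (the log's `size=134161673` is just over 0x8000000). This is an illustration of the arithmetic only; the constants mirror the log output and the function names are invented for the sketch.

```python
CHUNK = 0x1000000          # 16 MiB pre-allocation step seen in the log
MAX_BLOCKFILE = 0x8000000  # ~128 MiB cap per blkNNNNN.dat

def prealloc_target(write_pos: int) -> int:
    """Next pre-allocated boundary for a write ending at write_pos."""
    return ((write_pos + CHUNK - 1) // CHUNK) * CHUNK

def file_for(total_written: int) -> str:
    """Which blkNNNNN.dat a given cumulative byte offset lands in."""
    return "blk%05i.dat" % (total_written // MAX_BLOCKFILE)
```

So a write ending anywhere in the first 16 MiB pre-allocates up to 0x1000000, the next range up to 0x2000000, and so on, producing exactly the eight "Pre-allocating" lines per file seen above.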

TheButterZone commented 7 years ago

The only blk*.dat file in my ~/Library/Application Support/Clam & subdirectories is blk0001.dat

debug.log does not contain "Pre-allocating" at this point, so the behavior is divergent.

What was a waste of bandwidth was downloading >300 MB over P2P first, then having to download the bootstrap second, which contains the same >300 MB already downloaded over P2P. The quickest method should be selected first & stuck to, so you don't end up downloading the same data twice.

"already have block" ran out of search results at 620695

Coming up on ProcessBlock... 1095000

accttotech commented 7 years ago

Are you sure about this? I am a few weeks behind, so I decided to download the entire bootstrap.dat file a few minutes ago to see if I would run into the same problem.

The bootstrap.dat file didn't delete any of the current database and seems to be working fine, as you can see from "Importing blocks..." on the bottom left side of the Clam client.

Here's a screenshot: http://imgur.com/a/3Ja75

Unless I'm missing something here, please advise.

Thanks, Justin


dooglus commented 7 years ago

What was a waste of bandwidth was to download > 300 mb P2P first, then have to download the bootstrap second, which contains same > 300 mb already downloaded P2P

Yes. That's why, in my bootstrap post, I split the file into pieces each containing 10k blocks:

I also made a series of 'partial' bootstrap files. Each one contains the block data for 10,000 blocks.

https://s3.amazonaws.com/dooglus/bootstrap-000.dat is blocks 0 through 9999
https://s3.amazonaws.com/dooglus/bootstrap-001.dat is blocks 10000 through 19999
https://s3.amazonaws.com/dooglus/bootstrap-002.dat is blocks 20000 through 29999
etc.

I'll add a new one for each new set of 10k blocks. Currently they go up to bootstrap-165.dat.

That way you can download just the pieces you need.

TheButterZone commented 7 years ago

I had tried the partial bootstrap starting at the range where the client left off, & the "current number of blocks" went from 349,000 or so down to 0. Then I tried the full bootstrap. Back to 0. Importing a bootstrap shouldn't make it look like you lost 100% of your progress to date.

accttotech commented 7 years ago

Oh okay, so is it working now though?


TheButterZone commented 7 years ago

Define "working". bootstrap.dat is 1.84 GB; blk0001.dat is 1.43 GB at 48 weeks behind.

TheButterZone commented 7 years ago

blk0001.dat has exceeded bootstrap.dat's size by 0.04 GB & counting, without being able to connect to the internet. 06/01/17 is scrolling past now. I'm waiting for it to run out of bootstrap, then I'll start downloading the segments, renaming each to bootstrap.dat & restarting after each one completes.