utreexo / utreexod

A fully validating Bitcoin node with Utreexo support
ISC License
utreexod keeps going in a loop in blocks from the year 2015 (Mainnet) #187

Open HalFinneyIsMyHomeBoy opened 1 month ago

HalFinneyIsMyHomeBoy commented 1 month ago

Running a full Bitcoin Core node. I've been running utreexod for a few days, but it looks like it keeps looping over the same blocks from the year 2015, with the log saying "Adding orphan block".

Running with the following command, using external SSD for utreexod.

./utreexod -C /media/usb-drive/.utreexod/utreexod.conf --flatutreexoproofindex --prune=0

See part of the log attached.

utreexod.log

HalFinneyIsMyHomeBoy commented 1 month ago

(UPDATE)

I deleted everything in

"./utreexod/data/mainnet/" and "./utreexod/logs/mainnet/"

I started it back up from scratch using

--flatutreexoproofindex --prune=0

Now it's stuck in a loop again with these errors.

2024-05-30 00:50:39.395 [INF] CHAN: Adding orphan block 00000000000001cf787f095dbf33ed410656f63e7a53c02031800a87dd7a71ba with parent 00000000000003ddb4ec0ae4f276eaf32f87abc5a36d12bf45980733b03ec595
2024-05-30 00:50:39.472 [INF] CHAN: Adding orphan block 00000000000004ed1b60a590047ff43c0812b117951714db7a08db40fcf23669 with parent 00000000000001cf787f095dbf33ed410656f63e7a53c02031800a87dd7a71ba
2024-05-30 00:50:40.009 [INF] CHAN: Adding orphan block 000000000000020de49cc58b33c15de9da66b8c13ef671d1d735a3b7170ee1bc with parent 00000000000004ed1b60a590047ff43c0812b117951714db7a08db40fcf23669
2024-05-30 00:50:40.010 [INF] CHAN: Adding orphan block 00000000000000f2f15384bc1591418f60e64c178b79175714e45624ba7f6cd4 with parent 000000000000020de49cc58b33c15de9da66b8c13ef671d1d735a3b7170ee1bc
2024-05-30 00:50:40.115 [INF] CHAN: Adding orphan block 00000000000002d6ca204635197e7d8038eb52d94763410ed5772a5c240d0ef7 with parent 00000000000000f2f15384bc1591418f60e64c178b79175714e45624ba7f6cd4
2024-05-30 00:50:40.621 [INF] CHAN: Adding orphan block 000000000000020a59bab7875a675e111d2c466755f9462b3bec07b4a415b3a5 with parent 00000000000002d6ca204635197e7d8038eb52d94763410ed5772a5c240d0ef7
2024-05-30 00:50:41.013 [INF] CHAN: Adding orphan block 0000000000000270faa53dea43b952b62f2f776f399438827a61b51aa983f1d7 with parent 000000000000020a59bab7875a675e111d2c466755f9462b3bec07b4a415b3a5
2024-05-30 00:50:41.211 [INF] CHAN: Adding orphan block 00000000000000647dc00154375aea0ef050a3ed419d4dbc8863dc010176241a with parent 0000000000000270faa53dea43b952b62f2f776f399438827a61b51aa983f1d7
2024-05-30 00:50:41.309 [INF] CHAN: Adding orphan block 000000000000001bf46590e378c3b587e717adf8efb46ed367696d80e0e6c83c with parent 00000000000000647dc00154375aea0ef050a3ed419d4dbc8863dc010176241a
2024-05-30 00:50:41.371 [INF] CHAN: Adding orphan block 0000000000000365ec6d3bb93f5ef80e4c5a3f9d97db4a5b3abbe8c0b17672b7 with parent 000000000000001bf46590e378c3b587e717adf8efb46ed367696d80e0e6c83c
2024-05-30 00:50:41.527 [INF] CHAN: Adding orphan block 000000000000044ee5e4d652e4a142770216bb0bc4391260741e7fee85d7df8b with parent 0000000000000365ec6d3bb93f5ef80e4c5a3f9d97db4a5b3abbe8c0b17672b7
2024-05-30 00:50:41.564 [INF] CHAN: Adding orphan block 00000000000003229ad3eb7a704559a84b88937539fd4080047cbe018fb2311c with parent 000000000000044ee5e4d652e4a142770216bb0bc4391260741e7fee85d7df8b
2024-05-30 00:50:41.598 [INF] CHAN: Adding orphan block 00000000000000b83d867ee1718ac470cf2f83d59ab0f7395bc2f6319adbe5d1 with parent 00000000000003229ad3eb7a704559a84b88937539fd4080047cbe018fb2311c
2024-05-30 00:50:41.671 [INF] CHAN: Adding orphan block 0000000000000367f749f6e5103bc4d36bd33b63ef8329cd958c4f36dff51232 with parent 00000000000000b83d867ee1718ac470cf2f83d59ab0f7395bc2f6319adbe5d1
2024-05-30 00:50:41.717 [INF] CHAN: Adding orphan block 000000000000010fdb82c631466cb936b3756dd131de3a20a90ad172d9b92679 with parent 0000000000000367f749f6e5103bc4d36bd33b63ef8329cd958c4f36dff51232
2024-05-30 00:50:41.910 [INF] CHAN: Adding orphan block 00000000000002bb5ec64d6dc0d2f09c00eb3cc803137b5fa4a797d5411cb69b with parent 000000000000010fdb82c631466cb936b3756dd131de3a20a90ad172d9b92679
2024-05-30 00:50:41.916 [INF] CHAN: Adding orphan block 000000000000037c0c7188213a4425589beb1c106cf275c6438aac082962df69 with parent 00000000000002bb5ec64d6dc0d2f09c00eb3cc803137b5fa4a797d5411cb69b
2024-05-30 00:50:41.968 [INF] CHAN: Adding orphan block 000000000000029fd1dc9d9822c66425fc7b6445fa7f6561850a888f6101e21e with parent 000000000000037c0c7188213a4425589beb1c106cf275c6438aac082962df69
2024-05-30 00:50:42.016 [INF] CHAN: Adding orphan block 00000000000001dfc3065a66384e1ec9fecdefefa62f94d7051b843819561ab7 with parent 000000000000029fd1dc9d9822c66425fc7b6445fa7f6561850a888f6101e21e
2024-05-30 00:50:42.067 [INF] CHAN: Adding orphan block 000000000000046141d8eb59894c6a243c4cc49c265cbc7893922e8a06bc2186 with parent 00000000000001dfc3065a66384e1ec9fecdefefa62f94d7051b843819561ab7
2024-05-30 00:50:42.075 [INF] CHAN: Adding orphan block 00000000000000602d59e3d23315235882e8256af827710001f7757e40c11bf8 with parent 000000000000046141d8eb59894c6a243c4cc49c265cbc7893922e8a06bc2186
2024-05-30 00:50:42.392 [INF] CHAN: Adding orphan block 000000000000046912c6bcdf9a73261be70d51b123e0540259ff7b4eef13eb39 with parent 00000000000000602d59e3d23315235882e8256af827710001f7757e40c11bf8
2024-05-30 00:50:42.418 [INF] CHAN: Adding orphan block 000000000000003496362f6d2748e4690465ff52e09c255b38a1061cd9e75de9 with parent 000000000000046912c6bcdf9a73261be70d51b123e0540259ff7b4eef13eb39
^C2024-05-30 00:50:42.536 [INF] BTCD: Received signal (interrupt). Shutting down...
2024-05-30 00:50:42.536 [INF] BTCD: Gracefully shutting down the server...
2024-05-30 00:50:42.536 [WRN] SRVR: Server shutting down
2024-05-30 00:50:42.536 [WRN] RPCS: RPC server shutting down
2024-05-30 00:50:42.536 [INF] RPCS: RPC server shutdown complete
2024-05-30 00:50:42.618 [INF] CHAN: Adding orphan block 00000000000000c3487f30ede8eedaccc148181bbaff4fda97c16938c9bc3e65 with parent 000000000000003496362f6d2748e4690465ff52e09c255b38a1061cd9e75de9
2024-05-30 00:50:42.619 [INF] CHAN: Adding orphan block 000000000000001d638ba2dbe90c0ecb6043406696d4fb9cf6d10b4fb3dbe229 with parent 00000000000000c3487f30ede8eedaccc148181bbaff4fda97c16938c9bc3e65
2024-05-30 00:50:42.640 [INF] SYNC: Lost peer 89.133.2.137:8333 (outbound)
2024-05-30 00:50:42.640 [INF] SYNC: Lost peer 43.245.196.214:8333 (outbound)
2024-05-30 00:50:42.640 [INF] SYNC: Lost peer 176.37.82.83:8333 (outbound)
2024-05-30 00:50:42.640 [INF] SYNC: Lost peer 80.26.186.166:8333 (outbound)
2024-05-30 00:50:42.641 [INF] SYNC: Lost peer 50.39.244.234:8333 (outbound)
2024-05-30 00:50:42.641 [INF] SYNC: Lost peer 174.20.40.190:8333 (outbound)
2024-05-30 00:50:42.641 [INF] SYNC: Lost peer 54.202.35.84:8333 (outbound)
2024-05-30 00:50:42.641 [INF] SYNC: Sync manager shutting down
2024-05-30 00:50:42.641 [INF] SYNC: Lost peer 220.120.210.131:8333 (outbound)
2024-05-30 00:50:42.641 [WRN] SYNC: No sync peer candidates available
2024-05-30 00:50:42.641 [INF] CHAN: Flushing UTXO cache of 97 MiB with 0 entries to disk. For large sizes, this can take up to several minutes...
2024-05-30 00:50:42.641 [INF] AMGR: Address manager shutting down
2024-05-30 00:50:42.715 [INF] SRVR: Server shutdown complete
2024-05-30 00:50:42.715 [INF] BTCD: Gracefully shutting down the database...
2024-05-30 00:50:42.759 [INF] BTCD: Shutdown complete
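
(Editor's sketch, not part of the original comment.) In the log above, each orphan's parent is the orphan logged just before it, which suggests the node is receiving one continuous run of blocks whose first parent it cannot connect. The hypothetical Python helper below checks that pattern in a pasted log and reports the first missing parent. The function name and the sample are illustrative only; nothing here is a utreexod API.

```python
import re

# Matches the "Adding orphan block <hash> with parent <hash>" entries seen in the log.
ORPHAN_RE = re.compile(r"Adding orphan block ([0-9a-f]+) with parent ([0-9a-f]+)")

def orphan_chain_summary(log_text):
    """Return (orphan_count, first_missing_parent, is_single_chain)."""
    pairs = ORPHAN_RE.findall(log_text)
    if not pairs:
        return 0, None, True
    # A single chain means every orphan's parent is the orphan logged right before it.
    is_chain = all(pairs[i][1] == pairs[i - 1][0] for i in range(1, len(pairs)))
    return len(pairs), pairs[0][1], is_chain

# First two entries copied from the log in this comment.
SAMPLE = (
    "2024-05-30 00:50:39.395 [INF] CHAN: Adding orphan block "
    "00000000000001cf787f095dbf33ed410656f63e7a53c02031800a87dd7a71ba with parent "
    "00000000000003ddb4ec0ae4f276eaf32f87abc5a36d12bf45980733b03ec595\n"
    "2024-05-30 00:50:39.472 [INF] CHAN: Adding orphan block "
    "00000000000004ed1b60a590047ff43c0812b117951714db7a08db40fcf23669 with parent "
    "00000000000001cf787f095dbf33ed410656f63e7a53c02031800a87dd7a71ba\n"
)

if __name__ == "__main__":
    count, missing_parent, is_chain = orphan_chain_summary(SAMPLE)
    print(count, missing_parent, is_chain)
```

If the result is a single chain, the first missing parent is probably the block worth looking up in the earlier part of the log, since it marks where the local chain and the incoming blocks stopped lining up.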

kcalvinalvin commented 1 month ago

Hey, thanks for testing utreexod out. Would it be possible for you to post the full logs since the beginning? It'd also be helpful if you posted your full config file here. The log in your first comment only starts at the restart, and the full logs would help me debug what's actually going on.

My initial guess is that the binary is writing to the internal disk rather than the external disk, and that the internal disk is not big enough. `--datadir="path/to/dir"` is how you tell the binary to write somewhere else, and I don't see that flag in the command you ran.
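
(Editor's sketch, not part of the original comment.) One way to test this guess is to compare the filesystem and free space behind the directory the node actually writes to against the external drive. The Python helper below is a minimal illustration using only the standard library. The paths in the usage section are assumptions: the home directory stands in for wherever the default data directory lives, and `/media/usb-drive` is the mount point from the command earlier in the thread.

```python
import os
import shutil

def free_space_report(paths):
    """Map each existing path to (device_id, free_GiB) for the filesystem holding it."""
    report = {}
    for path in paths:
        if not os.path.exists(path):
            continue  # skip paths that are not present, e.g. an unmounted drive
        usage = shutil.disk_usage(path)
        # Two paths with the same st_dev live on the same filesystem.
        report[path] = (os.stat(path).st_dev, usage.free / 2**30)
    return report

if __name__ == "__main__":
    # Assumed locations; replace with the real data directory and mount point.
    candidates = [os.path.expanduser("~"), "/media/usb-drive"]
    for path, (device, free_gib) in free_space_report(candidates).items():
        print(f"{path}: device={device} free={free_gib:.1f} GiB")
```

If both paths report the same device ID, the data is not landing on a separate disk, which would be consistent with the guess above.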

HalFinneyIsMyHomeBoy commented 1 month ago

Sorry, I deleted the logs from the first time, but I have attached all the logs since I retried from scratch. It seems to be running into the same error, just in the 2012 blocks instead of the 2015 blocks. I also attached the config file.

Here is my current storage space info. I just realized that I never configured utreexod to just get the blockchain data from my full local node running on the same system. I should probably read into how to do that. The main drive has a full bitcoin core node with 84GB free, and my external USB SSD has at least 725GB free.

Screenshot from 2024-05-30 13-16-07

utreexod.log.10.gz utreexod.log.9.gz utreexod.log.8.gz utreexod.log utreexod_conf.zip

HalFinneyIsMyHomeBoy commented 1 month ago

@kcalvinalvin UPDATE

I think it might be my fault. I suspect the external USB SSD I was using was old and faulty. I am now resyncing from scratch using a new SSD and so far it's made it to block 392000 with no errors. I'll update with progress once it finishes.

HalFinneyIsMyHomeBoy commented 1 month ago

@kcalvinalvin

So after resyncing from scratch on a new SSD, everything was running fine until I closed it. Even though everything was shut down gracefully, when I tried to resume syncing minutes later it started running into the same "Adding orphan block" error. I am syncing from another local Umbrel full node of mine to speed syncing up.

See log attached (had to rename it .txt because .log files are not allowed)

utreexod.txt

kcalvinalvin commented 1 month ago

Hey! Thanks for trying it out again. Could you share the flags in the conf file you used for the last log file?

I do think it might be a bug, and it'd help me reproduce it if I had the flags used there.

HalFinneyIsMyHomeBoy commented 1 month ago

Sure thing, the only flags I had for this conf are the following. The last conf did not have "connect=192.168.1.46:8333"

datadir=/media/usb-drive/.utreexod/data
logdir=/media/usb-drive/.utreexod/logs
connect=192.168.1.46:8333

HalFinneyIsMyHomeBoy commented 3 weeks ago

Update:

Bought a used server on eBay (40 cores, 256GB of RAM, enterprise SSD, running Proxmox). Resyncing utreexod from my local node again from scratch. It's up to block 381,000 so far after 2 days.

I have shut down and restarted utreexod about 5 times so far and it is still syncing without error. Running the default config, except for "connect=192.168.1.46:8333" (no external SSD this time).

I'll let you know what happens.

kcalvinalvin commented 1 week ago

Hey. Just wondering if you had any luck this time around

HalFinneyIsMyHomeBoy commented 1 week ago

About 2 weeks ago, I had a power outage while syncing, and it gave the same "Adding orphan block" error loop.

I had to start all over again. It's been running non-stop for 2 weeks now.
Currently up to block 541,000 with 20 cores and 108GB of RAM used out of 125GB.

I'll let you know when it finishes, probably another 2 weeks at this rate. : )

kcalvinalvin commented 1 week ago

> Had to start all over again, It's been running non-stop for 2 weeks now. Currently up to block 541,000 with 20 cores and 108GB of RAM used out of 125GB.

Hmm that shouldn't be happening. I have a system with 96GB of RAM and was able to sync in less than a day without running out of memory. I could help out and see what's going on if you could share your config.

HalFinneyIsMyHomeBoy commented 1 week ago

I am running ./utreexod --flatutreexoproofindex --prune=0

Attached is the config.
It's all default options except for "connect=192.168.1.30" to a local node on my LAN.

utreexod.conf.zip

HalFinneyIsMyHomeBoy commented 1 day ago

My server ran out of disk space and I had to do an unclean reboot to expand the disk. When I ran utreexod again, I got the "Adding orphan block" error loop again.
It had almost made it to blocks from the year 2022.

(attached logs) utreexod.log.1.gz utreexod.log

cec489 commented 1 day ago

@HalFinneyIsMyHomeBoy I'm getting the same problems with syncing a bridge node. Slows to a crawl at about block 500k. (my system has 16G RAM)

I've been discussing this on Calvin's discord, if you want to join in.

Here in the utreexod channel https://discord.com/channels/1185232004506198056/1196568502203600977/125793707915805085

You might go faster if you set the max cache sizes to -1, as described in the Discord, since you have 108G of RAM.

With only 16G, I get OOM errors.