gridcoin-community / Gridcoin-Research

Wallet does not sync #246

Closed HeyMerlin closed 7 years ago

HeyMerlin commented 7 years ago

Other messages sprinkled in the log:

03/23/17 23:27:46 .B..B..B..B..B..B..B. Received block [snip block numbers]; ERROR: ConnectInputs() : d307fb9aefb8c8cd2f444bba7972d24de2572f0ecab38aa95f0c856897ed2418 prev tx already used at (nFile=1, nBlockPos=279371346, nTxPos=279372033)
03/23/17 23:28:12 ERROR: AcceptToMemoryPool : Unable to Connect Inputs d307fb9aefb8c8cd2f444bba7972d24de2572f0ecab38aa95f0c856897ed2418

On startup I get 10 of the following: "03/23/17 20:36:46 No neural network nodes online." after sending out bpk messages, and also an "ERROR: Bad CPID" error with no CPID showing in the wallet. Resetting the CPIDs seems to fix this.

I'm running the latest wallet version: Gridcoin version v3.5.8.6-g-research (12/1/2014) on Windows 10.

I have talked to a few others in #gridcoin that are having the same issue. Let me know what other debug information I can post.

HeyMerlin commented 7 years ago

This may be the same as issue #245, but I did not see the research paid too much error.

grctest commented 7 years ago

My windows client was/is stuck at 851005 with similar complaints about a bad CPID. It's reindexing currently.

gridcoinresearchd getblockbynumber 851005


Other windows client is stuck on 850845:

03/24/17 00:21:47 ERROR: ProcessBlock() : CheckBlock FAILED
03/24/17 00:21:47 Received block 73c63c113d021720b97720747c938f7dacd91885f473a2ed023fbbe8e6c9b912; Received block 4d3095b118a33efee3e57444ee9705adf59166088dec96e740dad19ccbe9607f; Received block 021c6a8ffef78037bfd508edcca8c522b68ff27ea0a96c5d271ae830577aa1ab; Received block f0f02d5c41e813b84c17495e055648dab9fbb6e49a75a33797659783a970736c; ERROR: Bad CPID : height 850845.000000, CPID 38c81a3752ba69614cd7f241bfb2d2d9, cpidv2

debug file: https://mega.nz/#!EdoywShZ!gap54lx3VC-vrbHASLC7n0r1pRIefMl-lxk4zirmU3Q

debug2 file: https://mega.nz/#!xMAwiSDB!eAwzmX09HyXd5A4ezX220G_odDIhrmxsTLDoi6qH-ak

Getblockbynumber 850845:

{
"hash" : "ab81c60c40beccb9a9a79c5886a4e43c8285cf5cf343ea04d0cd26c830a34d18",
"confirmations" : 1,
"size" : 773,
"height" : 850845,
"version" : 7,
"merkleroot" : "40c3469eaf178f0db4621966d5d6ff30061b5ddc828d2cf2144ed0c9b827c930",
"mint" : 6.85196963,
"time" : 1490225552,
"nonce" : 279365,
"bits" : "1c15b12d",
"difficulty" : 11.80135522,
"blocktrust" : "bcd316ad4",
"chaintrust" : "73e0f0cf86557b2bb40f58d5b9d39693c2c893c4fe568da6970e90610941d60",
"previousblockhash" : "9489b15ef33da8228295a1534aa731a192c54eac8c50746cfa532e50915b674e",
"flags" : "proof-of-stake stake-modifier ",
"proofhash" : "000208fea8361f37c43f39aef04b74811b98272a1a4d76969cb52dd3123171ad",
"entropybit" : 0,
"modifier" : "d48c10792d7aa972",
"modifierchecksum" : "68a45fed",
"tx" : [
"b2d149b7c0b6f48fc581b8ee70b84b705a5746186ca19431a022c91a1e2b4908",
"fa34b6d4da718d43cd814fcf6f7746d88c6fa9d9e541082a1d6c9e5dbb4e2476"
],
"signature" : "3045022100e1fda77120c426f89aa7e8d44baac94e2087f085a58c53c84a3dc02b7b1cda04022071aab336877bff6e0cf1e7a611d3f6683aaac087f99385626c8efe891f899d66",
"CPID" : "INVESTOR",
"Magnitude" : 0.00000000,
"BoincHash" : "INVESTOR<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>0.00<|>1490223027<|>0<|>b59c71815e3d8190ddf63e71da57ea52ca7080c87b7b77d7ce78806f787ed0d7<|>0<|>SAc91156A2bF6fGH1C1SUhfL6TEEVfAjSn<|>9489b15ef33da8228295a1534aa731a192c54eac8c50746cfa532e50915b674e<|>6.85<|><|><|><|><|>0.00<|>0.000000<|>0.000000<|>0.00<|>0<|>ee385f09744989d6b78f1fff90e97020<|><|>",
"LastPaymentTime" : "03-22-2017 22:50:27",
"ResearchSubsidy" : 0.00000000,
"ResearchAge" : 0.00000000,
"ResearchMagnitudeUnit" : 0.00000000,
"ResearchAverageMagnitude" : 0.00000000,
"LastPORBlockHash" : "0",
"Interest" : 6.85000000,
"GRCAddress" : "SAc91156A2bF6fGH1C1SUhfL6TEEVfAjSn",
"ClientVersion" : "v3.5.8.6-g-research",
"CPIDv2" : "b59c71815e3d8190ddf63e71da57ea52",
"CPIDValid" : true,
"NeuralHash" : "",
"IsSuperBlock" : 0,
"IsContract" : 0
}
HeyMerlin commented 7 years ago

Hmm, I should have also mentioned that I did try the reindex command, and that did not seem to have an effect either. The only recovery that I have heard of so far is to bootstrap or download blocks (same thing, I believe).

iFoggz commented 7 years ago

I see lots of CheckBlock failures on my investor wallet; then I tend to go out of sync, and it recovers a long time later, like 30 or more minutes.

ghost commented 7 years ago

Are there any dev nodes running on the live network that could be causing this? I noticed there is a pull request for CPID changes and some of the errors seem to be about having an invalid CPID.

iFoggz commented 7 years ago

grctest, I run a Linux wallet. I read through the link you referenced, and yes, I voted for the 2nd proposal myself. I'm about to go from pool mining to solo mining, as I bought just under 20000 GRC; I'm just going to wait until this is sorted out before I proceed.

grctest commented 7 years ago

My two full nodes are not in sync, stuck with this bad cpid error. My two windows clients fell out of sync and have begun reindexing. One linux client was on the latest block, here's the debug showing getting past the bad blocks: https://mega.nz/#!hB5CWSTJ!3eztVZuHKy8SFCya0GO2Un6J36TNwHKqaBskF3gynkE

denravonska commented 7 years ago

I got a bad CPID error from 0390450eff5f5cd6d7a7d95a6d898d8d in block 851315. It looks like I do not have that CPID's beacon key in my application cache (I think, still trying to print the cache map in the debugger), so the CheckMessageSignature call fails.

This is the received block: http://pastebin.com/GtBuquA4

Edit: I just checked and it seems like that, and only that, CPID does not have a beacon key in my cache:

"beacon;0390450eff5f5cd6d7a7d95a6d898d8d" : ""

The empty string is inserted in the call to IsCPIDValidv2.
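
For anyone wondering how a plain lookup can insert an entry: below is a minimal Python analogue, assuming the application cache behaves like a C++ std::map of strings, where indexing a missing key creates an empty value. The cache name, helper functions and placeholder key are illustrative only, not the wallet's code.

```python
from collections import defaultdict

# Stand-in for the application cache. defaultdict(str) mimics the C++
# std::map operator[] behaviour of creating an empty value on a miss.
app_cache = defaultdict(str)
app_cache["beacon;known_cpid"] = "PUBKEY_PLACEHOLDER"  # hypothetical entry


def get_beacon(cache, cpid):
    """Index-style read: silently inserts "" when the key is missing."""
    return cache["beacon;" + cpid]


def has_beacon(cache, cpid):
    """Side-effect-free check: only a non-empty value counts as a beacon."""
    return bool(cache.get("beacon;" + cpid))


missing_cpid = "0390450eff5f5cd6d7a7d95a6d898d8d"
key = "beacon;" + missing_cpid

present_before = key in app_cache        # the key is not there yet
value = get_beacon(app_cache, missing_cpid)
present_after = key in app_cache         # the read itself created the key
```

So an empty "beacon;..." entry in the dump only shows that something looked the key up, not that a beacon was ever loaded for it.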

Edit: Worth mentioning is that I've seen this "Bad CPID" many, many times before. Long before the dev branch took off.

ghost commented 7 years ago

After running from the latest snapshot, here is a list of the bad CPIDs from my log:

Bad CPID : height 798253.000000, CPID e28c601da191540ebdac625c2cb87b85
Bad CPID : height 815048.000000, CPID f6e9b9c140a2363fd623bfe0048f2425
Bad CPID : height 815049.000000, CPID f6e9b9c140a2363fd623bfe0048f2425 
Bad CPID : height 816449.000000, CPID f6e9b9c140a2363fd623bfe0048f2425
Bad CPID : height 818112.000000, CPID f6e9b9c140a2363fd623bfe0048f2425
Bad CPID : height 819449.000000, CPID f6e9b9c140a2363fd623bfe0048f2425
Bad CPID : height 820252.000000, CPID f6e9b9c140a2363fd623bfe0048f2425
Bad CPID : height 820916.000000, CPID 8ebb0af3ad4ece5ca746ef7d629323fd
Bad CPID : height 821388.000000, CPID f6e9b9c140a2363fd623bfe0048f2425
Bad CPID : height 822388.000000, CPID f6e9b9c140a2363fd623bfe0048f2425

All CPIDs are in beacon report.

philipswift commented 7 years ago

I ran 'execute beaconstatus' and it failed, so I ran 'execute advertisebeacon' and got 'success', then ran 'execute beaconstatus' again and it still failed. I got this after I moved to pool mining at grcpool.com, but it is probably coincidence. IRC chat was about wrong CPIDs when going to the pool, and most people left and went back to solo mining. This is not a slight on grcpool.com at all. The work done there looks amazing. Wallet syncs up to 20th March at 19:22 GMT. Maybe Linux should be stated under best practice as the more stable of the OSes :)

denravonska commented 7 years ago

My main Linux wallet shows the same behavior :(

philipswift commented 7 years ago

I seem to remember someone saying that testnet work was showing up in BOINC stats? Maybe try to backtrack where its source is, if indeed that is the case.

gridcoin commented 7 years ago

I'm not much help right now. I'm traveling on business, unable to respond to the emails sent by CM to the inbox, laptop malfunctioning, conf call in 15 mins, having some IT problems.

I just tried to sync GRC on my laptop and it stopped at block 851658.

Will see if I can be any help after the call.

grctest commented 7 years ago

My full nodes (amsterdam/frankfurt) are now fully in sync w/ the explorers (used snapshots). One of my windows nodes reindexed overnight and got past the stuck block, I've not checked my other node yet.

Quezacoatl1 commented 7 years ago

I got my Win node synced by using the downloadblocks feature.

denravonska commented 7 years ago

Probably two different issues, but are we sure that the auto renewal of beacons works, if there is one? If I check my block chain, the newest beacon I have for 0390450eff5f5cd6d7a7d95a6d898d8d is from 18 Sep last year (1474230935, block 670665), almost exactly 6 months ago. Does that mean that it gets ignored when loading, since LoadAdminMessages only checks 6 months back?

philipswift commented 7 years ago

All good here, timestamp reporting correct, still syncing

denravonska commented 7 years ago

@philipswift Just don't restart :)

philipswift commented 7 years ago

Too late! I didn't restart and the timestamp is done for :( I did rebuild block chain and it came back up bad

 [ { "Command" : "beaconstatus" }, { "CPID" : "cca8548e6f66693686316f593c8702e2", "Beacon Exists" : "No", "Beacon Timestamp" : "1-1-1970 00:00:00", "Public Key" : "", "Private Key" :..., "Local Configuration Public Key" : "04322ad753a15c42d76b0eda4c75f1419cb431355bfba1db5881f49718c6b6d6663d7049b19d9255176b8e429addc06674b4463137afc76fa1d26d4cd94c71ac59", "Magnitude (As of last superblock)" : 0.00000000, "Warning" : "Your magnitude is 0 as of the last superblock: this may keep you from staking POR blocks.", "Errors" : "Public Key Missing. ", "Help" : "Note: If your beacon is missing its public key, or is not in the chain, you may try: execute advertisebeacon.", "Configuration Status" : "FAIL" } ]

CMOS battery failing somewhere, I reckon. Skews all time and date stamps.

Quezacoatl1 commented 7 years ago

@philipswift you should remove your private key from any post

denravonska commented 7 years ago

@philipswift Did it for you.

gridcoin commented 7 years ago

@denravonska , Yes correct, LoadAdminMessages only loads the latest 6 months (to save memory). So old beacons drop off naturally.

gridcoin commented 7 years ago

I'm downloading blocks to see if I can get past the bad CPID. If so, I'll create a new snapshot after I sync. Checking...

denravonska commented 7 years ago

@gridcoin It won't discard any beacons once they're loaded, so they keep getting accepted until a restart, right? I'm thinking a snapshot won't solve anything if the beacons are still old and new ones aren't being sent out.

Edit: If that's the case, won't new beacons get ignored until restart? https://github.com/gridcoin/Gridcoin-Research/blob/master/src/main.cpp#L9865

philipswift commented 7 years ago

Unix time is linked to '00:00 1970' https://en.wikipedia.org/wiki/Unix_time

Found this https://cryptocointalk.com/topic/40141-testnet-research-age/page-41 I think it may be a Dev issue :P - rollback?

philipswift commented 7 years ago

@Quezacoatl1 Did I publish it? I didn't know. My public key is up there. Thank you anyway :)

Quezacoatl1 commented 7 years ago

My Win 10 node just crashed with

getblocks 851032 to a01450b999f8f0c8dbd6 limit 1000
03/24/17 15:02:06

  getblocks stopping at 851398 a01450b999f8f0c8dbd6
03/24/17 15:02:06

getblocks 851032 to b618965e974806dc8dc7 limit 1000
03/24/17 15:02:06

  getblocks stopping at 851401 b618965e974806dc8dc7
03/24/17 15:02:08

Signing Block for cpid 1a7c8f7bbe5a14b0c1e205af3bd2d3d5 and blockhash 72092509353562cd0d04cf1129f4a29fd618469c5953ed5efe854ee5495f03fa with sig MEUCIQC1zq+ApVPXEQ6HVWWwdk18P0P0Y0hB7dEPASB6g5rWngIgGFP4mZA30C26kCw6ZQxeNq2n5nVUhlgFwpqu5+i66zk=

And now I am stuck again at 852337. Maybe this has to do with this issue as my own CPID (1a7c8f7bbe5a14b0c1e205af3bd2d3d5) will drop off the chain soon? I have sent the last beacon on September 19.

As we are talking about beacons anyway (a little off-topic)... what happens if a beacon with a magnitude drops off the chain and is re-activated a few months later? Will the research age count the time between dropping off and re-activation, or will this time be ignored? I was wondering why this guy in #108 received this huge payment after re-activation of his beacon, so I decided to test my theory that the time in between is counted into research age. Shouldn't the research age be reset once a beacon drops off the list?

@philipswift: @denravonska deleted it for you from your post.

philipswift commented 7 years ago

@denravonska Re: "Did it for you". Last block time: Sun 16. Nov 16:42:30 2014. execute beaconstatus returns '"Errors" : "Public Key Missing. ",'. Is the fact that the public key is missing just information, or an actual symptom?

grctest commented 7 years ago

My windows node was working, but got stuck again:

03/24/17 17:08:22  Received block ee58d8e606373cb1f284a9d159ddffaa051f78d356f4c916ce7c1aefdcc9675a; ERROR: Bad CPID : height 852352.000000, CPID 8ed4ce08bfd7cb8a436eef5fc3be322f, cpidv2 8ed4ce08bfd7cb8a436eef5fc3be322fc869383a323b6437386cc9356769c8963f953e3d6c3f38c667423c6ac63f436966682f777073756264417674622f6f6675, LBH 3c715bdf7ac4a412d2f9a1c08753be8314b5751c886b8c76fe669bca618b5d26, Bad Hashboinc 8ed4ce08bfd7cb8a436eef5fc3be322f<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>13.45<|>1490371360<|>0<|>8ed4ce08bfd7cb8a436eef5fc3be322fc869383a323b6437386cc9356769c8963f953e3d6c3f38c667423c6ac63f436966682f777073756264417674622f6f6675<|>1280<|>S9DUUt5pELCuijJXM4fXYSiMTZqdqfX2gg<|>3c715bdf7ac4a412d2f9a1c08753be8314b5751c886b8c76fe669bca618b5d26<|>0.00<|><|><|><|><|>13.45<|>0.042037<|>0.250000<|>1280.00<|>c0c9aa8e9f400143b96fe9a45f52cef5f02bf0f163ab7c767bd9a4363b03e2f4<|>b100360835cd4dbc94004c430c11a6e0<|>045aedf3fe8f9a07b16b0ff44f55d33734034bb653b878342454765ab55848c40c18ae68f69118093735f21847d4f99105ee884311ef36918b7b5718e23a3d3225<|>MEUCIQDdJCBnWejgSXEPepK2Kv1hV1emuf51vEmmHZ/cb9FO1gIgUrV1qiaX2MWboG8wa4v/jZ4P6T1nc2VOGlOhmlxD6FE=
03/24/17 17:08:22 ERROR: ProcessBlock() : CheckBlock FAILED
03/24/17 17:08:25  Received block 21681a5aacdc36e0e893c9359568cd0d4d6e3bed79a7447c67b3b6049595b70e; ERROR: Bad CPID : height 852352.000000, CPID 0b5ef259411ec18e8dac2be0b732fd23, cpidv2 0b5ef259411ec18e8dac2be0b732fd233b6e663f96363994c569cb3967c66f3238973f39c9346cc83ac2363798389436747562737465767475416a6f6370792f6d77, LBH e4b44e23065bc38588b7cc12936791f771fab54818bb9ba7a92dea8b84ac41f8, Bad Hashboinc 0b5ef259411ec18e8dac2be0b732fd23<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>8.52<|>1490367280<|>0<|>0b5ef259411ec18e8dac2be0b732fd233b6e663f96363994c569cb3967c66f3238973f39c9346cc83ac2363798389436747562737465767475416a6f6370792f6d77<|>410<|>S83FdFPoHJ9gAXeknKPL39gyHXdjuPyt3b<|>e4b44e23065bc38588b7cc12936791f771fab54818bb9ba7a92dea8b84ac41f8<|>3.04<|><|><|><|><|>8.52<|>0.083148<|>0.250000<|>410.00<|>a20f03ab61727db60bd1bb2639f3ea8d99e5209f7bc047bd0e4fb468ee04f8d6<|>7b5b5c167673d876c8dbd38cb49ddce6<|>04efe29df4c16e2420f84589ea4c9b73097335c2c091231b76286a64a22c5d37407c21cfcc37977246cfc10a5dbd101333bf3b59be21ae66e0124f991c731db749<|>MEUCIQD1QxhYWy/oLxoCImFWnLbsSVYm1nkzSUa6RutGlbHxfgIgSzE15TbYkSolqJfvhHmdX3ln0Uouoj2uDZZGma4VOI8=
03/24/17 17:08:25 ERROR: ProcessBlock() : CheckBlock FAILED

Comparison (2 bad first, compared to 2 last blocks):

Hashboinc 0b5ef259411ec18e8dac2be0b732fd23<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>8.52<|>1490367280<|>0<|>0b5ef259411ec18e8dac2be0b732fd233b6e663f96363994c569cb3967c66f3238973f39c9346cc83ac2363798389436747562737465767475416a6f6370792f6d77<|>410<|>S83FdFPoHJ9gAXeknKPL39gyHXdjuPyt3b<|>e4b44e23065bc38588b7cc12936791f771fab54818bb9ba7a92dea8b84ac41f8<|>3.04<|><|><|><|><|>8.52<|>0.083148<|>0.250000<|>410.00<|>a20f03ab61727db60bd1bb2639f3ea8d99e5209f7bc047bd0e4fb468ee04f8d6<|>7b5b5c167673d876c8dbd38cb49ddce6<|>04efe29df4c16e2420f84589ea4c9b73097335c2c091231b76286a64a22c5d37407c21cfcc37977246cfc10a5dbd101333bf3b59be21ae66e0124f991c731db749<|>MEUCIQD1QxhYWy/oLxoCImFWnLbsSVYm1nkzSUa6RutGlbHxfgIgSzE15TbYkSolqJfvhHmdX3ln0Uouoj2uDZZGma4VOI8=

8ed4ce08bfd7cb8a436eef5fc3be322f<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>13.45<|>1490371360<|>0<|>8ed4ce08bfd7cb8a436eef5fc3be322fc869383a323b6437386cc9356769c8963f953e3d6c3f38c667423c6ac63f436966682f777073756264417674622f6f6675<|>1280<|>S9DUUt5pELCuijJXM4fXYSiMTZqdqfX2gg<|>3c715bdf7ac4a412d2f9a1c08753be8314b5751c886b8c76fe669bca618b5d26<|>0.00<|><|><|><|><|>13.45<|>0.042037<|>0.250000<|>1280.00<|>c0c9aa8e9f400143b96fe9a45f52cef5f02bf0f163ab7c767bd9a4363b03e2f4<|>b100360835cd4dbc94004c430c11a6e0<|>045aedf3fe8f9a07b16b0ff44f55d33734034bb653b878342454765ab55848c40c18ae68f69118093735f21847d4f99105ee884311ef36918b7b5718e23a3d3225<|>MEUCIQDdJCBnWejgSXEPepK2Kv1hV1emuf51vEmmHZ/cb9FO1gIgUrV1qiaX2MWboG8wa4v/jZ4P6T1nc2VOGlOhmlxD6FE=

"BoincHash" : "INVESTOR<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>0.00<|>0<|>0<|>b59c71815e3d8190ddf63e71da57ea5271d4776ed476d277cc6fd8cb7bd87079<|>0<|>S3n4sadaTA3Y6Ub6HYsvn1tCugoT7ACPmr<|>6b580216c43809ad4ef7d56bd73fb69260bcc8c81caf7cdedf0f1415541011a7<|>4.83<|><|><|><|><|>0.00<|>0.000000<|>0.000000<|>0.00<|>0<|>a8100c69dd503948c4249fa51146ffdc<|><|>",

"BoincHash" : "2bebcc51ce6b307d8410ba59a9072039<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>6.31<|>1490365984<|>0<|>2bebcc51ce6b307d8410ba59a9072039386941923642633a326533976cc76535c7686d99c59535383e983c9a3637966971707170776a642f6c736a74756a626f41686e626a6d2f64706e<|>570<|>SAJ2SnZwR3bECYPNSrM4hDpU6NvGBT2BBp<|>45c4a8bf8ab640e36fd259bbd35be42f82a8dfefe939ba1aa8e3a9f54807d338<|>3.17<|><|><|><|><|>6.31<|>0.044259<|>0.250000<|>570.00<|>065804824378f9af88ea8a03ae5e94f15238352d43d9956e7043c17f79f55309<|>6f49be9c1f2682cf68085dfbbc15a29c<|>042fc5662cc5c91f441af46fe2b9a343bc26b64e5431f0fb394e2e246a3db454b7ccdc32f69a68e15bad09603c5c3af1755d47a1deaa612afda2d939d59714d1da<|>MEUCIQDDddbOsKK+xbyPkRv6TWbncqbq1eL0GiyvQnBjU2PIBgIgcoh5QH2XkMhg4a4ru1Nz40KrzMUp+uPws/+uf/YxOFE=",
sEpuLchEr commented 7 years ago

A new snapshot will not work. Somewhere after 850xxx, our public keys dropped off the chain.

ghost commented 7 years ago

Once you get in Sync don't restart the wallet.

joshuaferguson commented 7 years ago

After talking to Ravon and looking at the code the issue appears to be associated with CPIDs whose beacon is greater than 6 months old. When validating a CPID, it looks back 6 months to try to find the advertised beacon, and when it doesn't find it it's invalid. In other words:

CPID 1 has a beacon that's 6 months old or older, and stakes a block.

User 2 receives the block containing CPID 1, looks back through the past 6 months of transactions and doesn't find a valid beacon, so claims that CPID 1 is invalid, stops syncing and has to be re-snapshotted.

The snapshot works because it starts with a much older block that contains CPID 1. THAT block looks back 6 months from when IT was created, finds the beacon (because the beacon was valid when that block was created), and stores it in memory. Now that CPID 1 has been validated, it will be found valid in later blocks because we already have a "valid" beacon.

As long as the cached beacon exists in memory, it works. As soon as you restart, it finds a new block with CPID 1, looks back 6 months and doesn't find the beacon, so rejects it.

Sorry if that explanation is a bit redundant, but the summary is that it has to do with caching valid CPIDs, and it seems ANY CPID that hasn't advertised its beacon in the past 6 months could potentially break syncing for all clients.

Mind you, this is just a theory so don't consider this solved until we can validate this is the case.
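
To make the theory concrete, here is a toy model of it. This is only a sketch under the assumptions described above (beacons are cached at startup from a 6-month lookback and never evicted while running); the Node class, the month-based clock and the CPID label are all made up for illustration.

```python
LOOKBACK_MONTHS = 6  # assumption: startup scan window, simplified to whole months


class Node:
    """Toy wallet that caches beacons at startup and never evicts them."""

    def __init__(self):
        self.beacons = {}  # cpid -> month in which the beacon was advertised

    def load_from_chain(self, chain_beacons, now):
        """Startup scan: cache only beacons from the last LOOKBACK_MONTHS."""
        self.beacons = {
            cpid: month
            for cpid, month in chain_beacons.items()
            if now - month <= LOOKBACK_MONTHS
        }

    def accepts_block(self, cpid):
        """No expiry check: anything still in the cache counts as valid."""
        return cpid in self.beacons


# CPID 1 advertised a beacon in month 0 and never renewed it.
chain_beacons = {"cpid1": 0}

long_running = Node()
long_running.load_from_chain(chain_beacons, now=1)  # started in month 1

restarted = Node()
restarted.load_from_chain(chain_beacons, now=7)     # restarted in month 7

# In month 7, CPID 1 stakes a block and both nodes receive it.
accepted_by_long_running = long_running.accepts_block("cpid1")
accepted_by_restarted = restarted.accepts_block("cpid1")
```

In this model the node that has been up since month 1 accepts the block while the freshly restarted one rejects it, which would match the "don't restart" advice above.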

tomasbrod commented 7 years ago

Expiring beacons are not the main problem. The fatal flaw is that some nodes are accepting these blocks while others aren't (depending on when they were restarted). And there are more nodes that will accept this kind of invalid block, which is bad. The wallet must check that the CPID's beacon is not expired before accepting a block. Once this check is implemented, I see a major fork in the chain, because all those blocks that make your wallets halt will be rejected.

grctest commented 7 years ago

Relevant idea: https://github.com/gridcoin/Gridcoin-Research/issues/183

If users' concerns that a fix would put the blocks with 'bad' fields at risk are plausible, what about instead moving to a (provable) burn address for beacon/registration transactions? This way we would only look up a single address instead of looking back 6 months, and there would be consistency across all nodes.

iFoggz commented 7 years ago

Ah crap, I updated my system and restarted, then added my email and wanted to advertise my beacon as I'm going from pool to solo, but I'm stuck on block 852718, all because of a restart. Sigh.

tomasbrod commented 7 years ago

src/main.cpp:9801, replaced (BLOCKS_PER_DAY*30*6) with (BLOCKS_PER_DAY*30*8). This is quick and temporary fix to make your wallet sync after restart.
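
A quick back-of-the-envelope check of what this change does, using the heights quoted earlier in this thread (newest beacon for 0390450e... in block 670665, Bad CPID for it at block 851315). BLOCKS_PER_DAY = 1000 is an assumption here, since its value is not stated in this thread, and the exact boundary handling in the wallet may differ.

```python
BLOCKS_PER_DAY = 1000  # assumption: value not stated in this thread


def lookback_blocks(months):
    """Same shape as the patched expression: BLOCKS_PER_DAY*30*months."""
    return BLOCKS_PER_DAY * 30 * months


def beacon_loaded(tip_height, beacon_height, months):
    """True if the beacon's block lies inside the startup scan window."""
    return tip_height - beacon_height <= lookback_blocks(months)


BEACON_HEIGHT = 670665  # newest beacon for 0390450e... reported above
TIP_HEIGHT = 851315     # block where that CPID triggered "Bad CPID"

beacon_age = TIP_HEIGHT - BEACON_HEIGHT                       # 180650 blocks
loaded_with_6 = beacon_loaded(TIP_HEIGHT, BEACON_HEIGHT, 6)   # window 180000
loaded_with_8 = beacon_loaded(TIP_HEIGHT, BEACON_HEIGHT, 8)   # window 240000
```

Under that assumption the beacon falls out of a 6-month window once the tip passes 850665 and is picked up again with the 8-month window, so the change buys roughly two more months rather than fixing the underlying problem.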

philipswift commented 7 years ago

@tomasbrod How does one apply this? Update client, command console or conf file entry? I just ran a GRC client update and it said something about a change: the proof of ownership was decentralised. I have coins going back to 2014, maybe earlier.

startailcoon commented 7 years ago

Here are some logs related to the error about expired CPID Beacon that can't be verified.

The wallet runs fine and then it gets stuck on

03/25/17 11:36:59 Received block 43370ce413a94a44481d2ca8ba76689c40d2ed1415a228b8c1de9ac9f489787e; Last staked block found at height 483833.000000, but cannot verify magnitude older than 6 months!
03/25/17 11:36:59 [DoTallyRA_START] [DoTallyRA_END]
03/25/17 11:36:59 Last staked block found at height 483833.000000, but cannot verify magnitude older than 6 months!
03/25/17 11:36:59 ERROR: CheckBlock[ResearchAge] : Researchers Reward Pays too much : Interest 0.020000 and Research 2.500000 and StakeReward 0.000000, OUT_POR 0.000000, with Out_Interest 0.000000 for CPID 307f6656d708c429a6c4174114fc834d
03/25/17 11:36:59 ERROR: ProcessBlock() : CheckBlock FAILED

This seems to happen at random positions for everyone, though.

Related code lines: ProcessBlock() fails at main.cpp#L5113. This is due to the fact that CheckBlock[ResearchAge] fails at main.cpp#L4350.

The wallet CAN recover by itself. See attached document at the end. After 30 minutes the node recovered the latest block again and started fetching new blocks after some errors of already having the block.

See attached document: checkBlockFailed.txt

denravonska commented 7 years ago

If we drop the 6 month limit, the memory requirements for the app cache (where the beacons are stored) go up from 3.something to 4.58 megs. I think we could live with that right now, given that the block index takes well over 300 megs. If the beacons turn out to be an issue we can always optimise the storage, since they're stored in delimited ASCII strings right now.

iFoggz commented 7 years ago

After 6 hours my wallet did not recover from the block it was stuck on with the CheckBlock failure. I had to delete peers.dat, the blk file, txleveldb and database, then download the snapshot and resync that way.

dopeshitnetworks-irc-dopeshit-net commented 7 years ago

@philipswift Applying that would require downloading the wallet source code and editing it. Inside the source code archive, in the directory ~/GridcoinResearch/src, is the file main.cpp, and at line 9801 is the expression (BLOCKS_PER_DAY*30*6), which you would replace with (BLOCKS_PER_DAY*30*8). I didn't try it, but I hope that helps. It means editing the raw code before all the files are compiled and built together to make the wallet client itself.

grctest commented 7 years ago

@denravonska wrote: "If we drop the 6 month limit the memory requirements for the app cache (where the beacons are stored) goes up from 3.something to 4.58 megs."

Sounds like a good idea for just 1.58MB RAM cost. Thoughts, @gridcoin ?

I attempted to use the new snapshot supplied by @gridcoin, and experienced this same issue:

03/25/17 20:27:09 ERROR: ProcessBlock() : CheckBlock FAILED
03/25/17 20:27:17  Received block 0ae7641e21a37bb37935bf7fd8a49381316fd688fc8a184f094a078ec8103c39;  Received block 0ae7641e21a37bb37935bf7fd8a49381316fd688fc8a184f094a078ec8103c39;  Received block 0ae7641e21a37bb37935bf7fd8a49381316fd688fc8a184f094a078ec8103c39;  Received block 0ae7641e21a37bb37935bf7fd8a49381316fd688fc8a184f094a078ec8103c39;  Received block c347df33d3bdec2d6e17697d58737d3edcf266734e26daea7834c02c6b892c13; ERROR: Bad CPID : height 853189.000000, CPID 3b6dc2071547081fde8af9364f760c37, cpidv2 3b6dc2071547081fde8af9364f760c3768c8333d699697696842623b6e976c3ac93e6e686a966f943c3c3c3c94383b9864747570736c41686e792f6566, LBH e9a7de8cf0c3afeaed42ee4e6d4036cd084fdf8b06ba0a93991763c957cc5b64, Bad Hashboinc 3b6dc2071547081fde8af9364f760c37<|><|><|>0<|>0.00000<|>0<|><|><|>0<|>0<|>v3.5.8.6-g-research<|>34.04<|>1490405920<|>0<|>3b6dc2071547081fde8af9364f760c3768c8333d699697696842623b6e976c3ac93e6e686a966f943c3c3c3c94383b9864747570736c41686e792f6566<|>175<|>S32TQcGuXBoY8AMGR46RrynAJ2wojLLhiG<|>e9a7de8cf0c3afeaed42ee4e6d4036cd084fdf8b06ba0a93991763c957cc5b64<|>3.08<|><|><|><|><|>34.04<|>0.777963<|>0.250000<|>175.00<|>6f2539c9e17e135cd0429e9c81407923cf9fe7ca82841119794ddbf7c8f93048<|>6854385db933bfc36497b1c452b8341d<|>04adb6edbea8fb0de9ed2fe06d2ff3f26859718f16ba8164ee697613df0af1fe459130411084d4c44a76390c1e6b1a48fec24349786268db209d2fe3b4911c98c9<|>MEUCIQDFb+QNQsYB30FA867NG0+43/StOVvpzjcOumn0feBx1wIgZzCnmXdXICw/9b9TtHBi21SBnGfAiESbiAt88bhLxVY=
03/25/17 20:27:23 ERROR: ProcessBlock() : CheckBlock FAILED
ghost commented 7 years ago

The snapshot has to be before block 850000 to work.

philipswift commented 7 years ago

Ethereum's wallet for Windoze 64 is 85.7MB and our Windows version is 17.7 MB, so cleaning up and flowing the code should not mean that total size compromises function. If anything crops up that benefits from more, we can list it in the 'minimum spec' and caveat for that.

denravonska commented 7 years ago

I was incorrect, the RAM increase without the limit is 2.8 -> 4.58 megs.

tomasbrod commented 7 years ago

I am against lifting the 6 month limit, @denravonska. Reason: the wallet would have to (1) scan all the blocks on startup and (2) keep all past CPIDs in memory, including long inactive ones. The wallet should properly (3) check new blocks to see if their miner's beacon has expired. But that (3) would create a fork. In fact we all are on a fork. One way to avoid this fork is to (4) let blocks before block number xxx pass even if their miner's beacon is up to 8 months old. Or just (5) change the limit to 8 months. Anyway, we need a mandatory update and quick!
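
A rough sketch of what option (4) could look like, purely for illustration. The cutoff height is left as a parameter because "xxx" was never specified; BLOCKS_PER_DAY = 1000 and both function names are assumptions, not the wallet's code.

```python
BLOCKS_PER_DAY = 1000  # assumption: value not stated in this thread


def max_beacon_age_blocks(block_height, cutoff_height):
    """Blocks below the cutoff get the lenient 8-month limit, later ones 6."""
    months = 8 if block_height < cutoff_height else 6
    return BLOCKS_PER_DAY * 30 * months


def beacon_valid_for_block(block_height, beacon_height, cutoff_height):
    """Judge the beacon's age relative to the block it signs, not to 'now'."""
    age = block_height - beacon_height
    return 0 <= age <= max_beacon_age_blocks(block_height, cutoff_height)


# Hypothetical cutoff, chosen only so the example has a number to work with.
EXAMPLE_CUTOFF = 900000

old_block_ok = beacon_valid_for_block(851315, 670665, EXAMPLE_CUTOFF)
new_block_ok = beacon_valid_for_block(950000, 760000, EXAMPLE_CUTOFF)
```

Because the rule depends only on heights stored in the chain, every node would reach the same verdict for a given block regardless of when it was last restarted.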

denravonska commented 7 years ago

@tomasbrod The RAM issue shouldn't be that big of a deal but you're right about the load times. With the current approach it takes 8.5 seconds to load the beacons on my Core2 with a mechanical disk. Scanning all blocks takes 44.2 seconds.

startailcoon commented 7 years ago

Some more digging in the code, for those who are interested and want to know about the reason for this issue.

The function GetBeaconPublicKey() fails at rpcblockchain.cpp#L3821. This makes the function VerifyCPIDSignature() fail at rpcblockchain.cpp#L3833. These failures are the reason that CheckBlock() triggers an error at main.cpp#L4383 and the block is rejected.

GetBeaconPublicKey() returns nothing because sBeacon.empty() is true. The program is unable to find the beacon in sBeacon = mvApplicationCache["beacon;" + cpid]; at rpcblockchain.cpp#L3823.

The reason mvApplicationCache does not have the beacon in question is that LoadAdminMessages() only caches roughly 6 months back.

Solution 1: Commenting out main.cpp#L10280 would make the client cache ALL CPIDs since the genesis block. This is the same as running the client constantly so no CPIDs are ever forgotten. This is the reason why some clients are affected and some are not.

CONCERNS: I haven't looked deeper in the code, so I can't say what this does to the rewards structure. If this is the function for making sure that no one with a beacon older than 6 months is rewarded (for any reason), we are disabling this function. This is however not a good way to make sure this is enforced if a single client can decide this.

Solution 2: Making the client gracefully deny forgotten CPIDs instead of making it fail the block. If a CPID is old enough to not be included (6 months now) for any reason the client should gracefully deny/accept the block.

UPDATE: My theory of why the recent block rejections have been happening is that nodes never forget CPIDs that they have in their cache, but they never load beacons older than 6 months. This results in some of the unexpected issues.

Scenario:

When Node1 receives a block with a CPID whose last beacon was sent 9 months ago, it will accept it because the beacon is still in its cache. It will then send this block to Node2, which doesn't have the CPID in its cache because the beacon is older than 6 months and was never loaded. Node2 will then reject the block and get stuck. All nodes that have the CPID in cache will accept the block, but any node in the same position as Node2 will not.

What we need is a way to drop CPIDs from the cache when they are too old, and all clients need to do the same. We need to rebuild the CPID cache every single block, or store the beacon's block ID/timestamp alongside it in the cache, dropping entries as we go.
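
A minimal sketch of the second variant (timestamp stored alongside the beacon), assuming a 6-month limit of 180 days and measuring age against the time of the block being checked rather than the wall clock. The BeaconCache class and its methods are hypothetical; the two timestamps are the ones quoted earlier in this thread (beacon 1474230935, block 850845 at 1490225552).

```python
MAX_BEACON_AGE = 6 * 30 * 24 * 3600  # assumption: six 30-day months, in seconds


class BeaconCache:
    """Toy cache that remembers when each beacon was advertised."""

    def __init__(self):
        self._beacons = {}  # cpid -> (public_key, beacon_timestamp)

    def add(self, cpid, public_key, timestamp):
        self._beacons[cpid] = (public_key, timestamp)

    def prune(self, block_time):
        """Drop beacons that are too old relative to the given block time."""
        cutoff = block_time - MAX_BEACON_AGE
        self._beacons = {
            cpid: entry for cpid, entry in self._beacons.items()
            if entry[1] >= cutoff
        }

    def public_key(self, cpid, block_time):
        """Return the key only if the beacon is fresh as of block_time."""
        entry = self._beacons.get(cpid)
        if entry is None or block_time - entry[1] > MAX_BEACON_AGE:
            return None
        return entry[0]


cache = BeaconCache()
cache.add("0390450eff5f5cd6d7a7d95a6d898d8d", "PUBKEY_PLACEHOLDER", 1474230935)

# Same CPID checked against an early block and against block 850845's time.
key_early = cache.public_key("0390450eff5f5cd6d7a7d95a6d898d8d", 1474230935 + 1000)
key_late = cache.public_key("0390450eff5f5cd6d7a7d95a6d898d8d", 1490225552)
```

Since the verdict depends only on the two timestamps, a node that has been running for a year and one that just restarted would agree on whether the beacon counts, which is the consistency the current cache lacks.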

tomasbrod commented 7 years ago

@startailcoon You said basically what I did, but you expanded the text and explained it better (+1). Implementing your last paragraph (dropping old CPIDs) is not hard to program. Just add a timestamp to the CPID and a simple condition to CheckBlock or ProcessBlock. That would prevent similar invalid blocks from ever entering the chain. But the real problem is, they are already in the current chain! Any nodes loading from 0 will get stuck and fork.

startailcoon commented 7 years ago

@tomasbrod Yeah, the comment mainly summarizes most of the things found during these past days; it just grew with findings, conclusions and some explanations. Credit to everyone for the work done. Rob is on the case and will probably update us as soon as he is able.