dashpay / dash

Dash - Reinventing Cryptocurrency
https://www.dash.org
MIT License
mnw flood - high CPU load #578

Closed schinzelh closed 9 years ago

schinzelh commented 9 years ago

Hi,

this problem is getting worse the more masternodes are on v12.

I did a fresh startup; it took the node 6 minutes of 100% CPU load to parse through the complete masternode list:

2015-08-27 08:37:27 Dash version v0.12.0.49-5410e0a (2015-08-25 13:39:04 -0700)

[....]

2015-08-27 08:43:40 CMasternodeSync::GetNextAsset - Sync has finished

At the time of startup there were 2638 masternodes in the list

~$ dash-cli masternode count
2638

debug.log contains 9922 occurrences of

CMasternodePaymentWinner::IsValid - Masternode not in the top 10 and mnw - invalid message - Masternode not in the top 10 each.

If old v11 nodes are the cause of this, this is a DoS attack vector. In either case: This should not happen :)
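
A count like the one above can be reproduced with a short script. This is only a sketch: the two substrings are copied from the log messages quoted in this thread, while the helper names `count_rejections` and `count_in_file` are made up for illustration.

```python
from collections import Counter

# Substrings copied from the debug.log messages quoted in this thread.
PATTERNS = (
    "CMasternodePaymentWinner::IsValid - Masternode not in the top 10",
    "mnw - invalid message - Masternode not in the top 10",
)


def count_rejections(lines):
    """Count how many of the given lines contain each pattern."""
    counts = Counter()
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts


def count_in_file(path):
    """Hypothetical helper: apply count_rejections to a debug.log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_rejections(handle)


# Small in-memory sample so the sketch runs without a real debug.log.
sample = [
    "2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (12)",
    "2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (12)",
    "2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (11)",
    "2015-08-27 08:43:40 CMasternodeSync::GetNextAsset - Sync has finished",
]
sample_counts = count_rejections(sample)
for pattern in PATTERNS:
    print(sample_counts[pattern], pattern)
```

Calling `count_in_file` with the path to a node's own debug.log would give the per-message totals for that node.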

schinzelh commented 9 years ago

Additionally, it seems that this is exactly what most users report as "stuck synchronisation" (Win) or "spinning ball" (Mac): they are impatient and do not wait the 6 minutes at 100% CPU for the sync to complete...

poiuty commented 9 years ago

https://github.com/dashpay/dash/issues/560

schinzelh commented 9 years ago

@UdjinM6 can you please elaborate how this

actively syncing masternode list and masternode winners list with all its peers during startup.

is working? Are you checking each of the 2638 masternode entries of e.g. 4 peers (=10552 entries) against the winners list? At least that would explain the mnw message flood in debug.log, but I don't get the rationale :)

poiuty commented 9 years ago

wait ~20min

# cat /home/dash/data/149.202.239.228/debug.log | tail -n 20
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (12)
2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (12)
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Unknown Masternode 1d60acb44f14c475301dc5346bf6eea700b2ea1f87f11a99822e2c7190809b9e-0
2015-08-27 15:06:32 mnw - invalid message - Unknown Masternode 1d60acb44f14c475301dc5346bf6eea700b2ea1f87f11a99822e2c7190809b9e-0
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (11)
2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (11)
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (12)
2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (12)
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (11)
2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (11)
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (11)
2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (11)
2015-08-27 15:06:32 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (11)
2015-08-27 15:06:32 mnw - invalid message - Masternode not in the top 10 (11)
2015-08-27 15:06:33 CMasternodePaymentWinner::IsValid - Masternode not in the top 10 (11)
2015-08-27 15:06:33 mnw - invalid message - Masternode not in the top 10 (11)
2015-08-27 15:06:33 CMasternodePaymentWinner::IsValid - Unknown Masternode 4f658e4ff0167c0e52b07ee9d501afb9bfd3e8e4b24bb12821c0c069f8ff1cda-0
2015-08-27 15:06:33 mnw - invalid message - Unknown Masternode 4f658e4ff0167c0e52b07ee9d501afb9bfd3e8e4b24bb12821c0c069f8ff1cda-0
2015-08-27 15:06:33 CMasternodePaymentWinner::IsValid - Unknown Masternode 07ad23ecf30a2417dd846d48853f1ce41e5afe906b49b5d25f96b340b40b675e-1
2015-08-27 15:06:33 mnw - invalid message - Unknown Masternode 07ad23ecf30a2417dd846d48853f1ce41e5afe906b49b5d25f96b340b40b675e-1

UdjinM6 commented 9 years ago

Right now it works something like this: the wallet tries to get the MN list and waits until no new MNs arrive within a predefined timeout, so that it syncs as many MNs as possible and avoids too many mnw rejections. Once no more new MNs arrive, the wallet switches to syncing the MN winners list. After the switch it requests the winners list from all its peers, and the same timeout condition applies. The MN list actually syncs pretty fast, but the winners list is something like (number of known MNs) * 2 * 10 * (number of peers) messages, and it's huge (20+Mb of data imo). That many messages cause high CPU/disk usage on load and can also cause DDoS-like issues such as timeouts for block downloads. There is no real way of skipping this step, however, besides having a cache (which we turned off for now because of weird issues it had) or storing winners in the blockchain, for example. I tried to minimize this effect in #573, but that's only a half solution too.
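
To get a feel for the size of that estimate, here is a minimal sketch that just evaluates the formula quoted above. The function name is hypothetical, and the peer count of 4 is only the example figure used earlier in this thread, not a measured value.

```python
def estimated_winner_messages(known_mns: int, peers: int) -> int:
    """Evaluate (known MNs) * 2 * 10 * (peers), as given in the comment above."""
    history_factor = 2   # factor of 2 from the quoted estimate
    per_entry_factor = 10  # factor of 10 from the quoted estimate, taken as-is
    return known_mns * history_factor * per_entry_factor * peers


# 2638 masternodes is the count reported at startup in this issue;
# 4 peers is an assumed example value.
total = estimated_winner_messages(known_mns=2638, peers=4)
print(total)
```

With these inputs the formula gives 211040 messages, and the number grows linearly with both the masternode count and the number of connected peers.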

schinzelh commented 9 years ago

Does the wallet make (number of known MNs) * 2 * 10 * (number of peers) individual requests? How about requesting 500 entries at a time (like blockchain data and/or dseg)? Of course, we need a working banning mechanism for this...
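
The batching idea can be sketched as follows. This is not how the wallet works; it only illustrates splitting a list of entries into requests of 500, with a hypothetical `batches` helper and placeholder integers standing in for real entries.

```python
def batches(items, size=500):
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder entries: one integer per masternode reported in this issue.
entries = list(range(2638))
chunks = list(batches(entries, size=500))
print(len(chunks), [len(chunk) for chunk in chunks])
```

For 2638 entries this yields six requests: five full batches of 500 and a final batch of 138.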

UdjinM6 commented 9 years ago

Not quite that way. The wallet asks "hey, give me all winners for (number of known MNs) * 2 blocks back in history". The peer responds with an inv for every record, and if the wallet hasn't received such an inv yet, it will then ask for the full record. If the wallet already has this record, nothing happens.

Agreed about the banning mechanism, but I would say we'd better find a better way of syncing first; otherwise we'll be banning legit users for nothing.
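
The inv exchange described above can be sketched as a small model. It is a simplified illustration, not the actual Dash implementation: the class `WinnerSync`, its method names, and the string hashes are all hypothetical.

```python
class WinnerSync:
    """Toy model of inv-based deduplication across several peers."""

    def __init__(self):
        self.seen_invs = set()  # hashes already announced by some peer
        self.records = {}       # full records received so far

    def on_inv(self, record_hash):
        """Return True if the full record should be requested from the peer."""
        if record_hash in self.seen_invs:
            return False  # already announced: nothing happens
        self.seen_invs.add(record_hash)
        return True

    def on_record(self, record_hash, record):
        """Store a full record delivered in response to a request."""
        self.records[record_hash] = record


# Two peers announce overlapping sets of winner records.
peer_a = ["h1", "h2"]
peer_b = ["h2", "h3"]

sync = WinnerSync()
requested = []
for announcements in (peer_a, peer_b):
    for record_hash in announcements:
        if sync.on_inv(record_hash):
            requested.append(record_hash)
            sync.on_record(record_hash, {"hash": record_hash})

print(requested)
```

Even with deduplication, every peer still sends an inv for every record it holds, so the number of inv messages keeps scaling with the number of peers; only the full-record requests are avoided.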

eduffield222 commented 9 years ago

How about this?

https://github.com/dashpay/dash/commit/5cc8c79c7c43f0b7c32b917a45587c8a8f90dc31

UdjinM6 commented 9 years ago

:+1:

schinzelh commented 9 years ago

I'll give 12.1.0 a testflight :)

schinzelh commented 9 years ago

Seems related

well restart doesn't help, looks like i have to wait ~10min to go past that phase, strange

No it's not normal, it should be finished at 100%. Try close client and run it once again.

is it normal for v12 win wallet to stay always in THIS synch phase? [image]

https://bitcointalk.org/index.php?topic=421615.msg12267764#msg12267764

schinzelh commented 9 years ago

Tested v0.12.1.0-5cc8c79

2015-08-28 18:47:16 Dash version v0.12.1.0-5cc8c79 (2015-08-28 08:53:08 -0700)
[...]
2015-08-28 18:49:54 CMasternodeSync::GetNextAsset - Sync has finished

Sync finished in 150 secs, CPU load is more or less flat during this time. Well done.
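
For comparing runs, the elapsed time can be computed from the first and last log lines. The sketch below assumes the timestamp occupies the first 19 characters of each line, as in the excerpts quoted in this thread; `elapsed_seconds` is a hypothetical helper name.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start_line: str, end_line: str) -> int:
    """Seconds between the leading timestamps of two debug.log lines."""
    start = datetime.strptime(start_line[:19], TIMESTAMP_FORMAT)
    end = datetime.strptime(end_line[:19], TIMESTAMP_FORMAT)
    return int((end - start).total_seconds())


# Log lines quoted in this issue for the two test runs.
old_run = elapsed_seconds(
    "2015-08-27 08:37:27 Dash version v0.12.0.49-5410e0a (2015-08-25 13:39:04 -0700)",
    "2015-08-27 08:43:40 CMasternodeSync::GetNextAsset - Sync has finished",
)
new_run = elapsed_seconds(
    "2015-08-28 18:47:16 Dash version v0.12.1.0-5cc8c79 (2015-08-28 08:53:08 -0700)",
    "2015-08-28 18:49:54 CMasternodeSync::GetNextAsset - Sync has finished",
)
print(old_run, new_run)
```

Measured from the version banner, the earlier run took 373 seconds and this one 158 seconds, which is in line with the roughly 6 minutes and 150 seconds quoted in the thread.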

schinzelh commented 9 years ago

Tested v0.12.0.50-b1d28c9 on Windows@home VDSL2 line, sync time 3.5 mins