helium / miner

Miner for the helium blockchain
Apache License 2.0

failed to dial challenger : not_found #1407

Closed wolasss closed 2 years ago

wolasss commented 2 years ago

Recently, I have again noticed an increased number of "not_found" errors while connecting to challengers:

https://pastebin.com/daDTBu9P

This log is from one miner, but I can provide many more examples if needed; recently this happens much more often than before.

DimitrisSar commented 2 years ago

more data:

General Witnesses Overview
----------------------------------
Total witnesses                   =   904 (37.53/hour)
Succesfully delivered             =   684 (75.66%)
Failed                            =   220 (24.34%)
  ├── Max retry    =  219 (24.23%)
  └── Crash/reboot =    1  (0.11%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =   183 (20.24%)
Not Found                         =    26  (2.88%)
Other challenger issues           =     9    (1%)

Challengers
----------------------------------
Not Relayed                       =   454 (50.22%)
Relayed                           =   259 (28.65%)
Unknown (Probably Not Relayed)    =   191 (21.13%)

uptime: 1 day 13 mins
same config as before (300k/100/50)
peer book -c: 182,895

RX packets 9940183  bytes 2460916324 (2.2 GiB)
TX packets 24921364  bytes 33830618446 (31.5 GiB)

continuing on https://github.com/helium/miner/issues/1407#issuecomment-1064428447

Ay0hCrypto commented 2 years ago

@DimitrisSar @Bfindlay We're seeing pretty similar results: less success with lower intervals, but also fewer not_found errors.

Total Witnessed: = 65
|-- Sending: = 65
|-- Times Retried: = 171
Successful: = 53 (81%) (down from 90% at 600000 interval)
Resent: = 70
Max Retry: = 12 (18%)
Challenger Issues:
|-- Challenger Not Found: = 91 (4%) (down from 7% at 600000 interval)
|-- Challenger Timed Out: = 69 (down)
|-- Challenger Refused Connection: = 10 (up)
|-- Challenger Unreachable: = 1
|-- Challenger No Listening Address: = 0

20.5 hours, SenseCAP M1 4 GB, 80/20/300000, DOCSIS 3.1, 175 down / 15 up, Ethernet. 30% of successful witness receipts on chain. (I'm going to try to track back through the data and follow the amount going on chain, though I doubt there will be any correlation; I just wanted to track it, since people have claimed only 40% of witnessing doesn't make it on chain, whereas I saw >90% not going on chain prior to these changes.)
We stopped the testing on this at 21 hours instead of 24, as not even 30/30 successful beacons in 3 hours (far above what we see) was going to change the math. It ended with 81% success, 4% not_found. Now testing 60/20 at 600,000. (We ran 12/4 on a different miner for about 5 hours and saw some amazing numbers with far less peer traffic, and equal success to the 80/20, 75/25, and 85/35 at 600,000 tests.)

DimitrisSar commented 2 years ago

@Ay0hCrypto In my case, with 1,112 beacons heard and 833 (74.91%) successfully delivered, I got rewarded for only 43 (3.867%) witness reports (I assume that this is what you mean by "on-chain").

Time period: 31 hours. Peer book size: 197,363

I am at the outskirts of a large capital city in EU where there is a lot of "competition" for witness reports :)

I will swap from 300k/100/50 to 600k/100/50 to see if I get better results with the overall "Successfully delivered" %.

"Not Found" has been reduced to 3.51% of the total (39/1,112), but I believe that the total number of "Failed" has gone up (279/1,112). This could also be RNG (many timeouts that cannot be prevented).

I am on an FTTH 1Gbps symmetric link (and Controllino hotspot is linked with Ethernet at 1000 FD )
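For readers skimming the thread: the X/Y/Z shorthand used here (e.g. 300k/100/50) corresponds to three overrides in the miner's sys.config/docker.config, per the snippet DimitrisSar posts later in this thread. A minimal sketch of what 600k/100/50 would look like (the surrounding file structure is assumed):

```erlang
%% Sketch only: the N/I/O shorthand in this thread maps to these keys,
%% so 600k/100/50 means:
{peerbook_update_interval, 600000},   %% ms between peerbook refreshes
{max_inbound_connections, 100},
{outbound_gossip_connections, 50}
```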

adrianformunda commented 2 years ago

Hello

That means that you have only 5.16% on chain. That is at the lower limit. Currently I have around 7.5%.

(Assuming "on chain" means the lucky ones among the successfully transmitted.)

Cheers


Ay0hCrypto commented 2 years ago

80/20/300000, 24 hours. Right after I said 30% were getting on chain, it dropped drastically; it ended with only 18.5% making it on chain.

Total Witnessed: = 77
|-- Sending: = 77
|-- Times Retried: = 206
Successful: = 63 (81%)
Resent: = 85
Max Retry: = 14 (18%)

Challenger Issues:
|-- Challenger Not Found: = 108
|-- Challenger Timed Out: = 87
|-- Challenger Refused Connection: = 10
|-- Challenger Unreachable: = 1
|-- Challenger No Listening Address: = 0

Total Peer Activity: = 497
|-- Timeouts: = 150 (3.0%)
|-- Proxy Session Timeouts: = 10 (.2%)
|-- Relay Session Timeouts: = 79 (1.5%)
|-- Normal Exit: = 83 (1.6%)
|-- Not Found: = 180 (3.6%)
|-- Server Down: = 23 (.4%)

Ay0hCrypto commented 2 years ago

@DimitrisSar Yes, by on chain that is what I meant, and I agree: so far I usually dismiss any change under 3% as variance in network conditions/RNG/luck/etc. 1/14 (7.1%) has been my worst since altering the sys.config variables; 1/20 (5%) on chain was my worst prior to altering, so nothing solid there. Should also note: we've been testing two SenseCAPs, one with 4 GB (the results I've posted) and the other with 8 GB RAM. The results have been within 1.2% of each other, so I've only posted my logs, since I'm in the denser area (which is obviously still less densely populated with miners than your areas, or the areas you witness). He sees fewer than 100 on high-volume days, where I usually see around 150 in the logs; but since the 28/2-3/1 updates took a lot of miners offline, I've seen just under 100 at the 24-hour mark. That's also why we bumped the test time up from 4-8 hours to 22-24 hours: trying to pull percentages when the counts were below 100 gave too much weight per failure.

DimitrisSar commented 2 years ago

Changed my settings (increased refresh rate, lowered inbound and gossip connections) from 300k/100/50 to 180k/50/20 and here are the stats for the last 10 hours:

General Witnesses Overview
----------------------------------
Total witnesses                   =   508 (51.43/hour)
Succesfully delivered             =   362 (71.26%)
Failed                            =   146 (28.74%)
  ├── Max retry    =  144 (28.35%)
  └── Crash/reboot =    2  (0.39%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =   104 (20.47%)
Not Found                         =    39  (7.68%)
Other challenger issues           =     1   (0.2%)

Challengers
----------------------------------
Not Relayed                       =   233 (45.87%)
Relayed                           =   142 (27.95%)
Unknown (Probably Not Relayed)    =   133 (26.18%)

Errors due to "max_retry" -> "not_found" = 7.68% of total. Only 71.26% successfully delivered. The majority of failures are due to "Timeout"; nothing we can do here to improve that.

Only 14 witness reports were rewarded (on-chain). If on-chain = rewarded / total x 100, that is 2.756%; if on-chain = rewarded / delivered x 100, that is 3.867%. (Horrible with both possible interpretations :)

eth0: RX packets 2262262 bytes 1085688083 (1.0 GiB) TX packets 4453171 bytes 5558207615 (5.1 GiB)

docker.config: {peerbook_update_interval, 180000}, {max_inbound_connections, 50}, {outbound_gossip_connections, 20}

peer book -c: 235,015

I will let it run a full 24 hours and then try 300k/200/50

@Ay0hCrypto

mikejobo commented 2 years ago

I have a Panther X2 with this issue. When I try to pull up the container, it says there is no such container with the name "miner". How do I go about figuring out what the container name is?

I want to edit my values and see if i get better results for max inbound and outbound connections.
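Not answered in the thread, but a standard way to find the container name (assuming Docker is what runs the miner on the Panther X2, which is my assumption):

```shell
# List running containers with their names and images;
# the miner container's name varies by manufacturer image.
docker ps --format '{{.Names}}\t{{.Image}}'
```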

Ay0hCrypto commented 2 years ago

It'll take me a minute to get my logs; will update once I can grab them. Reports from Helium Inc say seed nodes are back to normal. Higher p2p activity on lower numbers seems to correlate; absorption times also went down 80-90%, and the logs are filling up much faster. Two days ago the log would reset at 00:00 UTC; yesterday it filled itself and reset at about 21:00 UTC, today at 20:30 UTC. So we should now see if this has any effect beyond taking up the seed-node slack. 90% was the success rate the last time I ran the analyzer, at 89/99.

Having a problem generating the report. Short details: witnessed: 102, sent: 102, successful: 92, max retry: 10. Same settings, still running. miner peer book -c gives 256639, which is totally different from what we originally were tracking, which was peer connections.

Current results, 60/20/600000:

Total Witnessed: = 13
|-- Sending: = 13
|-- Times Retried: = 23
Successful: = 12 (92%)
Resent: = 15
Max Retry: = 1 (7%)
Other (Witness Failures): = -3 (-23%)

Challenger Issues:
|-- Challenger Not Found: = 8
|-- Challenger Timed Out: = 15
|-- Challenger Refused Connection: = 0
|-- Challenger Unreachable: = 0
|-- Challenger No Listening Address: = 0

Total Peer Activity: = 56
|-- Timeouts: = 21 (37%)
|-- Proxy Session Timeouts: = 0 (0%)
|-- Relay Session Timeouts: = 11 (19%)
|-- Normal Exit: = 13 (23%)
|-- Not Found: = 16 (28%)
|-- Server Down: = 0 (0%)

miner peer book -c 255741. Some improvements, which should be attributed to the "fixed" seed nodes.

mikejobo commented 2 years ago

Hello. PantherX2, 100/75/900000, peer book size 167000 (24h after making the changes), success rate 82%.

On Wed, Mar 9, 2022, Ay0hCrypto wrote:

85/35: >90% success, <10% various fails (tested 4 hours; longer testing may alter results)
75/25: 88% success, 12% various fails (10+ hours)
85/15: 73% success, 6% max-retry failure, 21% various fails (12 hours)

Things we noticed:
#1 Successful witnesses achieved after more than 1 failure had 25% the chance of making it on chain compared to those with 1 or 0 failures before being successfully sent to the challenger.
#2 Higher numbers saw more success until we got proxy errors, but that didn't always correlate to higher earnings or more witnesses appearing on block.
#3 Even when higher numbers did produce higher gains, it created a wave pattern in activity, where slightly lower numbers showed more consistent results.
#4 We tried lower (15/5, 30/10) and higher (95/35, 150/50); best results so far were found between 75-90 inbound / 25-35 outbound.
#5 The longer we tested any variable, the less successful it became, except 75/25, which bounced between 72-76% success rate.
#6 Two years ago the sys.config file looked like: https://user-images.githubusercontent.com/98350820/157497789-a4fc52d0-37f5-4c3c-b776-5a63a2337796.png

Could you help me out with setting this up?

DimitrisSar commented 2 years ago

It'll take me a minute to get my logs,

Continuing from https://github.com/helium/miner/issues/1407#issuecomment-1065825583. New data set based on: 25 hours, 180k/50/20, peer book -c: 210,820, eth0: TX 14.0 GiB / RX 2.4 GiB.

General Witnesses Overview
----------------------------------
Total witnesses                   =  1308 (52.54/hour)
Succesfully delivered             =   942 (72.02%)
Failed                            =   366 (27.98%)
  ├── Max retry    =  363 (27.75%)
  └── Crash/reboot =    3  (0.23%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =   255  (19.5%)
Not Found                         =   101  (7.72%)
Other challenger issues           =     7  (0.54%)

Challengers
----------------------------------
Not Relayed                       =   624 (47.71%)
Relayed                           =   336 (25.69%)
Unknown (Probably Not Relayed)    =   348 (26.61%)

The new setting (180k/50/20) gives worse results compared to the previous trials (300k/100/50): Not_Found has increased from 2.88% to 7.72% of total, and Failed has increased from 24.34% to 27.98% (multiple timeouts, of course).

I will go back to "turbo" 180k/200/50 for the next 24 hours to get a new set of results to compare. I am sure that my TX volumes will increase enormously :)

@Ay0hCrypto

edit: only 41 rewarded witness reports (a joke) 🥇

Ay0hCrypto commented 2 years ago

Since the seed nodes overcame their issues (temporarily), we've seen 100% success rates, single re-sends before success, and about 50/50 timed_out to not_found.

Total Witnessed: = 13
|-- Sending: = 13
|-- Times Retried: = 31
Successful: = 11 (84%)
Resent: = 13
Max Retry: = 2 (15%)
Other (Witness Failures): = -2 (-15%)

Challenger Issues:
|-- Challenger Not Found: = 18
|-- Challenger Timed Out: = 13

60/20/60000, peer book -c 256042

DimitrisSar commented 2 years ago

First 5h33m of 180k/200/50:

General Witnesses Overview
----------------------------------
Total witnesses                   =   324 (58.66/hour)
Succesfully delivered             =   235 (72.53%)
Failed                            =    89 (27.47%)
  ├── Max retry    =   84 (25.93%)
  └── Crash/reboot =    5  (1.54%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =    66 (20.37%)
Not Found                         =    17  (5.25%)
Other challenger issues           =     1  (0.31%)

Challengers
----------------------------------
Not Relayed                       =   153 (47.22%)
Relayed                           =    97 (29.94%)
Unknown (Probably Not Relayed)    =    74 (22.84%)

eth0: TX 10.1 GiB, RX 619.9 MiB; peer book -c: 195,319

some example entries:

Session Relay Status Fails Reason
0.4557.0 no successfully sent 1  
0.6157.0 no successfully sent 2  
0.7353.0 no successfully sent 2  
0.7642.0 no successfully sent 1  
0.7770.0 no successfully sent 1  
0.7795.0 no successfully sent 1  
0.7454.0 no failed max retry 10 not found
0.8189.0   successfully sent 0  
0.8137.0 yes successfully sent 2  
0.8363.0   successfully sent 0  
0.8398.0 no successfully sent 1  
0.8278.0 yes successfully sent 4  
0.8553.0 no successfully sent 1  
0.7773.0 yes failed max retry 10 timeout
0.8123.0 yes failed max retry 10 timeout
0.8164.0 yes failed max retry 10 timeout
0.9486.0   successfully sent 0  
0.9717.0   successfully sent 0  
0.9834.0 no successfully sent 1  
0.9416.0 yes failed max retry 10 timeout
0.9820.0 no failed max retry 10 connection refused

Of course, the connection refused comes from this 0.003 Reward Scale challenger :) https://explorer.helium.com/hotspots/11tnX4sG9VRt5MecTHHtzTH7AKMqDWfV7WDkdvcGYmVuRWBKWkZ lol

adrianformunda commented 2 years ago

Hello. 24h with 85/35/600K. Peer book size 187K. On chain: 53 (5.6%).

General Witnesses Overview
----------------------------------
Total witnesses                   =  1364
Succesfully delivered             =   952 (69.79%)
Failed                            =   412 (30.21%)
  ├── Max retry    =  407 (29.84%)
  └── Crash/reboot =    5  (0.37%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =   245 (17.96%)
Not Found                         =   153 (11.22%)
Other challenger issues           =     8  (0.59%)

Challengers
----------------------------------
Not Relayed                       =   982 (71.99%)
Relayed                           =   246 (18.04%)
Unknown Relay Status              =   136  (9.97%)

adrianformunda commented 2 years ago

And 24h with 100/75/900K, peerbook size 133772:

General Witnesses Overview
----------------------------------
Total witnesses                   =  1374
Succesfully delivered             =   903 (65.72%)
Failed                            =   471 (34.28%)
  ├── Max retry    =  471 (34.28%)
  └── Crash/reboot =    0    (0%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =   410 (29.84%)
Not Found                         =    54  (3.93%)
Other challenger issues           =     6  (0.44%)

Challengers
----------------------------------
Not Relayed                       =   946 (68.85%)
Relayed                           =   380 (27.66%)
Unknown Relay Status              =    48  (3.49%)

DimitrisSar commented 2 years ago

I give up on this P2P crap. I will stay on something "balanced" like {peerbook_update_interval, 600000}, {max_inbound_connections, 50}, {outbound_gossip_connections, 10}

and leave it at that.

they better implement HIP55 soon... this is horrible

Ay0hCrypto commented 2 years ago

I give up on this P2P crap. I will stay on something "balanced" like {peerbook_update_interval, 600000}, {max_inbound_connections, 50}, {outbound_gossip_connections, 10} they better implement HIP55 soon... this is horrible

We found no real difference between 60/20 all the way up to 85/35. Even 3-5x lower, 6/2 still gives a peer book in the 150K+ range, with far less filling the logs, less peer activity, etc. However, we did find that the 60/20-80/35 range worked better when the network was having more problems, since we maintained earnings and activity while others were seeing almost no activity at all. We're testing 24/8/600000 and 30/10/600000, and will probably stay in the 15/5-30/10 range.

JohnnyMirza commented 2 years ago

I've been running 100/40/90000 for about 24 hours and have seen errors drop, but mining rewards have also dropped. Not sure if these settings are related, and I have yet to figure out what the log entries are when the miner is rewarded.

/opt/miner # miner peer book -c 142970

current date: Sun Mar 13 21:49:11 UTC 2022

/opt/miner # /root/processlogs.php -p /var/data/log/

Using logs in folder /var/data/log/

General Witnesses Overview
----------------------------------
Total witnesses                   =  1323 (18.96/hour)
Succesfully delivered             =  1142 (86.32%)
Failed                            =   181 (13.68%)
  ├── Max retry    =  180 (13.61%)
  └── Crash/reboot =    1  (0.08%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =   120  (9.07%)
Not Found                         =    49   (3.7%)
Other challenger issues           =    11  (0.83%)

Challengers
----------------------------------
Not Relayed                       =   755 (57.07%)
Relayed                           =   342 (25.85%)
Unknown (Probably Not Relayed)    =   226 (17.08%)
/opt/miner # /root/processlogs.php -s 2022-03-11 -e 2022-03-12 -p /var/data/log/

Using logs in folder /var/data/log/

General Witnesses Overview
----------------------------------
Total witnesses                   =   460 (19.18/hour)
Succesfully delivered             =   393 (85.43%)
Failed                            =    67 (14.57%)
  ├── Max retry    =   66 (14.35%)
  └── Crash/reboot =    1  (0.22%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =    41  (8.91%)
Not Found                         =    19  (4.13%)
Other challenger issues           =     6   (1.3%)

Challengers
----------------------------------
Not Relayed                       =   254 (55.22%)
Relayed                           =   130 (28.26%)
Unknown (Probably Not Relayed)    =    76 (16.52%)

adrianformunda commented 2 years ago

Hello

Currently I see a drastic decrease in the successful witnesses that get on chain. It's like 2-3%.

Is anyone else having the same situation?

Ay0hCrypto commented 2 years ago

So I finally went a little further and started adjusting the other numbers within the config file. The only thing that seemed to have any real effect was upping consensus group members to the actual size of 43. I ran that with 45/43, 40/40, and 49/39 with 600000-900000: fewer retries, more success on the first try, approximately the same not_found errors, and a slightly smaller peerbook hanging around 198000-209000.

DimitrisSar commented 2 years ago

So I finally went a little further and started adjusting the other numbers within the config file. The only thing that seemed to have any real effect was upping consensus group members to the actual size of 43. I ran that with 45/43, 40/40, and 49/39 with 600000-900000: fewer retries, more success on the first try, approximately the same not_found errors, and a slightly smaller peerbook hanging around 198000-209000.

My understanding is that a gossip peer can be any hotspot (not necessarily only the "master" seed nodes = 40, or 43?).

I may be wrong here.

defaults: https://github.com/helium/miner/blob/master/config/sys.config

Ay0hCrypto commented 2 years ago

{num_consensus_members, 43},

This is what I altered this time. It appears as follows in the logs: {max_inbound_connections,20},{port,0},{num_consensus_members,43}

Devome61 commented 2 years ago

So I finally went a little further and started adjusting the other numbers within the config file. The only thing that seemed to have any real effect was upping consensus group members to the actual size of 43. I ran that with 45/43, 40/40, and 49/39 with 600000-900000: fewer retries, more success on the first try, approximately the same not_found errors, and a slightly smaller peerbook hanging around 198000-209000.

Thank you mate for this info. So we just need to change the inbound, outbound, and consensus members numbers, right? Like below?

Screen Shot 2022-03-19 at 18 41 38
Ay0hCrypto commented 2 years ago

@Devome61 Yes. IDK why my peerbook was so low that day, though. Lowered inbound and outbound to 35. Peer Book Size: = 264163; my peerbook has stayed above 250,000 for 2 days now.

Devome61 commented 2 years ago

Mine is around 195k now with 49/39; will wait one more day. Btw, I guess around 200k is enough. I'm sharing my stats between 19.03.22 18:30 local and 20.03.22 20:15 local (around 24 hours) with consensus 43, inbound 49, outbound 39. I'll wait one more 24 hours to see if the failed % will decrease. My best was 25% failed with 75/25.

Screen Shot 2022-03-20 at 20 12 34


Ay0hCrypto commented 2 years ago

Total PoC: = 4
|-- Sending: = 4
|-- Times Retried: = 11
Successful: = 3 (75%)
Unreachable: = 2 (20%)
Max Retry: = 1 (25%)
Peer Book Size: = 259026

Only logged for about an hour, as the p2p volume is filling the logs (1 MB) in about 16 hours.

fraggy2k commented 2 years ago

After around 7 days with 80/20/600k my peerbook rose to 385k.

Success rate is 85-88%, and only 2.7% not found.

Devome61 commented 2 years ago

After around 7 days with 80/20/600k my peerbook rose to 385k.

Success rate is 85-88%, and only 2.7% not found.

Wow, really nice, I can try this :) What about HDD storage and CPU usage? 385k is really high :)

Lasslabambele commented 2 years ago

@fraggy2k do you have the same for (num_consensus_members, 43) or do you have the normal value of 16 there?

fraggy2k commented 2 years ago

@fraggy2k do you have the same for (num_consensus_members, 43) or do you have the normal value of 16 there?

I haven't touched other variables, only peerbook_update_interval, max_inbound_connections, and outbound_gossip_connections.

Today my 375k peerbook takes 357M of space; CPU load is about 10%.

Devome61 commented 2 years ago

49/29, consensus members: 43. From 19.03.22 18:30 local to 21.03.22 22:00 local. At 24 hours it was 39.75% failed; in 51 hours, 34.21%. Peerbook: 181k; around +24 hours it was 200k.

Screen Shot 2022-03-21 at 22 09 15
florian-asche commented 2 years ago

I found this searching for config files:

In our Tuesday post, we spoke about an issue Nebra miners were experiencing whereby they were filling up their storage and causing inconsistent behaviour. Our development team rolled out an update, and that seems to have corrected the issue.

Due to our proactive monitoring of our fleet, we have noticed a drop in earnings that was specific to our fleet. In our investigation we noticed witness receipts not being reported to the blockchain due to p2p connection issues. We identified the seed nodes as being the issue. The recently added seed nodes hadn't been configured in our custom sys.config, causing some peers to be unknown to our miners. We have pushed an urgent fix to production and expect to see immediate improvements in the next few days.

Maybe we should look into that...

Lasslabambele commented 2 years ago

@florian-asche

thanks for the info.

Short understanding question: what is meant by "our fleet"? Does the fix also apply to Panther X etc., or only a certain brand?

florian-asche commented 2 years ago

@florian-asche

thanks for the info.

Short understanding question: what is meant by "our fleet"? Does the fix also apply to Panther X etc., or only a certain brand?

Only for Nebra. But I had the idea that it could also apply to us if we do the same. However, I didn't find any changes regarding the seeds.

e-estrange commented 2 years ago

Huge amount of failed witnesses (not found). However, the peer book size is pretty big: 280290. I didn't change the values in the system config file. Anyone else seen an increase in fails with their setup in the past, let's say, week or so?

ghost commented 2 years ago

Yeah, barely any getting through in the last 48 hours (20% successful for me). I'm just giving up and waiting till May 3, when this problem is all behind us. Good to know it's not just me though.

marcijel1 commented 2 years ago

I'm using Inigo Flores' settings. The fail rate went up from about 10% to 25-30%; both not-found and timeout failures are getting worse. Peerbook size is 255,000. I'm in an area with about 60-70 reachable hotspots.

e-estrange commented 2 years ago

I ran 600k/60/20 for 24h now and the values improved a lot.

It also looks like the percentage of timeout errors went up and the not-found errors went down (percentage-wise).

mabizad commented 2 years ago

I've been testing higher values for those two sys.config parameters, and I confirm similar results, to the point where I have got completely rid of the not_found errors in all my miners.

$ ./processlogs.php -s 2022-02-09

Using logs in folder /home/pi/hnt/miner/log/

General Witnesses Overview
----------------------------------
Total witnesses                   =    77
Succesfully delivered             =    71 (92.21%)
Failed                            =     6  (7.79%)
  ├── Max retry    =    5  (6.49%)
  └── Crash/reboot =    1   (1.3%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =     4    (80%)
Not Found                         =     0     (0%)
Other challenger issues           =     1    (20%)

Challengers
----------------------------------
Not Relayed                       =    74  (96.1%)
Relayed                           =     2   (2.6%)
Unknown Relay Status              =     1   (1.3%)

Peer book size is huge:

$ sudo docker exec miner miner peer book -c
234053

I wonder if it can have a negative impact on the network if everyone implements these changes.

How to change them for sensecap. Any ssh commands?

mabizad commented 2 years ago

Hi guys, just configured my sensecap and below actual witness statistics with default values, I will let you know:

{max_inbound_connections, 100},
{outbound_gossip_connections, 25},

General Witnesses Overview
----------------------------------
Total witnesses                   =   105
Succesfully delivered             =    89 (84.76%)
Failed                            =    16 (15.24%)
  ├── Max retry    =   16 (15.24%)
  └── Crash/reboot =    0    (0%)

Max Retry Failure Reasons
----------------------------------
Timeout                           =     7 (43.75%)
Not Found                         =     9 (56.25%)
Other challenger issues           =     0    (0%)

I have a question: analyzing the miner logs with processlogs.php, I see several operations that I can't fully understand. Here is an example, the last two rows in the -l output. The first is a witness rewarded because the challenger did a "challenged beaconer" activity; the last is a response not to a challenged beacon but to a "Constructed Challenge". What does this kind of action mean?

2022-02-12 12:30:31.256 |   0.8687.0 | -112 | 867.1 |   1.5 | 11rtLkhrw1NrPtvmbWEt6BWUEjNLVmvjB4hojC8dDfjfhJUCGds  | no    | successfully sent |     1 |
2022-02-12 13:34:38.698 |  0.15039.0 | -115 | 867.3 |  -2.5 | 11b9km9H4aVTS5S1ZPFKRzvcb13x3Dru5LoEPeHnVRgoLhT4PHc  | no    | successfully sent |     1 |

How did you do it? Does it have a positive impact?

LDarnton commented 2 years ago

How to change them for sensecap. Any ssh commands?

1) https://www.youtube.com/watch?v=HaalTIOCxG0

2) https://www.youtube.com/watch?v=j50eIzf1Lgg

mabizad commented 2 years ago


Thank you, will try them.

Ay0hCrypto commented 2 years ago

Helium adjusted the seed nodes and peer connections to 6/6. Only 3 manufacturers announced issuing the update, and it won't matter after light hotspots. Adjusting the RocksDB variables further down the sys.config and using private seed nodes worked better during p2p network stress, but it's all irrelevant in 10 days.

DimitrisSar commented 2 years ago

Adjusting the Rocks DB variables further down the sys.config.

DB variables for the hotspot or for your seed node? Can you share the settings please? :)

Ay0hCrypto commented 2 years ago

RocksDB edits:

{rocksdb, [
  {global_opts, [
    {max_open_files, 1024},
    {compaction_style, universal},
    {memtable_memory_budget, 134217728},  % 128MB
    {arena_block_size, 1048576},          % 1MB
    {write_buffer_size, 1048576},         % 1MB
    {db_write_buffer_size, 67108864},     % 64MB
    {max_write_buffer_number, 40},
    {keep_log_file_num, 5},
    {max_log_file_size, 1048576},         %% keep log files 1mb or less
    {log_file_time_to_roll, 86400}        %% rotate logs once a day
  ]}
]}

wolasss commented 2 years ago

For all of those who modified sys.config on your SenseCAPs: I noticed that with the light hotspots update, sys.config did not get updated along with it. How did you cope with that?

masterconqueror commented 2 years ago

For all of those who modified sys.config on your SenseCAPs: I noticed that with the light hotspots update, sys.config did not get updated along with it. How did you cope with that?

My two devices and my friends' devices had the same issue, so I re-flashed the SD cards.

meowshka commented 2 years ago

@wolasss I'm sorry it took a while to get back to you. Please reach out to the manufacturer of your hotspot to reset it to its default OEM settings, so you can continue receiving latest firmware updates.

wolasss commented 2 years ago

@wolasss I'm sorry it took a while to get back to you. Please reach out to the manufacturer of your hotspot to reset it to its default OEM settings, so you can continue receiving latest firmware updates.

For anyone else wondering, there is a way to do it without flashing an SD card if you don't have physical access to the device.

  1. Check which version of the miner is currently running (ps auxf | grep /opt/miner/releases)
  2. Go to https://github.com/helium/miner/releases/tag/[YOUR VERSION TAG] and download that release
  3. Replace your modified sys.config with the original (config/sys.config) from that particular release
  4. Reboot the miner
  5. Miner should automatically update to the latest release

At least this worked for me
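The steps above can be sketched as a shell session. This is a sketch only: the version tag and on-device destination path are placeholders you must substitute, and fetching the file via raw.githubusercontent.com is my assumption rather than the exact download method described above.

```shell
# 1. Find the running miner release version
ps auxf | grep /opt/miner/releases

# 2. Download the stock config for that release
#    (substitute [YOUR VERSION TAG] with the version found above)
wget https://raw.githubusercontent.com/helium/miner/[YOUR VERSION TAG]/config/sys.config

# 3. Replace the modified sys.config with the original
#    (destination path is illustrative; use the path seen in step 1)
cp sys.config /opt/miner/releases/[VERSION]/sys.config

# 4. Reboot; the miner should then auto-update to the latest release
reboot
```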