monero-project / monero

Monero: the secure, private, untraceable cryptocurrency
https://getmonero.org

Tons of connections, monerod ignoring the cap #8367

Open asheroto opened 2 years ago

asheroto commented 2 years ago

Hello,

As of the current version, 0.17.3.2, monerod is allowing far too many connections to port 18080. There were 118 connections before my server stopped responding to json_rpc requests. I have a cap set at 25, but monerod no longer complies with it. The previous version didn't have this issue; the only thing I've changed is the binary, which I replaced with the latest version.

Running on Debian 10

Thanks

selsta commented 2 years ago

What cap did you set at 25?

Also are these connections on port 18080 (P2P) or 18081 (RPC)? Is your port 18081 open?

asheroto commented 2 years ago

out-peers=25
in-peers=25

18080 is the port with a ton of connections (P2P)

I have port 18089 open for RPC
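
For anyone wanting to compare the node's own peer counts against these caps, here is a minimal sketch. It assumes the daemon's `get_info` RPC response carries `incoming_connections_count` and `outgoing_connections_count`, and that RPC is reachable over plain HTTP on the port mentioned above; the function names and the URL are illustrative, not part of monerod.

```python
import json
import urllib.request

# Caps taken from the config quoted above.
IN_PEERS_CAP = 25
OUT_PEERS_CAP = 25


def over_cap(info, in_cap=IN_PEERS_CAP, out_cap=OUT_PEERS_CAP):
    """Return the directions ('in', 'out') whose peer count exceeds its cap."""
    exceeded = []
    # Field names are assumed from the daemon's get_info response.
    if info.get("incoming_connections_count", 0) > in_cap:
        exceeded.append("in")
    if info.get("outgoing_connections_count", 0) > out_cap:
        exceeded.append("out")
    return exceeded


def fetch_info(url="http://127.0.0.1:18089/get_info", timeout=10):
    """Fetch get_info from a local node (URL is an assumption; adjust port and scheme)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `over_cap(fetch_info())` on the server would show whether the daemon itself reports more peers than configured, as opposed to the operating system holding extra sockets on port 18080.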

selsta commented 2 years ago

From what I can tell there is nothing relevant to your issue changed between v0.17.3.0 and v0.17.3.2. Maybe you simply didn't look for it with the old version?

Regarding json rpc being unresponsive, that is a separate issue that is related to SSL.

You can either set --rpc-ssl to disabled or you can compile with #7760

asheroto commented 2 years ago

I use UptimeRobot.com to monitor my server. It's basically an HTTP monitor that checks for connectivity every 5 minutes. What's weird is that I never received any "server down" / unresponsive events until upgrading to the latest version. Before this version my server had been running for around 3 months on the previous version without any events. Nowadays I'm getting an alert every few days. When I log on and check what's happening, there are over a hundred connections to port 18080, whereas before it never seemed to peak above the in/out-peers setting, but maybe that's different?

Anyway, the unresponsive issue seems to be correlated to the number of connections, as if the server can't handle all of them. If I restart the service everything is fine, and if there are even 50 connections it's fine, but it seems when it gets higher it can cause an unresponsive event.

Any ideas?

selsta commented 2 years ago

What exactly is UptimeRobot.com checking? RPC availability?

asheroto commented 2 years ago

Basically. It's just a free service that can check to see if a website (web server) is down, or ping checks, or port checks.

https://uptimerobot.com/website-monitoring/

To clarify, it can monitor HTTP/websites/web servers, however, I actually have this monitor set up to check port 18080 for connectivity every 5 minutes. So when the alert occurs, port 18080 is inaccessible. At that time, if I try to connect an XMR client (like the XMR GUI) to the node, the connection fails - port 18080 appears to be inaccessible. If I restart the monerod service, all is well again. The issue seems to occur when there are too many connections.
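
A port check of this kind can be approximated locally with a plain TCP connect. This is only a sketch of the idea, not how UptimeRobot works internally; `port_is_open` is a hypothetical helper name and the timeout is an arbitrary choice.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection performs the TCP handshake; no P2P data is exchanged.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `port_is_open("127.0.0.1", 18080)` on the server itself when an alert fires would help tell a stuck daemon apart from a network problem between the monitor and the node.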

I reviewed the alerts just now. I've had the monitor set up since Feb 2022. There were some alerts in the past few months, but only every few weeks. I upgraded to the latest version of monerod about two weeks ago, and since then, I've had a connectivity issue/alert about every 3-5 days. Upon logging in, I'll find 100+ connections to port 18080.

UptimeRobot.com: image

I changed the in-peers and out-peers to 24 earlier, and it's obeying at the moment: image

But earlier, it was up at 50+

asheroto commented 2 years ago

Oddly enough it's behaving the last few days, capping at 24 as expected. What's weird is that literally the only thing changed on the entire server was the in/out peers from 25 to 24, which I can't imagine would be the reason for its behavior now.... kinda odd.

Wondering if there's some way in which the cap is ignored because of a bug?

selsta commented 2 years ago

It's difficult to tell. I think there are different issues at play here. RPC being unresponsive is related to a bug fixed in #7760, as I wrote in my previous comment. An alternative to this patch is to run monerod with --rpc-ssl disabled.

The issue with more connections than set in the limit, I'm not sure. Maybe some connections didn't close correctly for some reason.
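
One way to test that guess is to group the sockets on the P2P port by TCP state. The sketch below parses text in the column layout printed by `ss -tan` (State, Recv-Q, Send-Q, local address, peer address); `tally_states` is a hypothetical helper, and the layout is an assumption that may need adjusting for other tools such as netstat.

```python
from collections import Counter


def tally_states(ss_output, local_port=18080):
    """Count TCP states for sockets whose local port matches local_port."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expect: State Recv-Q Send-Q Local:Port Peer:Port; skip header and short lines.
        if len(fields) < 5 or fields[0] == "State":
            continue
        local = fields[3]
        # rpartition also handles IPv6 forms such as [::1]:18080.
        _, _, port = local.rpartition(":")
        if port == str(local_port):
            counts[fields[0]] += 1
    return counts
```

Only sockets with local port 18080 are counted, so outgoing peers (which use port 18080 on the remote side) are excluded. A large number of entries in a state such as CLOSE-WAIT would be consistent with connections that were not closed correctly.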

asheroto commented 2 years ago

It sounds like that disables the SSL on RPC, right?

selsta commented 2 years ago

Yes, because the current SSL code for RPC has bugs.