games647 / FastLogin

Checks if a minecraft player has a valid paid account. If so, they can skip offline authentication automatically. (premium auto login)
https://www.spigotmc.org/resources/fastlogin.14153
MIT License

Connection blocking due to too many requests #347

Closed Manu5656 closed 2 years ago

Manu5656 commented 4 years ago

Hi, I have a big problem with FastLogin. My server is currently receiving numerous bot attacks. When the bots try to join, FastLogin starts sending requests to Mojang (as far as I understand) and to the SQL database, which lags and blocks the server. The server is no longer reachable and no user can join. The numerous requests to the SQL database have even locked up the whole server machine.

Is there a way to solve this problem? In case it is not clear, I use FastLogin through BungeeCord (it is also installed on the Spigot servers, of course).

Manu5656 commented 4 years ago

In addition, the console is spammed with these messages, which relate to the problem mentioned: https://hastebin.com/idopemaquh.md

games647 commented 4 years ago

HikariDataSource (FastLogin) has been closed.

This means that the plugin shut down. Is it possible to search for the first error?

Manu5656 commented 4 years ago

It seems to me that the first errors are these: https://hastebin.com/iroyebiqas.cs

games647 commented 4 years ago

Do you also see the first error about Hikari?

Manu5656 commented 4 years ago

What's this Hikari?

games647 commented 4 years ago

It's a database connection pooling library. It was closed. The reason for it should be in the server log.

Manu5656 commented 4 years ago

I searched for that word and found nothing about it. I was trying to paste the entire log file on hastebin but being very large it is taking a long time to save it

games647 commented 4 years ago

It should, because the first log you posted contains it.

Manu5656 commented 4 years ago

Where can I send you the log file? Do you have Discord, so you can check it there?

Manu5656 commented 4 years ago

Let me know

Manu5656 commented 4 years ago

Here is the first mistake, the one you asked me for. I managed to find it: https://hastebin.com/kutakazuwa.pl

I hope you know how to solve it, because right now the server is blocked again. The bot attack is causing FastLogin to send too many requests to the database, and as a result everything lags.

Manu5656 commented 4 years ago

https://hastebin.com/hixifenema.xml

This is what happens now when I try to restart BungeeCord. It leaves me unable to connect to the server machine for more than 10 minutes.

Manu5656 commented 4 years ago

https://hastebin.com/nayiyuyele.pl

Can you solve this problem? It blocks all the other databases and the machine itself, and it happened again today. It really causes a lot of problems for me. Thanks in advance.

Manu5656 commented 4 years ago

@games647

Manu5656 commented 4 years ago

I am using version 1.11-SNAPSHOT-f40e787 on BungeeCord. I don't know whether you have solved the problem in the latest version.

games647 commented 4 years ago

Preventing bot attacks is not really in the scope of this project. It's best to drop them before they reach the Minecraft server. Another solution is to install an anti-bot plugin that drops connections if there are too many. Its performance impact would be higher than something low-level like iptables or ufw. However, it could already be enough.

The error messages you posted show that your MySQL/MariaDB server is overloaded. Therefore you should check the CPU usage. Is your database server running on the same host? Then it could be affected by the high load of the Minecraft server handling the connections. Did you verify the load? If you identify Java as the source, you can attach a profiler and track down the load there.

Finding a good solution requires time. There is a lot of other stuff to do besides developing a free Minecraft plugin. Sorry.

Nevertheless I added a very basic rate limiter, which will ignore incoming connections if there are too many.

Manu5656 commented 4 years ago

I have already installed a good anti-bot plugin. MySQL is overloaded by the number of requests made by the plugin. However, I have thought of a hypothetical solution. I will try to implement it soon and hope it works. As for the limiter you added, is it available in the latest version? If so, I'll download and install it in place of the current one.

games647 commented 4 years ago

I have already installed a good anti-bot plugin. MySQL is overloaded by the number of requests made by the plugin. However, I have thought of a hypothetical solution.

Well, FastLogin still sees the incoming connection. Anti-bot plugins should prevent connections before other plugins can see them. As pointed out in #304, BungeeCord has a design flaw for anti-bot plugins that perform asynchronous checks (synchronous non-blocking checks are fine). Other plugins won't see the result unless they depend on the anti-bot plugin's API directly, because either both run in parallel (both async) or the check runs in the background. You also cannot perform any requests synchronously, because that would block the BungeeCord event loop.

Nevertheless, I found that Waterfall, a fork of BungeeCord, has an extra event. This event fires before the PreLoginEvent and could therefore be used for asynchronous bot checking without the downside mentioned above. Maybe that could be useful for your implementation.

Furthermore, I don't know whether performing an extra network request for every joining player in order to check if the IP is blacklisted, like many anti-bot plugins do, is really useful. It would require a good connection pooling implementation, because connection initialization can be very expensive, and performing the request has some overhead too. The optimal solution would be to use different behavior depending on the incoming load. An order like this, from low load to higher load:

  1. Check the IP for being blacklisted - Too much overhead for more connections
  2. Ask a local database to allow joining of existing players and block new players -> load on the database, but could be lighter than 1)
  3. Activate online mode
  4. Whitelist against an in-memory player name list
  5. Block all incoming requests
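The escalation order above could be sketched roughly as follows. This is only an illustration: the enum, the method name and all thresholds are hypothetical placeholders, not part of FastLogin or any anti-bot plugin, and real thresholds would have to come from measurements on the actual server.

```java
/**
 * Hypothetical sketch of load-dependent join handling.
 * The constants mirror the five steps listed above; the thresholds
 * are arbitrary placeholders chosen for illustration only.
 */
enum JoinPolicy {
    CHECK_IP_BLACKLIST,     // 1. remote lookup per connection
    EXISTING_PLAYERS_ONLY,  // 2. local database check, block new players
    ONLINE_MODE,            // 3. require online mode
    IN_MEMORY_WHITELIST,    // 4. compare against an in-memory name list
    BLOCK_ALL;              // 5. refuse everything

    /** Picks the cheapest sufficient policy for the observed connection rate. */
    static JoinPolicy forLoad(int connectionsPerSecond) {
        if (connectionsPerSecond < 0) {
            throw new IllegalArgumentException("rate must not be negative");
        }
        if (connectionsPerSecond < 10) {
            return CHECK_IP_BLACKLIST;
        }
        if (connectionsPerSecond < 50) {
            return EXISTING_PLAYERS_ONLY;
        }
        if (connectionsPerSecond < 200) {
            return ONLINE_MODE;
        }
        if (connectionsPerSecond < 1000) {
            return IN_MEMORY_WHITELIST;
        }
        return BLOCK_ALL;
    }
}
```

The point of the ordering is that each step costs less per connection than the one before it, so the cheaper checks are reserved for the highest load.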

If so, I'll download and install it in place of the current one.

Yes

games647 commented 4 years ago

I have a feeling that even many paid plugins have this misunderstanding.

Manu5656 commented 4 years ago

Right now I use FlameCord in place of BungeeCord, which is even better than Waterfall in some respects. Anyway, okay, thank you for your help. Later I will try to install the latest version of FastLogin and make the changes I thought of, which I hope will solve the problem. I'll let you know how it goes.

Manu5656 commented 4 years ago

Even with the version of the plugin where you said you added the option that ignores incoming requests when there are too many, I still have the same problem: https://pastebin.com/GNGeTQLw

games647 commented 4 years ago

You can always adjust the rate limit in the config. https://github.com/games647/FastLogin/blob/103a8320ecd60f9fae34803c0c25a29043b1df97/core/src/main/resources/config.yml#L15

Furthermore please check the usage (CPU, RAM, Query Performance, ...) of the database server to verify that it's really crashing, refusing the connection or just being overloaded.

Manu5656 commented 4 years ago

What exactly does that number indicate? How many connections can there be before I start ignoring them? What number do you advise me to set?

And does "expire" indicate how long (in minutes or seconds) it takes before the premium check is reactivated?

The main problem should be database overload, which occurs due to the many requests at the same time

games647 commented 4 years ago

What exactly does that number indicate? How many connections can there be before I start ignoring them? What number do you advise me to set?

You are right, I'm going to add a better description. For now you can imagine it like a bucket. "Connections" is the maximum number of entries the bucket can hold, while "expire" is how long it takes, in minutes, for each entry to be removed.
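A minimal sketch of the bucket described here, assuming a simple sliding-window design. The class and method names are hypothetical and this is not FastLogin's actual implementation; the time is passed in as a parameter (in milliseconds rather than minutes) only to keep the example easy to follow.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical bucket-style rate limiter (illustration only).
 * maxConnections is the bucket capacity; expireMillis is how long
 * one entry stays in the bucket before it is removed.
 */
final class BucketRateLimiter {
    private final int maxConnections;
    private final long expireMillis;
    // Timestamps of accepted connections, oldest first.
    private final Deque<Long> entries = new ArrayDeque<>();

    BucketRateLimiter(int maxConnections, long expireMillis) {
        this.maxConnections = maxConnections;
        this.expireMillis = expireMillis;
    }

    /** Returns true if the connection may proceed, false if it should be ignored. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Remove entries that have been in the bucket for at least expireMillis.
        while (!entries.isEmpty() && nowMillis - entries.peekFirst() >= expireMillis) {
            entries.pollFirst();
        }
        if (entries.size() >= maxConnections) {
            return false; // bucket is full
        }
        entries.addLast(nowMillis);
        return true;
    }
}
```

With a capacity of 2 and an expiry of 1000 ms, two connections are accepted, further ones are ignored, and a slot becomes free again once the oldest entry is 1000 ms old.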

The main problem should be database overload, which occurs due to the many requests at the same time

FastLogin has to ask the database whether any data is available and then, on a successful force login, save the data. How do you think the saved data is retrieved otherwise? Even caching doesn't work here, because those names will always be new players.

For the database FastLogin is already leveraging query caching a lot if possible in order to reduce the load on parsing the query. Retrieving the data also always uses LIMIT where possible. If you disable nameChangeCheck only a single query is used to retrieve data. Furthermore a unique index is used on the username column to improve fetching time further.

FastLogin likely isn't the only plugin hitting your database. Nevertheless, I could also add an option to reduce the pool size of the database connector. That would reduce the amount of work that hits the server concurrently. The current value is the HikariCP default of 10. For this approach, knowing your core count as well as the CPU usage of the database server would be useful.
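To illustrate why a smaller pool limits the concurrent work reaching the database, here is a rough stand-alone model that uses only the Java standard library. It is not HikariCP and not FastLogin code: the class name is hypothetical, and unlike a real pool, which makes callers wait for a free connection, this sketch rejects work immediately when all slots are taken. In HikariCP itself the corresponding setting is `maximumPoolSize`.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/**
 * Hypothetical model of a fixed-size pool (illustration only).
 * At most poolSize queries can run at the same time.
 */
final class BoundedQueryGate {
    private final Semaphore slots;

    BoundedQueryGate(int poolSize) {
        this.slots = new Semaphore(poolSize);
    }

    /** Runs the query if a slot is free; returns empty if all slots are busy. */
    <T> Optional<T> tryRun(Supplier<T> query) {
        if (!slots.tryAcquire()) {
            return Optional.empty(); // pool exhausted, work never reaches the database
        }
        try {
            return Optional.ofNullable(query.get());
        } finally {
            slots.release(); // always hand the slot back
        }
    }

    /** Number of slots that are currently unused. */
    int freeSlots() {
        return slots.availablePermits();
    }
}
```

A lower limit means fewer queries compete for CPU on the database host at once, at the cost of more callers waiting or being turned away under load.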

Manu5656 commented 4 years ago

I just noticed that the "AntiBot" entry is not present in my FastLogin config. When I updated the version I didn't know I had to clear the config to get it generated again, now I will. Probably, not being present in the config, that option was not active yet, I think.

Furthermore, I installed FastLogin on both the Bungeecord server and the lobby server both connected to the same database of course. Do I have to make changes in the lobby one too? Even if the main problem is on bungeecord.

As for the database, I have other databases installed on the same machine, but they never give me problems. The machine in question is also quite powerful, so I shouldn't have any performance issues. The only time I have problems is when this thing I told you about happens.

As I said, when this happens all the other databases and the entire server machine also crash (and I can't connect to it for a few minutes, until it recovers).

games647 commented 4 years ago

BTW: Could you post your configuration?

I just noticed that the "AntiBot" entry is not present in my FastLogin config. When I updated the version I didn't know I had to clear the config to get it generated again, now I will. Probably, not being present in the config, that option was not active yet, I think.

No, in that case it picks up the default.

Furthermore, I installed FastLogin on both the Bungeecord server and the lobby server both connected to the same database of course.

The database connection is deactivated on Spigot if you use BungeeCord.

As for the database, I have other databases installed on the same machine, but they never give me problems. The machine in question is also quite powerful, so I shouldn't have any performance issues. The only time I have problems is when this thing I told you about happens.

The error means that the database didn't respond for 30 seconds. This could be caused by an overloaded system.

As I said, when this happens all the other databases and the entire server machine also crash (and I can't connect to it for a few minutes, until it recovers).

If your database is on the same system, then high usage by other processes also affects the database. If your CPU (as an example; you could also saturate memory, I/O bandwidth or other resources) is already saturated with other workloads, the database cannot process its queries. So in this case, don't assume that there are too many requests going to the database. This high saturation could, for example, be caused by too many spawned threads. Thread creation and context switches can be expensive in some situations.

So my recommendation is to definitely do some benchmark tests and observe system usage (CPU, memory, etc.). Then check the source of the high usage. Is it your database, a specific Java process or something else? For Java you could attach a profiler (like Java Mission Control, VisualVM, etc.) to analyze it more deeply. Is it a specific thread or the sum of specific threads? For example, another plugin might be doing HTTP calls for every single request, and that causes your system to go down under load.

Manu5656 commented 4 years ago

So should I disconnect the FastLogin on the lobby server from the database? Currently both the one in the lobby and the one on BungeeCord are connected to the same database.

In the end I am attaching the current configuration of both bungeecord and spigot (server lobby) of the plugin.

As far as performance is concerned, this problem exists only when there are bot attacks and many requests are made via FastLogin. At other times there are never these problems.

Even in the past, when I received bot attacks and I didn't have FastLogin I never had this type of problem so I doubt there could be other causes.

Below I attach the configuration of FastLogin on BungeeCord and Spigot (this is the one not yet updated with the new option; as I said before, I need to have the file generated again to get it).

Bungeecord: https://hastebin.com/evadakuhic.makefile

Spigot (Lobby): https://hastebin.com/ojuyabataf.makefile

Let me know if I have to change something and if I have to remove the database connection from the plugin in the spigot server (lobby)

games647 commented 4 years ago

So should I disconnect the FastLogin on the lobby server from the database? Currently both the one in the lobby and the one on BungeeCord are connected to the same database.

There is no need to. It's already off. FastLogin will do all the work on BungeeCord and forward the data down using plugin messages.

As far as performance is concerned, this problem exists only when there are bot attacks and many requests are made via FastLogin. At other times there are never these problems.

That's why you should benchmark it... Do these tests to reproduce it and observe the situation under a controlled environment.

In the end I am attaching the current configuration of both bungeecord and spigot (server lobby) of the plugin.

You could disable nameChangeCheck. That option makes a Mojang request even if you disabled autoRegister.

Manu5656 commented 4 years ago

If nameChangeCheck is disabled, does anything change for users? And with it disabled there should be a lower load, right?

I suppose this option nameChangeCheck is used to check if a premium user has changed nickname, if so, so if it is disabled and a player changes nickname, is it not updated on the server?

games647 commented 4 years ago

If nameChangeCheck is disabled, does anything change for users? And with it disabled there should be a lower load, right?

With it enabled, FastLogin needs to make an HTTP call for every user. That could cause a higher load.

I suppose this option nameChangeCheck is used to check if a premium user has changed nickname, if so, so if it is disabled and a player changes nickname, is it not updated on the server?

It asks Mojang for the UUID and then checks if the UUID is already in the database. If so, it assumes that the player is the same person and requests an online-mode connection.
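The check described here can be sketched as follows, under heavy assumptions: the class, its constructor and the method name are hypothetical, the Mojang HTTP lookup is replaced by a plain function, and the database is replaced by an in-memory set. It only illustrates the decision logic, not FastLogin's real code.

```java
import java.util.Optional;
import java.util.Set;
import java.util.UUID;
import java.util.function.Function;

/**
 * Hypothetical sketch of the name change check (illustration only).
 * mojangLookup stands in for the HTTP request to Mojang and
 * knownPremiumUuids stands in for the UUIDs stored in the database.
 */
final class NameChangeCheck {
    private final Function<String, Optional<UUID>> mojangLookup;
    private final Set<UUID> knownPremiumUuids;

    NameChangeCheck(Function<String, Optional<UUID>> mojangLookup,
                    Set<UUID> knownPremiumUuids) {
        this.mojangLookup = mojangLookup;
        this.knownPremiumUuids = knownPremiumUuids;
    }

    /**
     * Returns true if the name resolves to a UUID that is already stored,
     * meaning the player is assumed to be the same person under a new name.
     */
    boolean shouldRequestOnlineMode(String username) {
        return mojangLookup.apply(username)      // one lookup per joining user
                .map(knownPremiumUuids::contains) // is the UUID already known?
                .orElse(false);                   // no such account: no online mode
    }
}
```

Because the lookup runs for every joining name that is not already stored, a bot attack with random names triggers one request per connection, which is why disabling the option lowers the load.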

Manu5656 commented 4 years ago

So if this option is disabled and a registered premium user changes his nickname, will he still be able to enter the server while always keeping his uuid?

games647 commented 4 years ago

The person needs to run the premium command again and then log in again, because this decision has to be based on something.

Manu5656 commented 4 years ago

Running the premium command will restore his premium UUID, right?

games647 commented 4 years ago

yes

Manu5656 commented 4 years ago

Okay, I'll try to disable this option hoping it's better