PaperMC / Paper

The most widely used, high performance Minecraft server that aims to fix gameplay and mechanics inconsistencies
https://papermc.io/
Other
9.89k stars 2.29k forks source link

ServerLoginPacketListenerImpl uses Executors.newCachedThreadPool with unbounded queue #8797

Open egg82 opened 1 year ago

egg82 commented 1 year ago

Timings or Profile link

N/A

Description of issue

Discussed internally- no timings link or plugin list. This is an issue created so I can reference it in a near-future PR.

Commit 0ed4b91 reverts ThreadPoolExecutor to Executors#newCachedThreadPool which was changed to solve an issue of hanging logins. The problem with both, as discussed, is that both have the unfortunate side-effect of using one thread in the pool when combined with an unbounded queue. In the case of ServerLoginPacketListenerImpl this means that ALL logins are sent to the same thread regardless of how many threads the pool has available.

Discussions lead to a few different solutions, but after looking at the benefits and drawbacks of each solution the easiest fix is to use a fixed thread pool (a new ThreadExecutor where the core pool size is equal to the maximum pool size).

Other resolutions discussed were using a bounded queue and a "scaling thread pool" but ultimately the overhead for a fixed thread pool is minimal, is the easiest to maintain, and is likely the best solution for an authentication pool which is expected to live for the lifetime of the server anyway.

Further, discussions of a bounded vs unbounded queue led us to consider potential attack vectors. A bounded queue for the pool would see a successful attack by way of flooding the authentication service with fake logins and an unbounded pool would see a successful attack by way of flooding the authentication service with fake logins. The former would lead to RejectedExecutionExeptions and the latter would fill the heap until the server ran out of memory.

After deliberation, it was decided that a bounded queue would have less impact on the server despite a successful L7 attack on the authentication service. There are other solutions to L7 attacks on Minecraft servers and Paper's due diligence on attacks is best-effort while still keeping the original goals of performance, stability, and bugfixes in mind. Rejected logins are a better failure case than a crashed server even though the latter is more difficult.

A semi-arbitrary number of 1024 for the queue size seems reasonable.

Plugin and Datapack List

N/A

Server config files

N/A

Paper version

1.19.3 380, 0ed4b91

Other

No response

sunmisc commented 1 day ago

Timings or Profile link

N/A

Description of issue

Discussed internally- no timings link or plugin list. This is an issue created so I can reference it in a near-future PR.

Commit 0ed4b91 reverts ThreadPoolExecutor to Executors#newCachedThreadPool which was changed to solve an issue of hanging logins. The problem with both, as discussed, is that both have the unfortunate side-effect of using one thread in the pool when combined with an unbounded queue. In the case of ServerLoginPacketListenerImpl this means that ALL logins are sent to the same thread regardless of how many threads the pool has available.

Discussions lead to a few different solutions, but after looking at the benefits and drawbacks of each solution the easiest fix is to use a fixed thread pool (a new ThreadExecutor where the core pool size is equal to the maximum pool size).

Other resolutions discussed were using a bounded queue and a "scaling thread pool" but ultimately the overhead for a fixed thread pool is minimal, is the easiest to maintain, and is likely the best solution for an authentication pool which is expected to live for the lifetime of the server anyway.

Further, discussions of a bounded vs unbounded queue led us to consider potential attack vectors. A bounded queue for the pool would see a successful attack by way of flooding the authentication service with fake logins and an unbounded pool would see a successful attack by way of flooding the authentication service with fake logins. The former would lead to RejectedExecutionExeptions and the latter would fill the heap until the server ran out of memory.

After deliberation, it was decided that a bounded queue would have less impact on the server despite a successful L7 attack on the authentication service. There are other solutions to L7 attacks on Minecraft servers and Paper's due diligence on attacks is best-effort while still keeping the original goals of performance, stability, and bugfixes in mind. Rejected logins are a better failure case than a crashed server even though the latter is more difficult.

A semi-arbitrary number of 1024 for the queue size seems reasonable.

Plugin and Datapack List

N/A

Server config files

N/A

Paper version

1.19.3 380, 0ed4b91

Other

No response

I still don't understand why people don't switch to virtual threads en masse, while the minimum version of jdk is 21. This could solve a lot of problems with blocking I/O. The changes are very simple. Replace one Executor with another. That's it. Of course, virtual threads have their drawbacks, for example, a shallow stack. But if you spin in a spin loop, you take the thread from the pool anyway (no matter which one). Therefore, for blocking I/O, they are ideal. Configuring pool sizes for I/O, etc. is a collective farm IMHO.

I would not really like to see in the profiler many OS threads that are just parked most of the time. At the same time, they hang in the OS scheduler, and eat up memory. This is especially noticeable on small and weak servers.