GeyserMC / Geyser

A bridge/proxy allowing you to connect to Minecraft: Java Edition servers with Minecraft: Bedrock Edition.
https://geysermc.org
MIT License

Extreme CPU Overload due incomplete I/O operations #4602

Closed ByteExceptionM closed 6 months ago

ByteExceptionM commented 6 months ago

Describe the bug

Since I updated from build 478 to the latest build, 512, for 1.20.80 support, I have had extreme CPU peaks. Diff: https://github.com/GeyserMC/Geyser/compare/b469904951bfa38ad5e950d58bb3e0dcbe2b9d73...2471de100be3f229bfa415ec887e48a29002d2b2

To Reproduce

Update to latest build

Expected behaviour

No CPU overload

Screenshots / Videos

image image image image

Server Version and Plugins

No response

Geyser Dump

No response

Geyser Version

2.2.3-SNAPSHOT 2471de1

Minecraft: Bedrock Edition Device/Version

No response

Additional Context

No response

onebeastchris commented 6 months ago

Please send a spark report - that should show what Geyser is doing. Here's how:

Spark is a plugin that helps you monitor your server's performance. https://spark.lucko.me/download

To record performance on your server use: /spark profiler --thread * --timeout 60. This will run for 60 seconds then it will automatically stop. It'll probably lag the server a good deal but it'll give us a link we might be able to process.

ByteExceptionM commented 6 months ago

Geyser runs standalone, so I cannot start a spark profiler here.

onebeastchris commented 6 months ago

If you're able to compile it yourself, we do have a spark Geyser extension that you could use: https://github.com/GeyserMC/spark. If you're not able to compile it, I could send a build of it in a few hours.

ByteExceptionM commented 6 months ago

All right. I'll take care of it

ByteExceptionM commented 6 months ago

I now have the Spark extension on the Geyser application. However, I cannot execute the command because it is not found. The following message appears in the logs: image But also this one: image

I have also installed Spark on all sub-servers. The command for Geyser Spark is also spark, so I can't really run it now. I can't run anything in the console either (Docker container without the correct attach & interact flags). I can't restart the Geyser instances either, as there are currently a lot of players online. Are you able to debug it yourself?

ByteExceptionM commented 6 months ago

iotop: image

Geyser is running in a docker container. Path is /data/server.jar

As you can see here in the screenshot, Geyser blocks many processes with IO operations that are not closed. This leads to high CPU utilization.
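One way to check from inside the JVM which threads are actually consuming CPU, as opposed to sitting in a waiting state, is the standard `ThreadMXBean` API. The following is only a minimal sketch: the class name `ThreadCpuSnapshot` is hypothetical, it is not part of Geyser or spark, and it would have to run inside the same JVM as Geyser (for example from an extension, which is an assumption) to say anything about Geyser's threads. It needs Java 16 or newer because it uses a record.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Geyser or spark.
public class ThreadCpuSnapshot {

    /** One row of the report: thread name, state, and total CPU time in milliseconds. */
    record Row(String name, Thread.State state, long cpuMillis) {}

    /** Returns the live threads of the current JVM, sorted by CPU time, busiest first. */
    static List<Row> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return rows; // this JVM cannot measure per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            long cpuNanos = mx.getThreadCpuTime(id);
            // Skip threads that died in the meantime or have no measurement.
            if (info == null || cpuNanos < 0) {
                continue;
            }
            rows.add(new Row(info.getThreadName(), info.getThreadState(), cpuNanos / 1_000_000));
        }
        rows.sort(Comparator.comparingLong(Row::cpuMillis).reversed());
        return rows;
    }

    public static void main(String[] args) {
        for (Row r : snapshot()) {
            System.out.printf("%-40s %-15s %d ms%n", r.name(), r.state(), r.cpuMillis());
        }
    }
}
```

Taking two snapshots a few seconds apart and comparing the CPU times would show which threads are busy during a peak.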

onebeastchris commented 6 months ago

I'm unable to try to replicate the issue at the moment, but I will try to fix the spark extension so that it can be used to get proper data to resolve this issue.

onebeastchris commented 6 months ago

https://github.com/GeyserMC/spark/pull/1 This should resolve the issue with spark not working. Here's a working build: spark-1.10.0-geyser.zip

Does this issue occur at some specific playercount? In any case, without some concrete data on what's causing the high usage uptick it'll be difficult to guess what the issue is caused by.

ByteExceptionM commented 6 months ago

Sounds great. Already checked out your branch and deploying it to start a profiler.

In any case, the CPU load increases with the number of players, but it doesn't go higher than 90%. The number of processes waiting for I/O doesn't change much, either. Screenshots enclosed:

image image

Kas-tle commented 6 months ago

At this point it would help quite a bit if you could isolate the issue to a certain commit, as the range you've provided is quite a lot to go through, especially given that we cannot reproduce the issue due to your complex setup.
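Narrowing the range down can be done by bisection between the known-good build 478 and the known-bad build 512. The sketch below only illustrates the search order; the class `BuildBisect` and the `isBad` check are hypothetical, and it assumes that every build after the first bad one also shows the problem. In practice the check would mean deploying that build and watching the CPU load.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration of bisecting over build numbers.
public class BuildBisect {

    /**
     * Returns the first bad build in the range (good, bad],
     * given that 'good' is known to be fine and 'bad' is known to be broken.
     */
    static int firstBadBuild(int good, int bad, IntPredicate isBad) {
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (isBad.test(mid)) {
                bad = mid;  // regression is at mid or earlier
            } else {
                good = mid; // regression is after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Made-up example: pretend the regression landed in build 497.
        int culprit = firstBadBuild(478, 512, build -> build >= 497);
        System.out.println("First bad build: " + culprit); // prints "First bad build: 497"
    }
}
```

For the 34 builds between 478 and 512, this needs at most six test deployments.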

ByteExceptionM commented 6 months ago

I am already debugging with Chris in DMs. It's not really possible for me to search through your code or check what the problem is. The Geyser Spark extension had some problems, which Chris has now fixed. I currently have the Geyser traffic routed to another machine so I can make changes and debug there. We are a network with almost 5,000 different players, so unfortunately restarts are not that easy. As soon as I get more information, I'll come back. I'm on it!

ByteExceptionM commented 6 months ago

The error has not occurred again to date, and I could not trace its source. If it occurs again, I'll be sure to profile it properly with the fixed spark extension and reopen the issue. For now, I'm closing it.

nicolube commented 1 month ago

We're having the same behavior...

image

We're currently running on an Advance-3 Gen 2 from OVH. HW Specs:

CPU

RAM

Storage:

We're running the latest Geyser version; it gets updated every day at 4 am.

We're running Geyser standalone, and it has a resource pack. Everything else is basically vanilla.

We also just added geyser-spark to it.

onebeastchris commented 1 month ago

@nicolube please open a new issue instead of commenting here. Further, please attach further information, such as a spark profiler run and similar - server specs alone and rather vague screenshots are unfortunately not particularly helpful to debug the issue. Thanks!

nicolube commented 1 month ago

@onebeastchris Hello, I opened a new issue and attached a spark profile; I just did not have one with load this morning. https://github.com/GeyserMC/Geyser/issues/5050