PaperMC / Folia

Fork of Paper which adds regionised multithreading to the dedicated server.
GNU General Public License v3.0

Memory leak on Folia 1.21-2b8c879 #261

Closed: MineSunshineone closed this issue 1 month ago

MineSunshineone commented 2 months ago

Stack trace

paste your stack trace or a paste.gg link here!

Another Fork: https://spark.lucko.me/SnZ2v6VqBq (Run 13h)
Folia: https://spark.lucko.me/0IPMgOFZdp (Run 10m)
Folia: https://spark.lucko.me/tPQbINjmur (Run 20m)
Folia: https://spark.lucko.me/v8nIIqWVGB (Run 30m)

Plugin and Datapack List

plugins
[11:42:20 INFO]: Server Plugins (36):
[11:42:20 INFO]: Paper Plugins:
[11:42:20 INFO]: - MiraiMC
[11:42:20 INFO]: Bukkit Plugins:
[11:42:20 INFO]: - ajLeaderboards, AntiPopup, ArmorStandEditor, AxInventoryRestore, BetterGUI, ChestProtect, Chunky, CommandWhitelist, CoreProtect, CreeperConfetti
[11:42:20 INFO]: DeathMessage, Essentials, GSit, ICTTools, InvSeePlusPlus, Kaiivoid, KissKiss, LuckPerms, LushRewards, Matrix
[11:42:20 INFO]: MinePay, NoFlightInEnd, PlaceholderAPI, ProtocolLib, SkullPlugin, spark, TAB-Bridge, ToolStats, TrChat, UseTranslatedNames
[11:42:20 INFO]: voicechat, Whitelist4QQ, WorldEdit, ZAutoBroadcast, ZeroshiftFcmd

Actions to reproduce (if known)

Run Folia 1.21-2b8c879

Folia version

version
[11:44:06 INFO]: This server is running Folia version 1.21-DEV-dev/1.21@2b8c879 (2024-08-05T03:04:53Z) (Implementing API version 1.21-R0.1-SNAPSHOT)
You are running the latest version
Previous version: 1.21-DEV-2541ddf (MC: 1.21)

Other

Memory leaks occur approximately every 10 minutes on Folia 1.21-2b8c879. At first I used a fork of Folia and the memory leaked; I then tried switching to plain Folia, and the same problem still occurred.

RitaSister commented 2 months ago

I can confirm this too; it is very strange that old objects are being kept around. My spark: https://spark.lucko.me/yxaIfmfKol

RitaSister commented 2 months ago

heap summary too: https://spark.lucko.me/V7xqMTzAZQ

MineSunshineone commented 2 months ago

(image attached) Maybe it's caused by spark.

VaultSmallBoy commented 2 months ago

I played in this server, and I would like to provide more information on the player side.

As we observed, the server needs to run for more than six (6) hours before a sudden TPS drop appears (as low as <1.5). For players on the server, the TPS drops to 1-2 for a few seconds, then goes back to normal (~19.5 TPS) for around 15 seconds, and then the malfunction repeats. After about 15 minutes, the frequency of this malfunction increases and the normal-TPS window shortens to about 5 seconds. During this time, the Server Health Report (run with the command /tps) showed ambiguous data: every time we ran it during a low-TPS period, it reported different high-utilization regions. However, in the past ~5 hours everything has worked normally, without even one suspicious TPS drop.

ranminecraft commented 2 months ago

same issue

electronicboy commented 2 months ago

You haven't run save-off, have you? Otherwise, something is keeping millions of entity instances around; we'd really need to see a heap dump to see where it's coming from, not that I can offer much.
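(For reference, a minimal sketch of how a heap dump could be captured from the running server with the stock JDK tools; <pid> and the output path below are placeholders, not values from this thread.)

    # list running JVMs and find the server's pid
    jcmd -l

    # dump only live (reachable) objects to an .hprof file
    jcmd <pid> GC.heap_dump /path/to/folia-heap.hprof

    # equivalent using jmap
    jmap -dump:live,format=b,file=/path/to/folia-heap.hprof <pid>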

RitaSister commented 2 months ago

You haven't run save-off, have you? Otherwise, something is keeping millions of entity instances around; we'd really need to see a heap dump to see where it's coming from, not that I can offer much.

I think nobody ran /save-off. On my server it's always only /save-on.

MrHua269 commented 2 months ago

Stress tested using https://github.com/PureGero/minecraft-stress-test/ on the latest version, but cannot reproduce.

Stress tester start command line:
java -Dbot.count=200 -Dbot.ip=127.0.0.1 -Dbot.port=18750 -Dbot.login.delay.ms=5000 -Dbot.radius=10000 -jar minecraft-stress-test-1.0.0-SNAPSHOT-jar-with-dependencies.jar

Plugins: cec95e67adfd644c7da0b284f488bcf8

Server start command line:
./zulu22.30.13-ca-jdk22.0.1-linux_x64/bin/java -Xmx100G -Xms100G -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+HeapDumpOnOutOfMemoryError -XX:+UseLargePages -XX:LargePageSizeInBytes=2M -XX:+UseShenandoahGC -XX:-ShenandoahPacing -XX:+ParallelRefProcEnabled -XX:ShenandoahGCHeuristics=adaptive -XX:ShenandoahInitFreeThreshold=55 -XX:ShenandoahGarbageThreshold=30 -XX:ShenandoahMinFreeThreshold=20 -XX:ShenandoahAllocSpikeFactor=10 -Dchunky.maxWorkingCount=1000 -javaagent:authlib-injector-1.2.5.jar=https://littleskin.cn/api/yggdrasil -server -agentpath:/home/mrhua269/jprofiler14/bin/linux-x64/libjprofilerti.so=port=8849,nowait --add-modules=jdk.incubator.vector -jar luminol-paperclip-1.21.1-R0.1-SNAPSHOT-mojmap.jar nogui

(Spark was fully disabled using https://github.com/LuminolMC/Luminol/blob/dev/1.21.1/patches/server/0023-Force-disable-builtin-spark-plugin.patch)

tech-6 commented 2 months ago

You haven't run save-off, have you? Otherwise, something is keeping millions of entity instances around; we'd really need to see a heap dump to see where it's coming from, not that I can offer much.

I think nobody ran /save-off. On my server it's always only /save-on.

Someone will always use a command if it's available. For example, running save-off and save-on around backups makes sure that the server does not make any changes to what's going into the backup while it is being taken.
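(As a rough sketch of that workflow, assuming RCON is enabled in server.properties and an RCON client such as mcrcon is available; host, port, password and paths below are placeholders, not values from this thread.)

    #!/bin/sh
    # pause autosaving and flush pending world data to disk before copying
    mcrcon -H 127.0.0.1 -P 25575 -p "$RCON_PASSWORD" "save-off" "save-all"

    # copy the world while the server is guaranteed not to write to it
    tar -czf /backups/world-$(date +%F).tar.gz /srv/minecraft/world

    # turn autosaving back on once the copy has finished
    mcrcon -H 127.0.0.1 -P 25575 -p "$RCON_PASSWORD" "save-on"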

Euphillya commented 2 months ago

Stress tested using https://github.com/PureGero/minecraft-stress-test/ on the latest version, but cannot reproduce. [...]

You’re using a fork of Folia; it would be necessary to retest on a Folia server.

MineSunshineone commented 2 months ago

I analyzed the heap dump with Eclipse Memory Analyzer and this is the report it generated: heap-2024-08-06_08.06.19_Leak_Suspects.zip (image attached)
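(For anyone who wants to produce this kind of report without the GUI, Eclipse Memory Analyzer can also generate the Leak Suspects report in batch mode; a sketch, assuming MAT is unpacked locally and with the dump path as a placeholder.)

    # parse the dump and write a Leak_Suspects .zip report next to it (headless)
    ./ParseHeapDump.sh /path/to/heap.hprof org.eclipse.mat.api:suspects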

xymb-endcrystalme commented 2 months ago

One instance of me.lucko.spark.common.SparkPlatform loaded by org.bukkit.plugin.java.PluginClassLoader @ 0x22b4fbfe848 occupies 23,189,898,336 (32.88%) bytes. The memory is accumulated in one instance of java.lang.Object[], loaded by <system class loader>, which occupies 23,189,861,312 (32.88%) bytes.

22GB by Spark, huh? Go to plugins/spark/config.json and change backgroundProfiler to false.
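(For anyone following along, only that one key needs to change; the relevant part of plugins/spark/config.json would look roughly like this. Leave the file's other keys as they are and restart the server, or otherwise reload spark, so the change takes effect.)

    {
      "backgroundProfiler": false
    }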

MrHua269 commented 2 months ago

One instance of me.lucko.spark.common.SparkPlatform loaded by org.bukkit.plugin.java.PluginClassLoader @ 0x22b4fbfe848 occupies 23,189,898,336 (32.88%) bytes. The memory is accumulated in one instance of java.lang.Object[], loaded by <system class loader>, which occupies 23,189,861,312 (32.88%) bytes.

22GB by Spark, huh? Go to plugins/spark/config.json and change backgroundProfiler to false.

I tried, but it seems it is still leaking.

xymb-endcrystalme commented 2 months ago

What if you remove spark completely?

MineSunshineone commented 2 months ago

https://github.com/LuminolMC/Luminol/blob/dev/1.21.1/patches/server/0023-Force-disable-builtin-spark-plugin.patch

xymb-endcrystalme commented 2 months ago

https://github.com/PaperMC/Folia/blob/dev/1.21.1/patches/server/0018-Disable-spark-profiler.patch

MrHua269 commented 2 months ago

What if you remove spark completely?

Yeah, and I even removed all of spark's hooks.

electronicboy commented 2 months ago

@MineSunshineone Leak suspect reports are generally useless, but the spark plugin you've got installed is eating something like 22G of RAM.

MineSunshineone commented 2 months ago

@MineSunshineone Leak suspect reports are generally useless, but the spark plugin you've got installed is eating something like 22G of RAM.

I know this. When I saw this report, I removed Spark and started the second test. There was still a memory leak. I guess it was because I didn’t completely remove the built-in Spark.

Potenza7 commented 2 months ago

Same issue. If the server is up for a long time, TPS drops suddenly, usually to around 5. Since I have a lot of RAM, this usually happens after 15-16 hours of uptime. It recovers after 10 seconds and then drops suddenly again. The problem does not go away until you restart the server. When you look at /tps, it shows meaningless values.

PedroMPagani commented 1 month ago

If anyone is interested, here's the cause: https://github.com/PaperMC/Folia/issues/283