SpigotMC / BungeeCord

BungeeCord, the 6th in a generation of server portal suites. Efficiently proxies and maintains connections and transport between multiple Minecraft servers.
https://www.spigotmc.org/go/bungeecord

Bungeecord Stops Working ⚠ Randomly (no log, timeout) #3358

Open thefourcraft opened 2 years ago

thefourcraft commented 2 years ago

Bungeecord version

running BungeeCord version git:BungeeCord-Bootstrap:1.19-R0.1-SNAPSHOT:587fb37:1653 by md_5

Server version

git-Purpur-1632 (MC: 1.18.2)

Client version

1.18.2

Bungeecord plugins

  • BungeePackFix
  • KarmaAPI Platform
  • [RD] LockLogin
  • LuckPerms
  • ➥ Pixel MOTD | 1.8 - 1.19
  • SkinsRestorer
  • Triton - Translate Your Server
  • Plan | Player Analytics

The bug

Introduction

BungeeCord randomly stops accepting incoming connections after some time. The interval varies from about 2 hours to more than 2 days. When it happens I have to kill Bungee and start it again; it then works until the next time it stops accepting connections. While this is happening my server appears offline to anyone trying to join, but anyone who was already on the server stays connected.

More Info

It doesn't kick players: everyone already on the server can use BungeeCord plugin commands and change servers, but new players can't join. They see the server as offline, or sometimes as online but then get timed out. I am not the only one who has reported this and it is not a new issue; it happens a lot, but no one seems to have an accurate solution. A plugin might be causing it. It is not a network issue, because localhost is affected too: while the problem is happening, even a client on localhost can't join, although players are still online and playing on the server. It is also random.
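One way to catch the moment the proxy stops accepting connections is a small external probe. The sketch below is not from this thread: the class name `PortProbe` is made up for illustration, and the default of localhost:25577 is an assumption that should be changed to match your own listener. A successful TCP connect only shows that the operating system accepted the connection; it does not prove that BungeeCord is still processing logins, so a failure is a stronger signal than a success.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    /** Returns true if a TCP connection to host:port completes within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers both "connection refused" and a connect timeout.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are assumptions; pass your own listener as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 25577;
        boolean ok = canConnect(host, port, 5000);
        System.out.println(host + ":" + port
                + (ok ? " accepted a TCP connection" : " refused the connection or timed out"));
    }
}
```

Running it on the proxy machine itself and from an outside machine at the same time would help separate a frozen proxy from a network problem, since the report says localhost is affected as well.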

Different Types of the bug

  1. Server on, new players can't join
  2. Server on, players get timed out randomly
  3. Server on, but appears offline
  4. Server on, but frozen (no log, or ping logs)
  5. Server on, but bungee doesn't accept commands (/end and /bungee)

    Things I Tried To Do

    • Try on a new server
    • Try with new files
    • Starting with one plugin at a time (which didn't help). All the plugins on the list have a lot of downloads and work without problems on other server networks, though there might be something that I didn't catch or see.
    • Starting with no plugins (works for 45 hours, then the same problem)

Similar Problems/Issues

  • Stops accepting incoming connections #171
  • [Ping Handler] -> - read timed out #1959
  • Timed Out kicks #2271
  • Bungeecord becomes unresponsive to any connections or commands #2644
  • Bungee randomly kicks players & stops responding, but it is alive #2984
  • BungeeCord Stops Listening for new connections #3276 - my issue (network problem)
  • Bungeecord read timed out - SpigotMC

Log output (links)

log

Bungeecord https://mclo.gs/w0QsHtx - full log

Configs

Bungeecord | Purpur Server

Checking

thefourcraft commented 2 years ago

Update 11/7/2022

Sometimes the console gives this error:

at net.md_5.bungee.connection.InitialHandler.handle(InitialHandler.java:451)
at net.md_5.bungee.protocol.packet.LoginRequest.handle(LoginRequest.java:46)
at net.md_5.bungee.netty.HandlerBoss.channelRead(HandlerBoss.java:114)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.ByteToMessageDecoder.handlerRemoved(ByteToMessageDecoder.java:255)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:517)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:833)

I think that one or more plugins are trying to use the chat or something?

Janmm14 commented 2 years ago

@thefourcraft The first line of that error is missing.

thefourcraft commented 2 years ago

this is the full log https://mclo.gs/BPka5Ce

Janmm14 commented 2 years ago

LockLogin error: downloadUrl is null. Fix your LockLogin config, or send that log to the LockLogin plugin author and hope he fixes it.

After Bungee stops accepting connections, a thread dump usually shows useful information. You can obtain one with the jstack command-line tool or with Java VisualVM.
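For readers who have not seen a thread dump before, the sketch below prints a simplified one for the JVM it runs in, using the standard `java.lang.management` API. It is only an illustration of what such a dump contains (thread names, states and stack frames): it does not attach to a running BungeeCord process the way jstack or VisualVM do, and the class name `ThreadSnapshot` is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadSnapshot {

    /** Builds a simplified thread dump of the current JVM as text. */
    public static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock details, which keeps the call cheap and widely supported.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In a dump of a frozen proxy, the interesting part is usually which threads are BLOCKED or WAITING and what frame they are stuck in.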

thefourcraft commented 2 years ago

I will tell him about this, but even without this plugin Bungee still doesn't accept players; it's basically the same error 🔼 https://github.com/SpigotMC/BungeeCord/issues/3358#issue-1299967265

thefourcraft commented 2 years ago

This is the thread dump taken after Bungee stopped accepting connections. IDK if I did it the correct way, but here it is: https://mclo.gs/GWrLeDj @Janmm14

Bungee is completely frozen... like it's off

Janmm14 commented 2 years ago

Nothing running in the Java process you took the thread dump from belongs to BungeeCord. I think you chose the wrong process. You need either a vserver (to run jstack in the vserver console/PuTTY/OpenSSH) or access to Bungee's startup parameters to attach VisualVM remotely; see here for a tutorial: https://www.baeldung.com/visualvm-jmx-remote

There is no "Bungeecord Logger Thread", no "Metrics Thread" and no "main" thread. There is no netty thread for any running connections and no TCP accept thread. Bungee is not mentioned in that thread dump at all.
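The check described above can be automated before posting a dump. This is a minimal sketch, not an official tool: the class name `ThreadDumpCheck` is made up, the thread names are copied from the comment above, and the `net.md_5.bungee` package prefix is taken from the stack trace earlier in this thread. Exact thread names may differ between builds, so the match is case-insensitive and a miss should be read as a hint, not proof.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ThreadDumpCheck {

    // Markers a dump of a real BungeeCord process is expected to contain (assumed list).
    static final List<String> EXPECTED = List.of(
            "Bungeecord Logger Thread",
            "Metrics Thread",
            "\"main\"",
            "net.md_5.bungee");

    /** Returns the expected markers that do not appear in the dump text. */
    public static List<String> missingMarkers(String dumpText) {
        String lower = dumpText.toLowerCase(Locale.ROOT);
        List<String> missing = new ArrayList<>();
        for (String marker : EXPECTED) {
            if (!lower.contains(marker.toLowerCase(Locale.ROOT))) {
                missing.add(marker);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ThreadDumpCheck <dump-file>");
            return;
        }
        List<String> missing = missingMarkers(Files.readString(Path.of(args[0])));
        System.out.println(missing.isEmpty()
                ? "Dump looks like it came from a BungeeCord process."
                : "Possibly the wrong process; missing markers: " + missing);
    }
}
```

If several markers are missing, the dump was most likely taken from a different Java process on the same machine.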

thefourcraft commented 2 years ago

@Janmm14 This is a thread dump taken while the bug was happening on the server: https://mclo.gs/BeB6fPf Three of us were on the server: one player on localhost, me from a public IP, and one player from an internal IP. The localhost player and I got timed out and we can't rejoin, but the players who are still online can run commands and have no lag or other problems.

Another dump, taken after the server froze completely and kicked all the players:

  1. https://www.toptal.com/developers/hastebin/aterefebav.properties
  2. https://www.toptal.com/developers/hastebin/icijuwopic.properties
thefourcraft commented 2 years ago

Errors in the log

Finally some errors, something that indicates we have a problem: disconnected with: SocketException : Connection reset @ sun.nio.ch.SocketChannelImpl:394

This is an error that appears on the screen for some players,

which means that the server kills the connection, or something.

thefourcraft commented 2 years ago

Update 7/14/2022

After more investigation, I found that the cause might be the dashboard we use to manage all of our servers: it didn't hook into Java correctly and would sometimes crash completely. We also discovered another problem in our players' IP-hiding system. Since then (7/13/2022) we have had no such issue 🔝

About Bungee 🌐

I am not sure why localhost didn't work either; this might indicate that there is still a problem with BungeeCord. Also, when taking the thread dumps we didn't use our dashboard: we ran the test on a fresh PC with a new Java installation, so there might be a bug in BungeeCord.

About Locklogin 🔒

LockLogin wasn't the cause of our problems. Its update system (with git or something) was broken, and we found an unrelated bug with Virtual IDs (an internal plugin function to protect against MySQL injections).

Thanks @Janmm14 for all the help (although the problem isn't quite clear 😆)

thefourcraft commented 2 years ago

Update 7/14/2022 | 20:00

Someone replied to my question on the forums 🥳, and we also had a misconfiguration with the forced hosts and a few other things:

  forced_hosts:
    localhost:25567: hub

  priorities:
  - Hub

server_connect_timeout: 5000

timeout: 5000

We are also following that person's recommendations:

If it's not a problem with your host or some bandwidth limitation, it has to be a plugin. It could be Player Analytics. I also don't recommend having LuckPerms on BungeeCord, for various reasons.

xism4 commented 2 years ago

BungeeCord does not have that issue by default; reproduce it on vanilla BungeeCord without plugins, etc.