EsotericSoftware / kryonet

TCP/UDP client/server library for Java, based on Kryo
BSD 3-Clause "New" or "Revised" License
1.81k stars 415 forks source link

IllegalArgumentException in Server selector thread Kills Server. #139

Closed samthaku closed 6 years ago

samthaku commented 6 years ago

Hi we have recently faced an issue with Kryo, we are using our own custom serialization based on SBE, plugged into Kryo.

We are trying to send the messages and have seen IllegalArgumentException occurring at line 188 in TcpConnection.java

The code at line 188 is below.

writeBuffer.position(writeBuffer.position() + lengthLength)

We need your confirmation on the following:

1.What we believe is that the buffer was full at the time when Kryo was trying to reserve 4 bytes for positon and because the buffer is already full, it is trying set position more than the size of the buffer. is this correct ?

2.We feel that there should have been a condition to check if buffer has enough capacity before setting position. a )Why such condition doesn't exist? b) are we using API in a wrong manner ?

3.Or there can be something else we are missing in our serialization. Please share your thoughts?

NathanSweet commented 6 years ago

It is assumed the write buffer is large enough. You can either send less and/or less often, size the buffer larger to match what you are sending, or accept that when the write buffer overflows the connection is probably dead. You can turn on debug or trace logging to see how much of the write buffer is used.

What would such a check do if the buffer is full? TcpConnection sendTCP can throw an exception from a write buffer overflow not just while setting the position for a message length, but also during serialization. It could throw a better exception, but not much else. I will move setting the position inside the try below, so it shows a nicer error, and catch Throwable instead. Connection sendTCP prevents IOException and KryoNetException from reaching the caller.

ghost commented 6 years ago

Nathan lets assume we have 1 slow client, which caused our buffer to get full, while every other client is living just fine.

Kryo might execute this on Server thread itself (when sending a heartbeat for example), in which case it will get to that line, see that the buffer is full, then die with a "better exception" and all our clients will experience downtime... because of one slow client. Also note although clients are smart enough to keep attempts to reconnect, server is not, it will simply die...

ghost commented 6 years ago

Or lets say we were aware enough to catch that exception, what are we supposed to do? we cant "fix" the selector thread its already out of the loop, so only option is to restart kryo server and again bother other clients. It is easy to guard against this in Serialization instance, we simply check that we never write beyond buffer.cap - 4... but that means user must know implementation details of the selector. Imho whatever kryoserver does, except dying is good. Like disconnect that slow client?

If you consider recovering i can make this change and send a pull request.