Open AlfieC opened 3 years ago
Can you show me how you call the code and the full stack trace? -2 is a valid zlib error code (stream error).
Bye Norman
these errors only originate when we use this in the pipeline:
private final BungeeZlib zlib = CompressFactory.zlib.newInstance();

@Override
public void handlerAdded(ChannelHandlerContext ctx) throws Exception {
    zlib.init(true, Deflater.DEFAULT_COMPRESSION);
}

@Override
public void handlerRemoved(ChannelHandlerContext ctx) throws Exception {
    zlib.free();
}

@Override
protected void encode(ChannelHandlerContext ctx, ByteBuf msg, ByteBuf out) throws Exception {
    int origSize = msg.readableBytes();
    if (origSize < 256) {
        writeVarInt(0, out);
        out.writeBytes(msg);
    } else {
        writeVarInt(origSize, out);
        zlib.process(msg, out);
    }
}

public static void writeVarInt(int val, ByteBuf out) {
    while ((val & -128) != 0) {
        out.writeByte(val & 127 | 128);
        val >>>= 7;
    }
    out.writeByte(val);
}
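For anyone following the framing here, this is a minimal JDK-only sketch of the varint length prefix that `writeVarInt` produces, together with a matching reader. The class name `VarIntDemo` and the `decode` helper are hypothetical and written for illustration; they are not BungeeCord's `DefinedPacket.readVarInt`, and a plain `byte[]` stands in for Netty's `ByteBuf`.

```java
import java.io.ByteArrayOutputStream;

class VarIntDemo {
    // Same loop as the writeVarInt posted above, targeting a byte array instead of a ByteBuf.
    static byte[] encode(int val) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((val & -128) != 0) {      // more than 7 significant bits remain
            out.write(val & 127 | 128);  // low 7 bits with the continuation bit set
            val >>>= 7;
        }
        out.write(val);                  // last byte, continuation bit clear
        return out.toByteArray();
    }

    // Hypothetical counterpart: reads one varint from the start of the array.
    static int decode(byte[] data) {
        int result = 0;
        int pos = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            if (pos >= data.length) {
                throw new IllegalArgumentException("input ended inside a varint");
            }
            int b = data[pos++];
            result |= (b & 127) << shift;
            if ((b & 128) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("varint longer than 5 bytes");
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 127, 128, 256, 300, Integer.MAX_VALUE}) {
            byte[] bytes = encode(v);
            System.out.println(v + " -> " + bytes.length + " byte(s) -> " + decode(bytes));
        }
    }
}
```

If the reader starts even one byte off, the decoded length is a different number entirely, which is one way a size check on the receiving side could start failing.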
code from other side:
private final int compressionThreshold;
private final BungeeZlib zlib = CompressFactory.zlib.newInstance();

@Override
public void handlerAdded(ChannelHandlerContext ctx) throws Exception
{
    zlib.init( false, 0 );
}

@Override
public void handlerRemoved(ChannelHandlerContext ctx) throws Exception
{
    zlib.free();
}

@Override
protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) throws Exception
{
    int size = DefinedPacket.readVarInt( in );
    if ( size == 0 )
    {
        out.add( in.slice().retain() );
        in.skipBytes( in.readableBytes() );
    } else
    {
        Preconditions.checkArgument( size >= compressionThreshold, "Decompressed size %s less than compression threshold %s", size, compressionThreshold);
        ByteBuf decompressed = ctx.alloc().directBuffer();
        try
        {
            zlib.process( in, decompressed );
            Preconditions.checkArgument( decompressed.readableBytes() == size, "Decompressed size %s is not equal to actual decompressed bytes", size, decompressed.readableBytes());
            out.add( decompressed );
            decompressed = null;
        } finally
        {
            if ( decompressed != null )
            {
                decompressed.release();
            }
        }
    }
}
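To make the two halves easier to compare, here is a rough JDK-only sketch of the same frame layout: a varint holding the uncompressed size (0 meaning "not compressed"), followed by either the raw payload or a zlib stream. It assumes a threshold of 256 on both sides, uses `java.util.zip` in place of `BungeeZlib`, and uses `byte[]` in place of `ByteBuf`; the class `FrameSketch` and its methods are hypothetical names, not code from either project.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FrameSketch {
    static final int THRESHOLD = 256; // assumption: mirrors the `origSize < 256` check in encode

    static void writeVarInt(int val, ByteArrayOutputStream out) {
        while ((val & -128) != 0) {
            out.write(val & 127 | 128);
            val >>>= 7;
        }
        out.write(val);
    }

    static byte[] encode(byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (payload.length < THRESHOLD) {
            writeVarInt(0, out);                    // size 0 marks an uncompressed frame
            out.write(payload, 0, payload.length);
            return out.toByteArray();
        }
        writeVarInt(payload.length, out);           // uncompressed size, then the zlib stream
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            deflater.setInput(payload);
            deflater.finish();
            byte[] chunk = new byte[8192];
            while (!deflater.finished()) {
                out.write(chunk, 0, deflater.deflate(chunk));
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    static byte[] decode(byte[] frame) {
        int pos = 0, size = 0, shift = 0;
        while (true) {                              // read the varint size prefix
            if (pos >= frame.length || shift >= 35) {
                throw new IllegalArgumentException("bad varint prefix");
            }
            int b = frame[pos++];
            size |= (b & 127) << shift;
            shift += 7;
            if ((b & 128) == 0) break;
        }
        if (size == 0) {
            return Arrays.copyOfRange(frame, pos, frame.length);
        }
        if (size < THRESHOLD) {                     // same condition as the first checkArgument
            throw new IllegalArgumentException(
                "Decompressed size " + size + " less than compression threshold " + THRESHOLD);
        }
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(frame, pos, frame.length - pos);
            byte[] result = new byte[size];
            int n = 0;
            while (n < size) {
                int r = inflater.inflate(result, n, size - n);
                if (r == 0) break;                  // stream ended early or input ran out
                n += r;
            }
            // Same idea as the second checkArgument: produced bytes must equal the prefix.
            if (n != size || inflater.inflate(new byte[1]) != 0) {
                throw new IllegalArgumentException("Decompressed size " + size + " does not match stream");
            }
            return result;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt zlib stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] big = new byte[1000];
        Arrays.fill(big, (byte) 'x');
        byte[] frame = encode(big);
        System.out.println("frame bytes: " + frame.length
            + ", round trip ok: " + Arrays.equals(big, decode(frame)));
    }
}
```

In this layout the threshold check can only fail if the prefix decodes to a value between 1 and 255, which a sender like the one above never writes. That suggests the receiver is not reading the prefix from the first byte of the frame.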
apologies, error here is this one:
Preconditions.checkArgument( size >= compressionThreshold, "Decompressed size %s less than compression threshold %s", size, compressionThreshold);
I'd normally attribute this to an error of ours but it only occurs when we use io_uring - no issues using epoll or nio
when we remove the checkArgument, we instead get the -2 on zlib decompression
I ran some tests a while ago, and it appears that the decoded buffer is made up of multiple NIO buffers, which is why you got the -2 from zlib: it can't decompress across multiple buffers
so is this a limitation of zlib? or bug?
thinking about this some more, I'm not sure why the issue only appears on io_uring - on epoll and nio there's no issue.
Actually, I've done some more testing. I have a custom fork of BungeeCord (https://github.com/SpigotMC/BungeeCord) with io_uring enabled and a custom fork of PaperSpigot (https://github.com/PaperMC/Paper) with io_uring enabled. When I launch both the BungeeCord server and the Spigot server with io_uring on, it doesn't work. I get this error in the BungeeCord logs:
[21:35:32] [Netty io_uring Worker #0/INFO]: [HookWood_] disconnected with: Exception Connecting:DecoderException : net.md_5.bungee.jni.NativeCodeException: Unknown z_stream return code : -3 @ io.netty.handler.codec.MessageToMessageDecoder:98
When I launch the Spigot server on epoll, it works. I don't know why yet and I'm going to dig into it further, but zlib compression doesn't work with io_uring enabled on both BungeeCord and Spigot.
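Since both -2 and -3 come up in this thread, a small lookup of zlib's return codes may help when reading the logs. The values are the constants defined in `zlib.h`; the class name `ZlibCodes` and its `name` helper are hypothetical and exist only for this sketch.

```java
import java.util.Map;

class ZlibCodes {
    // Return codes as defined in zlib.h.
    static final Map<Integer, String> NAMES = Map.of(
        0, "Z_OK",
        1, "Z_STREAM_END",
        2, "Z_NEED_DICT",
        -1, "Z_ERRNO",
        -2, "Z_STREAM_ERROR",
        -3, "Z_DATA_ERROR",
        -4, "Z_MEM_ERROR",
        -5, "Z_BUF_ERROR",
        -6, "Z_VERSION_ERROR");

    static String name(int code) {
        return NAMES.getOrDefault(code, "unknown (" + code + ")");
    }

    public static void main(String[] args) {
        System.out.println("-2 = " + name(-2));
        System.out.println("-3 = " + name(-3));
    }
}
```

Per the zlib manual, `Z_STREAM_ERROR` means the stream structure was inconsistent (for example a null `next_in` or `next_out`), while `Z_DATA_ERROR` means the input did not conform to the zlib format or failed its check value. The two reports above may therefore be hitting different failure paths.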
Can you provide a reproducer that I can run locally?
@AlfieC @HookWoods ping
OK, I will set that up when I'm home (in 3-4h).
> I ran some tests a while ago, and it appears that the decoded buffer is made up of multiple NIO buffers, which is why you got the -2 from zlib: it can't decompress across multiple buffers
While CompositeByteBuf exists in Netty, it only gives us a native memory address when it has exactly one component; otherwise it throws. So that cannot be the problem. It is not clear whether this is a Netty bug or a bug in BungeeCord's zlib usage. BungeeCord's native zlib has been overhauled since the last comment here, so the issue creator should test again as well.
I'd suggest to close this issue.
It sounds like the only variable between working and non-working systems is the io_uring transport, and in this case the error shows up as a corrupted (I guess) zlib stream. It could be that the io_uring transport doesn't set correct read- or write-offsets on the buffers in some cases, and it just happens to get caught by zlib because it sanity checks the data it gets.
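As a rough illustration of that mechanism only (it says nothing about what the io_uring transport actually does), the following JDK-only sketch compresses a payload and then inflates it from the correct offset and from an offset that is one byte off. `OffsetDemo` and `inflatesCleanly` are hypothetical names, and `java.util.zip.Inflater` stands in for the native zlib binding.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class OffsetDemo {
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                out.write(chunk, 0, deflater.deflate(chunk));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Inflates buf[offset..end] and reports whether zlib accepted it as a
    // complete stream producing exactly expectedSize bytes.
    static boolean inflatesCleanly(byte[] buf, int offset, int expectedSize) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(buf, offset, buf.length - offset);
            byte[] out = new byte[expectedSize + 1]; // one spare byte to notice overlong output
            int n = 0;
            while (!inflater.finished() && n < out.length) {
                int r = inflater.inflate(out, n, out.length - n);
                if (r == 0 && !inflater.finished()) {
                    return false;                    // ran out of input before the stream ended
                }
                n += r;
            }
            return inflater.finished() && n == expectedSize;
        } catch (DataFormatException e) {
            return false;                            // zlib rejected the data (Z_DATA_ERROR)
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1000];
        byte[] compressed = compress(payload);
        System.out.println("correct offset: " + inflatesCleanly(compressed, 0, payload.length));
        System.out.println("offset off by one: " + inflatesCleanly(compressed, 1, payload.length));
    }
}
```

A reader index that is off by a single byte is enough for zlib to reject the stream, because the two-byte zlib header no longer validates.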
any update on this?
This seems to still be an issue.
hey,
we have a pretty large codebase so I'll try to pull out the most important parts
we essentially dropped in epoll support whereas before we used nio - no issue there. we later deployed kernel 5 + io_uring with the io_uring module, and we started to have issues. we compress the network stream data with zlib (mostly implemented native to avoid copying bytebuf)
error flows from here: https://github.com/SpigotMC/BungeeCord/blob/master/native/src/main/c/NativeCompressImpl.cpp#L76
always errors showing -2. I would usually attribute this to a bug on our side, but the issue only surfaces when we put the "proxy" type server on io_uring, as epoll and nio work without issue. not sure what kind of logs you guys need but I can try to provide anything requested.