segfault when using trainFromBuffer or trainFromBufferDirect

egg82 commented 1 year ago

When using Zstd.trainFromBuffer or Zstd.trainFromBufferDirect the JVM segfaults with the following message:

# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff86ba2f870, pid=1909652, tid=1907032
#
# JRE version: OpenJDK Runtime Environment Corretto-17.0.6.10.1 (17.0.6+10) (build 17.0.6+10-LTS)
# Java VM: OpenJDK 64-Bit Server VM Corretto-17.0.6.10.1 (17.0.6+10-LTS, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
# Problematic frame:
# C  [libzstd-jni-1.5.4-18247428536747591362.dll+0x8f870]
#
# No core dump will be written. Minidumps are not enabled by default on client versions of Windows
#
# An error report file with more information is saved as:
# C:\Users\eggys\Servers\1.19.3\hs_err_pid1909652.log
#
# If you would like to submit a bug report, please visit:
#   https://github.com/corretto/corretto-17/issues/
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

The full report is available here. The contents exceed Github's length limit.

The code I'm (currently) using is below, although the same problem happens with trainFromBuffer which I took as an excuse to move to direct buffers. Unfortunately that method seems to be having issues, as well.

Relevant parts of ByteBufUtil.java:

    public static final ByteBufAllocator ALLOC = PooledByteBufAllocator.DEFAULT;

Relevant parts of ZstdDictionaryImpl.java:

    private static final Logger LOGGER = new GELFLogger(LoggerFactory.getLogger(ZstdDictionaryImpl.class));

    private int dictLen;
    private byte[] returnData;
    private final byte[] currentData = new byte[256_000];
    private final IntList sampleLengths = new IntArrayList();
    private final ByteBuf samples = Unpooled.buffer();

    public ZstdDictionaryImpl(@NotNull File file, byte @NotNull [] data) {
        this.file = file;
        this.validFile = isValid(file);
        this.fileData = new byte[data.length];
        this.returnData = new byte[data.length];
        this.dictLen = data.length;

        System.arraycopy(data, 0, this.fileData, 0, data.length);
        System.arraycopy(data, 0, this.currentData, 0, data.length);
        System.arraycopy(data, 0, this.returnData, 0, data.length);

        LOGGER.debug("Created dictionary for file \"{}\" with {} bytes of data", file.getAbsolutePath(), data.length);
        if (!validFile) {
            LOGGER.debug("Created dictionary from invalid file \"{}\"", file.getAbsolutePath());
        }
    }

            ByteBuf dictBuffer = ByteBufUtil.ALLOC.buffer(this.currentData.length);
            try {
                dictBuffer.writerIndex(this.currentData.length); // readerIndex needs to be <= writerIndex
                //dictBuffer.readerIndex(this.currentData.length); // nioBuffer uses readerIndex for its capacity (this is incorrect)
                dictBuffer.readerIndex(0);
                ByteBuffer db = dictBuffer.nioBuffer();
                db.rewind();
                long result = Zstd.trainFromBufferDirect(samples.nioBuffer(), sampleLengths.toIntArray(), db);

                if (Zstd.isError(result)) {
                    LOGGER.error("Zstd training error: {} (limit: {}, position: {}, remaining: {})", Zstd.getErrorName(result), db.limit(), db.position(), db.remaining());
                } else if ((int) result > 0) {
                    db.rewind();
                    db.get(this.currentData, 0, (int) result);

                    this.dictLen = (int) result;
                    System.arraycopy(this.currentData, 0, this.returnData, 0, this.dictLen);
                } else {
                    LOGGER.error("Zstd training result length is invalid: {}", (int) result);
                }
            } finally {
                if (!dictBuffer.release(dictBuffer.refCnt())) {
                    LOGGER.warn("Temp dictionary buffer was not able to be released. Ref: {}", dictBuffer.refCnt());
                }
            }

luben commented 1 year ago

trainFromBufferDirect requires samples and output dict to be direct byte buffer. Can you check you are not providing non-direct byte buffers?

egg82 commented 1 year ago

trainFromBufferDirect requires samples and output dict to be direct byte buffer. Can you check you are not providing non-direct byte buffers?

Somehow I forgot to add a critical line in my original post, the actual samples buffer. Thanks for reminding me! I've also edited the original post with the new line, below.

private final ByteBuf samples = Unpooled.buffer();

Looks like we use directBuffer instead of buffer elsewhere in our codebase where we use Zstd.compressDirectByteBufferFastDict, Zstd.compressDirectByteBuffer, Zstd.decompressDirectByteBufferFastDict, and Zstd.decompressDirectByteBuffer. That might help.

I'll fix that and get back to you with the result, but I wanted to mention something while I do that:

This same segfault also happens while using trainFromBuffer - a full report of that particular segfault can be found here. The code for that is similar to the above, except it uses byte arrays instead. Here is a screenshot of the diff.

egg82 commented 1 year ago

Looks like using Unpooled.directBuffer and ByteBufUtil.ALLOC.directBuffer fixed it! Still curious about the segfault while using Zstd.trainFromBuffer however.

luben commented 1 year ago

I am not sure why this happens. Here is quick check and it does work as expected:

scala> import scala.io._; import java.nio._; import com.github.luben.zstd.Zstd
...

scala> def source = Source.fromFile("src/test/resources/xml")(Codec.ISO8859).map{_.toByte}
def source: Iterator[Byte]

scala> val src = source.sliding(1024, 1024).take(1024).map(_.toArray)
val src: Iterator[Array[Byte]] = <iterator>

scala> var arr = new Array[Array[Byte]](1024)
...

scala> var i = 0
var i: Int = 0

scala> for (sample <- src) { arr(i) = sample; i+=1 }

scala> var dict = new Array[Byte](32*1024)
...

scala> Zstd.trainFromBuffer(arr, dict, false)
val res8: Long = 13359

scala> Zstd.trainFromBuffer(arr, dict, true)
val res9: Long = 32768

scala> dict
val res13: Array[Byte] = Array(55, -92, 48, -20, 93, -43, 24, 89, 56, 16, -72, -110, 40, 29, -4, -1, -1, -1, -1, -1, -1, -1, 95, 32, 65, 26, -39, 123, 117, -109, -101, 68, 106, 119, 35, -62, -26, 107, -116, 23, -3, 1, -48, 71, -56, 31, -59, -56, 104, 19, 49, -43, 16, 2, ...

egg82 commented 1 year ago

Huh, very odd. Might just be my configuration somehow? Coretto on Win 10 64-bit? Not sure.

luben commented 1 year ago

Yes, I also use Correto 11 but on Linux

luben commented 1 year ago

I think there was an issue with upstream, when the samples were less than 10. I put some guard-rails so that this not happen anymore with https://github.com/luben/zstd-jni/commit/253adafe345917b1221b2b285bb92def1e38d2af

luben / zstd-jni

segfault when using trainFromBuffer or trainFromBufferDirect #248