Open donatelloOo opened 5 months ago
@donatelloOo This looks like a low-level bug in how we write and read data from the segment. Could you give us some extra information on how to reproduce the issue?
Also, if you can share the table config & schema, it would be really helpful for further investigation.
This may be related to #12286, and I would suggest the same thing I suggested there. Could you try again, running Pinot with Java 17 or 21? Alternatively, could you change pinot.offheap.buffer.factory to org.apache.pinot.segment.spi.memory.unsafe.UnsafePinotBufferFactory? That change should be applied in the pinot-server.conf file.
This change would probably not fix the issue but may prevent the SIGSEGV.
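For reference, the suggested setting would look roughly like this, assuming pinot-server.conf uses the usual key=value properties syntax:

```properties
# Switch the off-heap buffer implementation to the Unsafe-based factory.
pinot.offheap.buffer.factory=org.apache.pinot.segment.spi.memory.unsafe.UnsafePinotBufferFactory
```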
Hi @gortiz, thanks for your answer. We already tried Java 17 (Amazon Corretto) but got the same issue. I will try the UnsafePinotBufferFactory and provide feedback soon.
@snleee I can work on a reproducer with shareable data/schema, I will let you know when it's available.
The only way to limit such errors seems to be to add JIT compiler exclusions like the ones below:
-XX:CompileCommand=exclude,org.apache.pinot.segment.spi.index.mutable.MutableForwardIndex::getDictId
-XX:CompileCommand=exclude,org.apache.pinot.segment.spi.memory.PinotByteBuffer::getInt
-XX:CompileCommand=exclude,org.apache.pinot.segment.spi.memory.PinotByteBuffer::getLong
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.segment.readers.PinotSegmentColumnReader::getValue
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.segment.creator.impl.SegmentColumnarIndexCreator::init
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex::getDictId
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex$ReaderWithOffset::getInt
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex$ReaderWithOffset::getLong
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex::getLong
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.dictionary.LongOffHeapMutableDictionary::get
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.dictionary.LongOffHeapMutableDictionary::getLongValue
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.io.reader.impl.FixedByteSingleValueMultiColReader::getInt
-XX:CompileCommand=exclude,org.apache.pinot.segment.local.io.reader.impl.FixedByteSingleValueMultiColReader::getLong
-XX:CompileCommand=exclude,java.nio.DirectByteBuffer::getInt
-XX:CompileCommand=exclude,jdk.internal.misc.ScopedMemoryAccess::getIntUnaligned
-XX:CompileCommand=exclude,jdk.internal.misc.ScopedMemoryAccess::getIntUnalignedInternal
...
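A minimal sketch of how flags like these might be passed to the JVM, assuming the server is started through a script that honors the JAVA_OPTS environment variable (an assumption, adjust to your deployment). Only two of the exclusions are shown. The single quotes matter: inner-class names contain `$`, which the shell would otherwise try to expand.

```shell
# Assumption: the Pinot server start script reads JAVA_OPTS.
# Single quotes stop the shell from expanding "$ReaderWithOffset".
JIT_EXCLUDES='-XX:CompileCommand=exclude,org.apache.pinot.segment.spi.memory.PinotByteBuffer::getInt'
JIT_EXCLUDES="$JIT_EXCLUDES "'-XX:CompileCommand=exclude,org.apache.pinot.segment.local.realtime.impl.forward.FixedByteSVMutableForwardIndex$ReaderWithOffset::getInt'

# Append to any JVM options that are already set.
export JAVA_OPTS="${JAVA_OPTS:-} $JIT_EXCLUDES"
```

Excluding methods from JIT compilation forces them to run in the interpreter, so this is likely to cost throughput on hot paths.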
While ingesting real-time data into a single table, we periodically get fatal SIGSEGV errors on all servers.
Context:
Table is composed of:
Below is a short extract of the core dump log (full one is attached).
See full core dump here: SIGSEGV-obf.log
How can we investigate this further?