Open Jayesh-Asrani opened 4 months ago
@xiangfu0 I think your PR https://github.com/apache/pinot/pull/13630 is for https://github.com/apache/pinot/issues/13604 and not this one.
I tried to reproduce the issue mentioned here, but the behaviour is as expected. Ingestion fails when ingesting the bad data ("data-bad.json"), and the segment remains in the CONSUMING state. But once additional records are available, it does transition to the ONLINE state.
See the attached files used for the investigation: pr_13626-config.json, pr_13626-schema.json, data-bad.json, data-good.json, data-good-2.json
I noticed that I had not included a sortedColumn in the table config in my previous comment. I added "studentID" as the sortedColumn in the tableIndexConfig, and even in that case the segment transitions to ONLINE state once it has sufficient entries.
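A minimal sketch of that change, assuming the usual Pinot table config layout; only the `sortedColumn` entry comes from the description above, and the attached pr_13626-config.json contains the full config and may differ in its other fields:

```json
{
  "tableIndexConfig": {
    "sortedColumn": ["studentID"]
  }
}
```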
e.g.
"A": "[1,2,3]"
2024/07/16 02:56:02.708 ERROR [MutableSegmentImpl_events_v11__127__157__20240715T1915Z_raw_data_analysis_poc] [events_v11__127__157__20240715T1915Z] failed to index value with inverted_index
java.lang.IndexOutOfBoundsException: Index 1130 out of bounds for length 10
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100) ~[?:?]
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106) ~[?:?]
    at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302) ~[?:?]
    at java.base/java.util.Objects.checkIndex(Objects.java:385) ~[?:?]
    at java.base/java.util.ArrayList.get(ArrayList.java:427) ~[?:?]
    at org.apache.pinot.segment.local.realtime.impl.invertedindex.RealtimeInvertedIndex.add(RealtimeInvertedIndex.java:60) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at org.apache.pinot.segment.spi.index.mutable.MutableInvertedIndex.add(MutableInvertedIndex.java:30) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.addNewRow(MutableSegmentImpl.java:707) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at org.apache.pinot.segment.local.indexsegment.mutable.MutableSegmentImpl.index(MutableSegmentImpl.java:533) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at org.apache.pinot.core.data.manager.realtime.RealtimeSegmentDataManager.processStreamEvents(RealtimeSegmentDataManager.java:631) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at org.apache.pinot.core.data.manager.realtime.RealtimeSegmentDataManager.consumeLoop(RealtimeSegmentDataManager.java:473) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at org.apache.pinot.core.data.manager.realtime.RealtimeSegmentDataManager$PartitionConsumer.run(RealtimeSegmentDataManager.java:707) ~[startree-pinot-all-1.2.0-ST.40-jar-with-dependencies.jar:1.2.0-ST.40-7689a6d2a3afecbda1413a231e895717cd937513]
    at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2024/07/16 02:56:02.710 ERROR [RealtimeSegmentDataManager_events_v11__73__157__20240715T1859Z] [events_v11__73__157__20240715T1859Z] Caught exception while indexing the record at offset: 479976251 , row: {
``