Open jadami10 opened 8 months ago
From the stack trace, it seems this is caused by the index not matching the segment metadata: the metadata stores the column as INT type, but the actual index is of a different type. Can you try a binary search over the segments (by applying a filter on $segmentName) and see whether this exception is thrown from all segments or just a few? Then we can find a bad segment and try to identify the mismatched column.
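The binary search suggested above could be scripted roughly as follows. This is only a sketch: `segment_filter` and `find_bad_segment` are hypothetical helper names, and `fails` is a callback you would supply yourself (for example, one that appends the generated filter to the failing query, runs it against the broker, and returns True when the exception comes back). It assumes a subset of segments fails whenever it contains at least one bad segment.

```python
from typing import Callable, List, Sequence


def segment_filter(segments: Sequence[str]) -> str:
    """Build a WHERE-clause fragment restricting a query to the given segments."""
    # Escape single quotes so segment names are safe inside SQL string literals.
    quoted = ", ".join("'" + s.replace("'", "''") + "'" for s in segments)
    return "$segmentName IN (" + quoted + ")"


def find_bad_segment(segments: Sequence[str],
                     fails: Callable[[List[str]], bool]) -> str:
    """Bisect `segments` down to a single segment for which `fails` is True.

    Assumes the query fails on the full list, i.e. at least one segment is bad.
    If several segments are bad, this returns one of them, not all.
    """
    candidates = list(segments)
    if not candidates:
        raise ValueError("segments must not be empty")
    while len(candidates) > 1:
        mid = len(candidates) // 2
        left = candidates[:mid]
        # Keep whichever half still reproduces the failure.
        candidates = left if fails(left) else candidates[mid:]
    return candidates[0]
```

Each round halves the candidate list, so a table with a few thousand segments needs only about a dozen probe queries to isolate one bad segment.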
It's not all segments, just the ones that were created before the column was added. I have an example of a server + segment, and I can clearly see in the index_map that .dictionary.startOffset and .dictionary.size are set. Reloading the segment doesn't change it, and forceDownload doesn't work because it's not supported for REALTIME tables.
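For checking many segments, the index_map inspection described above could be automated with something like the sketch below. It assumes the file consists of `<column>.<indexType>.<field> = <number>` lines, which is inferred from the key names quoted in this thread rather than from a format specification; `parse_index_map` and `has_dictionary` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict

IndexMap = Dict[str, Dict[str, Dict[str, int]]]


def parse_index_map(text: str) -> IndexMap:
    """Return {column: {index_type: {field: value}}} from index_map text."""
    result = defaultdict(lambda: defaultdict(dict))
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        # Split from the right so column names containing dots stay intact.
        parts = key.rsplit(".", 2)
        if len(parts) != 3:
            continue
        column, index_type, field = parts
        try:
            result[column][index_type][field] = int(value)
        except ValueError:
            continue  # Skip entries whose value is not an integer.
    return {column: dict(indexes) for column, indexes in result.items()}


def has_dictionary(index_map: IndexMap, column: str) -> bool:
    """True if both dictionary offset and size are recorded for the column."""
    entry = index_map.get(column, {}).get("dictionary", {})
    return "startOffset" in entry and "size" in entry
```

Running `has_dictionary` for the affected column across a good and a bad segment would show whether the dictionary entries appear only in segments that predate the column.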
What's interesting is that a new server that came online similarly failed to download the segment from deep store, but resetting the segment then fixed it.
[2023-10-31 15:05:01.898548] ERROR [SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] [HelixTaskExecutor-message_handle_thread_23:28] Caught exception in state transition from OFFLINE -> ONLINE for resource: <table_name>, partition: <segment_name>
[2023-10-31 15:05:01.898586] java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: Caught exception while adding segment: <segment_name>, table: <table_name>
[2023-10-31 15:05:01.898905] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadSegmentFromDeepStore(RealtimeTableDataManager.java:598) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.898941] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadAndReplaceSegment(RealtimeTableDataManager.java:564) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.898967] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:418) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.898995] at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:219) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899027] at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:168) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899041] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
[2023-10-31 15:05:01.899056] at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
[2023-10-31 15:05:01.899071] at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
[2023-10-31 15:05:01.899083] at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
[2023-10-31 15:05:01.899116] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:350) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899152] at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:278) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899175] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899202] at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899224] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
[2023-10-31 15:05:01.899237] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
[2023-10-31 15:05:01.899250] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
[2023-10-31 15:05:01.899261] at java.lang.Thread.run(Thread.java:829) [?:?]
[2023-10-31 15:05:01.899294] Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Caught exception while adding segment: <segment_name>, table: <table_name>
[2023-10-31 15:05:01.899326] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.replaceLLSegment(RealtimeTableDataManager.java:676) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899354] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.untarAndMoveSegment(RealtimeTableDataManager.java:617) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899383] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadSegmentFromDeepStore(RealtimeTableDataManager.java:595) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899389] ... 16 more
[2023-10-31 15:05:01.899411] Caused by: java.lang.RuntimeException: Caught exception while adding segment: <segment_name>, table: <table_name>
[2023-10-31 15:05:01.899440] at org.apache.pinot.segment.local.upsert.BasePartitionUpsertMetadataManager.doAddSegment(BasePartitionUpsertMetadataManager.java:147) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899466] at org.apache.pinot.segment.local.upsert.BasePartitionUpsertMetadataManager.addSegment(BasePartitionUpsertMetadataManager.java:101) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899497] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.handleUpsert(RealtimeTableDataManager.java:536) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899528] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:493) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899560] at org.apache.pinot.core.data.manager.BaseTableDataManager.addSegment(BaseTableDataManager.java:231) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899585] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.replaceLLSegment(RealtimeTableDataManager.java:674) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899616] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.untarAndMoveSegment(RealtimeTableDataManager.java:617) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899647] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadSegmentFromDeepStore(RealtimeTableDataManager.java:595) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899656] ... 16 more
[2023-10-31 15:05:01.899669] Caused by: java.lang.UnsupportedOperationException
[2023-10-31 15:05:01.899703] at org.apache.pinot.segment.spi.index.reader.ForwardIndexReader.getInt(ForwardIndexReader.java:372) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899742] at org.apache.pinot.segment.local.segment.readers.PinotSegmentColumnReader.getValue(PinotSegmentColumnReader.java:111) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899766] at org.apache.pinot.segment.local.upsert.UpsertUtils.getValue(UpsertUtils.java:150) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899803] at org.apache.pinot.segment.local.upsert.UpsertUtils$PrimaryKeyReader.getPrimaryKey(UpsertUtils.java:127) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899832] at org.apache.pinot.segment.local.upsert.UpsertUtils$RecordInfoReader.getRecordInfo(UpsertUtils.java:101) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899857] at org.apache.pinot.segment.local.upsert.UpsertUtils$1.next(UpsertUtils.java:53) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899886] at org.apache.pinot.segment.local.upsert.UpsertUtils$1.next(UpsertUtils.java:43) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899925] at org.apache.pinot.segment.local.upsert.ConcurrentMapPartitionUpsertMetadataManager.addOrReplaceSegment(ConcurrentMapPartitionUpsertMetadataManager.java:80) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899954] at org.apache.pinot.segment.local.upsert.BasePartitionUpsertMetadataManager.addSegment(BasePartitionUpsertMetadataManager.java:172) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.899982] at org.apache.pinot.segment.local.upsert.BasePartitionUpsertMetadataManager.doAddSegment(BasePartitionUpsertMetadataManager.java:144) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900020] at org.apache.pinot.segment.local.upsert.BasePartitionUpsertMetadataManager.addSegment(BasePartitionUpsertMetadataManager.java:101) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900057] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.handleUpsert(RealtimeTableDataManager.java:536) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900089] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:493) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900120] at org.apache.pinot.core.data.manager.BaseTableDataManager.addSegment(BaseTableDataManager.java:231) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900149] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.replaceLLSegment(RealtimeTableDataManager.java:674) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900178] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.untarAndMoveSegment(RealtimeTableDataManager.java:617) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900214] at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadSegmentFromDeepStore(RealtimeTableDataManager.java:595) ~[pinot-all-0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-jar-with-dependencies.jar:0.13.0-2023-07-21-f4e5f3de37-SNAPSHOT-f4e5f3de37ca02168579691d45303db1129cb4a6]
[2023-10-31 15:05:01.900220] ... 16 more
We got to this error by
noDictionaryColumns
Now, any queries that hit that json column on segments that were created before the column existed see
If you add where <json_column> is NOT NULL to the query, the queries do succeed, so there is at least a workaround.
This cluster is on a version of master from May 2023. I didn't see any related issues/fixes when searching for elements of this stack trace.