apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

TEXT_CONTAINS fails with NullPointerException #10934

Open zhuangdaz opened 1 year ago

zhuangdaz commented 1 year ago

I have a column with the native text index enabled, and my query fails with the following exception (more detailed logs):

[
  {
    "errorCode": 200,
    "message": "QueryExecutionError:\njava.lang.RuntimeException: Caught exception while running query: .*html.*\n\tat org.apache.pinot.segment.local.segment.index.readers.text.NativeTextIndexReader.getDocIds(NativeTextIndexReader.java:111)\n\tat org.apache.pinot.core.operator.filter.TextContainsFilterOperator.getNextBlock(TextContainsFilterOperator.java:52)\n\tat org.apache.pinot.core.operator.filter.TextContainsFilterOperator.getNextBlock(TextContainsFilterOperator.java:37)\n\tat org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:43)\n...\nCaused by: java.lang.NullPointerException"
  }
]

Query

select
  indexedString0
from unified_events
where (
    TEXT_CONTAINS (indexedString0, '.*selling.*')
    or TEXT_CONTAINS (indexedString0, '.*force.*')
)

I'm not sure whether this was introduced in version 0.12.1, but the same query worked fine with 0.12.0. (Slack discussion)

Jackie-Jiang commented 1 year ago

@zhuangdaz Are you running the latest master branch or release version 0.12.1? There should be no related changes between 0.12.0 and 0.12.1

zhuangdaz commented 1 year ago

@Jackie-Jiang we are running release version 0.12.1. I am not 100% confident this is due to the version change, as we have also ingested more data onto the cluster, so it is possible that some new data triggers an exception that was not hit before. Some extra context: LIKE (indexedString0, '%selling%') runs fine, but it requires a full scan instead of using the text index. Hope this is helpful.
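
For readers comparing the two predicates: the LIKE pattern '%selling%' and the regex '.*selling.*' passed to TEXT_CONTAINS express the same "contains" intent. The sketch below is only an illustration of that correspondence using plain Java string matching. The class LikeToRegex and its methods are hypothetical names, not Pinot code, and it makes no claim about how the native text index tokenizes or evaluates the pattern internally.

```java
public class LikeToRegex {
    // Hypothetical helper (not part of Pinot): translate a SQL LIKE pattern
    // into an equivalent Java regex. '%' becomes ".*", '_' becomes ".",
    // and regex metacharacters in the literal text are escaped.
    static String convert(String likePattern) {
        String meta = "\\.[]{}()*+?^$|";
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                if (meta.indexOf(c) >= 0) {
                    regex.append('\\');
                }
                regex.append(c);
            }
        }
        return regex.toString();
    }

    // Whole-value match, which is how a LIKE predicate treats its pattern.
    static boolean matches(String likePattern, String value) {
        return value.matches(convert(likePattern));
    }

    public static void main(String[] args) {
        System.out.println(convert("%selling%"));
        System.out.println(matches("%selling%", "best selling item"));
        System.out.println(matches("%force%", "best selling item"));
    }
}
```

This assumes LIKE escape sequences are not in use, and it ignores case sensitivity and multi-line values; it is meant only to show why the two queries are expected to select similar rows.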

zhuangdaz commented 1 year ago

@Jackie-Jiang it looks like it happens when it tries to retrieve a data buffer. More detailed log: https://gist.github.com/zhuangdaz/1ebef51f28ba87bb3f420201f2d12fe0

Jackie-Jiang commented 1 year ago

Per the stack trace, the exception is thrown from the native text index. @atris Can you help take a look? It seems ImmutableFST is using the OffHeapMutableBytesStore, which seems incorrect to me. We shouldn't need a mutable data structure to store an immutable index.

atris commented 1 year ago

I am looking into this