Stratio / cassandra-lucene-index

Lucene based secondary indexes for Cassandra
Apache License 2.0

ArrayIndexOutOfBounds with collections of UDTs #400

Open mark1ewis opened 6 years ago

mark1ewis commented 6 years ago

Cassandra Lucene Index version: 3.11.1.0
Cassandra version: 3.11.3

I have a consistently reproducible test case that throws an ArrayIndexOutOfBoundsException at query time, if each of the following is true (I've tested and taking out any item on this list causes it to work):

Note that the first two items on this list sound like #395, the last item is similar to #394, but this issue exposes the symptoms a bit differently. There may be a similar root cause for all 3 issues.

Here is a simple script to reproduce the problem:

CREATE TYPE child (hobby text);
CREATE TABLE test_table (name text, age int, kids list<frozen<child>>, primary key(name));

INSERT INTO test_table (name, age, kids) VALUES ('mark', 41, [ { hobby: 'programming' } ]);
INSERT INTO test_table (name, age, kids) VALUES ('jane', 39, [ { hobby: 'shopping' } ]);

CREATE CUSTOM INDEX test_index ON test_table ()
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
  'refresh_seconds': '1',
  'schema': '{
    fields: {
      name: { type: "string" },
      age: { type: "integer" },
      "kids.hobby": { type: "string" }
    }
  }'
};

SELECT name, age FROM test_table WHERE expr(test_index, '{
   filter: {type: "prefix", field: "name", value: "mark"},
   sort: { field: "age" }
}');

Expected Result

A single row is returned

Actual Result

ServerError: java.lang.ArrayIndexOutOfBoundsException is reported in cqlsh

Variants that do not break

Remove the sort

SELECT name, age FROM test_table WHERE expr(test_index, '{
   filter: {type: "prefix", field: "name", value: "mark"}
}');

This returns the expected single row.

Select the collection of UDTs

SELECT * FROM test_table WHERE expr(test_index, '{
   filter: {type: "prefix", field: "name", value: "mark"},
   sort: { field: "age" }
}');

This returns the expected single row.

SELECT kids FROM test_table WHERE expr(test_index, '{
   filter: {type: "prefix", field: "name", value: "mark"},
   sort: { field: "age" }
}');

This also returns the expected single row.
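
Taken together, the two variants above suggest that the error is avoided whenever the indexed collection column is part of the projection. A possible workaround, which I have not verified separately, is to add kids to the original column list and ignore it on the client side. The assumption here is that listing the column explicitly behaves the same way as SELECT * and SELECT kids:

```cql
-- Assumption: selecting kids next to name and age behaves like the
-- SELECT * and SELECT kids variants above (not verified in this report).
SELECT name, age, kids FROM test_table WHERE expr(test_index, '{
   filter: {type: "prefix", field: "name", value: "mark"},
   sort: { field: "age" }
}');
```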

Do not index the collection of UDTs

CREATE CUSTOM INDEX test_index ON test_table ()
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
  'refresh_seconds': '1',
  'schema': '{
    fields: {
      name: { type: "string" },
      age: { type: "integer" }
    }
  }'
};

The issue is not reproducible with the above index definition, which simply omits the collection of UDTs.
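
Because this definition reuses the name test_index, the index from the original script has to be dropped before this one can be created. A minimal sketch, assuming the session is already in the keyspace that holds test_table:

```cql
-- Drop the index created by the original script; IF EXISTS avoids an
-- error if it was never created in this keyspace.
DROP INDEX IF EXISTS test_index;
```

After the drop, the CREATE CUSTOM INDEX statement above can be run as written.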

mark1ewis commented 6 years ago

Almost forgot to include the stack trace. When I run the example listed above, here is the stack trace printed in system.log on the coordinator node:

ERROR [Native-Transport-Requests-1] 2018-08-30 21:33:20,387 QueryMessage.java:129 - Unexpected error during query
java.lang.ArrayIndexOutOfBoundsException: 0
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$.$anonfun$columns$8(ColumnsMapper.scala:215) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$.$anonfun$columns$8$adapted(ColumnsMapper.scala:214) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$$$Lambda$504/518807921.apply(Unknown Source) ~[na:na]
        at scala.collection.TraversableOnce.$anonfun$foldRight$1(TraversableOnce.scala:162) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractIterator.foldRight(Iterator.scala:1409) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractIterable.foldRight(Iterable.scala:54) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractTraversable.$colon$bslash(Traversable.scala:104) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$.columns(ColumnsMapper.scala:214) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$.columns(ColumnsMapper.scala:173) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$.columns(ColumnsMapper.scala:143) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper.columns(ColumnsMapper.scala:119) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper.$anonfun$columns$4(ColumnsMapper.scala:104) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$$Lambda$503/1181987260.apply(Unknown Source) ~[na:na]
        at scala.collection.TraversableOnce.$anonfun$foldRight$1(TraversableOnce.scala:162) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractIterator.foldRight(Iterator.scala:1409) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractIterable.foldRight(Iterable.scala:54) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractTraversable.$colon$bslash(Traversable.scala:104) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper.columns(ColumnsMapper.scala:103) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper.$anonfun$columns$3(ColumnsMapper.scala:89) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper$$Lambda$501/448380742.apply(Unknown Source) ~[na:na]
        at scala.collection.TraversableOnce.$anonfun$foldRight$1(TraversableOnce.scala:162) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractIterator.foldRight(Iterator.scala:1409) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractIterable.foldRight(Iterable.scala:54) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at scala.collection.AbstractTraversable.$colon$bslash(Traversable.scala:104) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper.columns(ColumnsMapper.scala:87) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.mapping.ColumnsMapper.columns(ColumnsMapper.scala:56) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexPostProcessor.document(IndexPostProcessor.scala:141) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexPostProcessor.$anonfun$top$1(IndexPostProcessor.scala:106) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexPostProcessor$$Lambda$498/1255448657.apply$mcVI$sp(Unknown Source) ~[na:na]
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:156) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexPostProcessor.top(IndexPostProcessor.scala:103) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexPostProcessor.process(IndexPostProcessor.scala:57) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.ReadCommandPostProcessor.apply(IndexPostProcessor.scala:168) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.ReadCommandPostProcessor.apply(IndexPostProcessor.scala:161) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at org.apache.cassandra.db.PartitionRangeReadCommand.postReconciliationProcessing(PartitionRangeReadCommand.java:408) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:2291) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.db.PartitionRangeReadCommand.execute(PartitionRangeReadCommand.java:263) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at com.stratio.cassandra.lucene.IndexQueryHandler.executeSortedLuceneQuery(IndexQueryHandler.scala:226) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexQueryHandler.executeLuceneQuery(IndexQueryHandler.scala:193) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexQueryHandler.processStatement(IndexQueryHandler.scala:122) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at com.stratio.cassandra.lucene.IndexQueryHandler.process(IndexQueryHandler.scala:101) ~[cassandra-lucene-index-plugin-3.11.1.0.jar:na]
        at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) ~[apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517) [apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.3.jar:3.11.3]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_51]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.3.jar:3.11.3]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.3.jar:3.11.3]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]

mark1ewis commented 6 years ago

Additional info: these steps also reproduce the error with Cassandra 3.11.1 and Cassandra Lucene Index 3.11.1.0, with 3.11.0 and 3.11.0.0, and with 3.7 and 3.7.4. So it's been there a long time, at least in the 3.x series. I cannot reproduce the issue with Cassandra 3.0.15 and Cassandra Lucene Index 3.0.15.0.
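
When repeating the test across versions, it helps to confirm which Cassandra release a node is actually running before executing the script. One way to check from cqlsh is to read the release_version column of system.local:

```cql
-- Reports the Cassandra release of the node that cqlsh is connected to.
SELECT release_version FROM system.local;
```

This only covers the Cassandra side. The plugin version is not reported by this query; in the stack trace above it is visible in the jar name (cassandra-lucene-index-plugin-3.11.1.0.jar).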