Open junaidnasir opened 6 years ago
This is the query that misbehaves:
SELECT * FROM alldev.temp WHERE expr(idx, '{ filter: { type: "range", field: "value", lower: 0, include_lower: true } }');
I also hit this problem. If my lucene column is not empty, then Spark does return results, so my only workaround is to put some unused value into that column. Is there any other resolution?
We have been really excited by the potential of the Lucene indexing provided by Stratio. We have an IoT platform that has been ingesting time-series data into a C* cluster for some time (the 3-node cluster now holds around 1 TB of data).
Initially we had latency issues when querying our sensor DB for windowed operations. Even though we generated time-based (per-day) keys, performance over Spark SQL did not turn out to be independent of the total amount of data stored.
We now have Lucene indexing enabled on top of it, and direct (CQL-based) queries to the DB, of the type we expect, are extremely fast.
We use the DataStax connector, which requires the hack of adding an empty column. However, we see that while the query is properly filtered by the Lucene index, Spark fetches the data, apparently disregards it, and returns null. This is the same issue as #79. That thread said it was fixed in 1.6.0, but it apparently still exists in 2.1.0. Any help resolving it would be highly appreciated.
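The behaviour can be sketched with a small standalone simulation (this is not the connector itself; the rows and the placeholder value are made up for illustration). Cassandra applies the Lucene predicate server-side, but the dummy `lucene` column comes back null, and Spark's implicit `IsNotNull(lucene)` check then drops every row; writing any non-null placeholder lets the rows survive:

```python
# Standalone simulation of the reported behaviour; row contents and the
# placeholder value are assumptions, not real connector output.

# Rows as Cassandra would return them: already matched by the Lucene
# index, but the dummy "lucene" column was never written, so it is
# null (None) in every row.
rows = [
    {"id": 1, "value": 5, "lucene": None},
    {"id": 2, "value": 9, "lucene": None},
]

def spark_null_check(rows):
    """Mimic the Filter isnotnull(lucene) step seen in the query plan:
    the equality on the lucene column is pushed down to Cassandra, but
    Spark still re-applies the null check on the returned rows."""
    return [r for r in rows if r["lucene"] is not None]

# Every row is discarded, matching the empty/null result seen in Spark.
print(spark_null_check(rows))  # -> []

# The workaround described in this thread: store some unused non-null
# value in the column so the null check passes.
patched = [dict(r, lucene="") for r in rows]
print(len(spark_null_check(patched)))  # -> 2
```

This matches the observation above that a non-empty lucene column makes Spark return results: the pushed-down Lucene predicate is evaluated by Cassandra either way, and only the re-applied null check differs.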
Using: Lucene index 3.11.0.0, Cassandra 3.11.0, datastax spark-cassandra-connector 2.0.3-s_2.11, Spark 2.1.1
The Spark plan for the same query is shown below; I think the problem is the filter isnotnull(lucene#3)