Hi all,
My team uses Apache Cassandra and cassandra-lucene-indexing in production.
Previously we ran Apache Cassandra 3.7.0 with cassandra-lucene-indexing 3.7.0. It worked well, with no performance problems.
We then upgraded Cassandra to 3.11.2 and cassandra-lucene-indexing to 3.11.1. As you mention, the two are compatible.
This is where the issue appears: sorted queries are now about 10x slower.
Environment:
- Nodes: 5
- 8 CPU cores, 32 GB RAM
- Vnodes, 256 tokens
- Replication factor: 2
- Total indexed rows: 2M
CREATE TABLE event (
pk1 bigint,
pk2 text,
pk3_clustering timestamp,
col1 text,
col2 text,
col3 text,
lucene text,
PRIMARY KEY ((pk1, pk2), pk3_clustering)
) WITH CLUSTERING ORDER BY (pk3_clustering DESC)
AND bloom_filter_fp_chance = 0.1
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'sstable_size_in_mb': '160', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.1'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 3110400
AND gc_grace_seconds = 14400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX event_index ON event (lucene) USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = {'refresh_seconds': '60', 'indexing_threads': '0', 'schema': '{
fields: {
pk3_clustering: {type: "date", pattern: "yyyy/MM/dd HH:mm:ssZ"},
pk2: {type: "string", case_sensitive: false},
pk1: {type: "bigint"},
col1: {type: "string", case_sensitive: false},
col2: {type: "string", case_sensitive: false},
col3: {type: "string", case_sensitive: false}
}
}'};
Performance:
CQL query
SELECT pk1 FROM event WHERE expr(event_index, '{"sort":{"fields":[{"type":"simple","field":"pk3_clustering","reverse":true}]}}') limit 100;
Cassandra version: 3.7.0, Cassandra-lucene-indexing: 3.7.0
Processing time: 300 ms
Cassandra version: 3.11.2, Cassandra-lucene-indexing: 3.11.1
Processing time: 3366 ms
Tracing
I see abnormal trace logs on the node that runs the CQL query:
Is this the root cause? I ask because I don't see this log on the old version (3.7.0).
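For anyone who wants to reproduce the trace, here is a minimal sketch of how it can be captured in cqlsh. It assumes a cqlsh session connected to one of the nodes with the keyspace containing `event` already selected; the query is the same one shown above.

```cql
-- Turn on request tracing for this cqlsh session
TRACING ON;

-- Run the sorted Lucene query; cqlsh prints the trace events after the result rows
SELECT pk1 FROM event WHERE expr(event_index, '{"sort":{"fields":[{"type":"simple","field":"pk3_clustering","reverse":true}]}}') LIMIT 100;

-- Turn tracing off again when done
TRACING OFF;
```

Running this on both the 3.7.0 and the 3.11.2 cluster should make it possible to compare the trace events side by side.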