Stratio / cassandra-lucene-index

Lucene based secondary indexes for Cassandra
Apache License 2.0

Sorting performance is down after upgrade to 3.11.1 version. #392

Open PhiVanTran opened 6 years ago

PhiVanTran commented 6 years ago

Hi all, my team uses Apache Cassandra with cassandra-lucene-index in production. Previously we ran Apache Cassandra 3.7.0 with cassandra-lucene-index 3.7.0, and we had no performance problems. We then upgraded Cassandra to 3.11.2 and cassandra-lucene-index to 3.11.1. As you mention, the two are compatible.

The issue appeared after the upgrade: sorted queries through the Lucene index are now about 10x slower.

Environment:

- Nodes: 5

- 8 CPU cores, 32 GB RAM

- Vnodes: 256 tokens

- Replication factor: 2

- Total indexed rows: 2M

CREATE TABLE event (
    pk1 bigint,
    pk2 text,
    pk3_clustering timestamp,
    col1 text,
    col2 text,
    col3 text,
    lucene text,
    PRIMARY KEY ((pk1, pk2), pk3_clustering)
) WITH CLUSTERING ORDER BY (pk3_clustering DESC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'sstable_size_in_mb': '160', 'tombstone_compaction_interval': '86400', 'tombstone_threshold': '0.1'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 3110400
    AND gc_grace_seconds = 14400
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX event_index ON event (lucene) USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = {'refresh_seconds': '60', 'indexing_threads': '0', 'schema': '{
    fields: {
     pk3_clustering: {type: "date", pattern: "yyyy/MM/dd HH:mm:ssZ"},
     pk2: {type: "string", case_sensitive: false},
     pk1: {type: "bigint"},
     col1: {type: "string", case_sensitive: false},
     col2: {type: "string", case_sensitive: false},
     col3: {type: "string", case_sensitive: false}
    }
}'};

Performance:

CQL query

  SELECT pk1 FROM event WHERE expr(event_index, '{"sort":{"fields":[{"type":"simple","field":"pk3_clustering","reverse":true}]}}') limit 100;
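The search expression is plain JSON, so it can be generated instead of written by hand. Below is a minimal Python sketch that rebuilds the exact query above. The helper name `build_sorted_query` is my own and is not part of the plugin or of any driver.

```python
import json

def build_sorted_query(table, index, field, reverse, limit):
    """Return a CQL SELECT that sorts on one field through the Lucene index.

    Hypothetical helper for illustration only; it does no escaping, so it
    must only be fed trusted identifiers.
    """
    search = {
        "sort": {
            "fields": [
                {"type": "simple", "field": field, "reverse": reverse}
            ]
        }
    }
    # Compact separators reproduce the hand-written expression byte for byte.
    expr = json.dumps(search, separators=(",", ":"))
    return f"SELECT pk1 FROM {table} WHERE expr({index}, '{expr}') limit {limit};"

query = build_sorted_query("event", "event_index", "pk3_clustering", True, 100)
print(query)
```

The resulting string can be pasted into cqlsh or passed to a driver session as-is.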

Cassandra version: 3.7.0, Cassandra-lucene-indexing: 3.7.0

Processing time: 300 ms

Cassandra version: 3.11.2, Cassandra-lucene-indexing: 3.11.1

Processing time: 3366 ms
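For reference, the slowdown factor behind the "about 10x" figure can be worked out from the two timings above. This is just arithmetic on the numbers I measured; the variable names are mine.

```python
old_ms = 300    # Cassandra 3.7.0 with cassandra-lucene-index 3.7.0
new_ms = 3366   # Cassandra 3.11.2 with cassandra-lucene-index 3.11.1

slowdown = new_ms / old_ms    # ratio of the two processing times
extra_ms = new_ms - old_ms    # added latency per query

print(f"{slowdown:.2f}x slower, {extra_ms} ms added per query")
```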

Tracing

I see an abnormal trace log entry on the node that ran the CQL query:

 Sending REQUEST_RESPONSE message to /172.31.26.147 [MessagingService-Outgoing-/172.31.26.147-Small] | 2018-06-22 04:03:44.243000 | 172.31.26.153 |          39280 | 127.0.0.1
Lucene post-process 14813 collected rows to 100 rows [Native-Transport-Requests-1] | 2018-06-22 04:03:47.564000 | 172.31.26.147 |        3366212 | 127.0.0.1
Request complete | 2018-06-22 04:03:47.565067 | 172.31.26.147 |        3367067 | 127.0.0.1
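To show where the time goes, here is a small Python sketch that parses the three trace rows above and computes the wall-clock gap between consecutive events. It assumes the columns are activity, timestamp, source, source_elapsed (in microseconds) and client, as printed by cqlsh tracing, and that the clocks of the two nodes are reasonably in sync. The function name `parse_trace_line` is my own.

```python
from datetime import datetime

TRACE = """\
Sending REQUEST_RESPONSE message to /172.31.26.147 [MessagingService-Outgoing-/172.31.26.147-Small] | 2018-06-22 04:03:44.243000 | 172.31.26.153 | 39280 | 127.0.0.1
Lucene post-process 14813 collected rows to 100 rows [Native-Transport-Requests-1] | 2018-06-22 04:03:47.564000 | 172.31.26.147 | 3366212 | 127.0.0.1
Request complete | 2018-06-22 04:03:47.565067 | 172.31.26.147 | 3367067 | 127.0.0.1
"""

def parse_trace_line(line):
    """Split one trace row into (activity, timestamp, source, elapsed_us)."""
    activity, timestamp, source, elapsed_us, _client = (
        part.strip() for part in line.split("|")
    )
    when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
    return activity, when, source, int(elapsed_us)

events = [parse_trace_line(line) for line in TRACE.splitlines() if line.strip()]

# Wall-clock gap before each event, relative to the previous row.
# The first two rows come from different nodes, so this relies on synced clocks.
gaps = [
    (cur[0], (cur[1] - prev[1]).total_seconds())
    for prev, cur in zip(events, events[1:])
]

for activity, seconds in gaps:
    print(f"{seconds:9.6f} s before: {activity[:50]}")
```

Almost all of the request time falls in the gap before the "Lucene post-process" entry, which is why that entry stands out to me.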

Is this the root cause? I ask because I don't see this log entry on the old version (3.7.0).
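To make the question concrete, this is how I picture a step that reduces many collected rows to the requested limit. I have not read the plugin's post-processing code, so this is only a generic top-N merge in Python under my own assumptions: the function name `post_process`, the number of partial result lists, and the sample data are all hypothetical.

```python
import heapq
import random

def post_process(partial_results, limit):
    """Merge partial result lists and keep the `limit` newest rows.

    Generic illustration only, not the plugin's implementation.
    """
    collected = [row for rows in partial_results for row in rows]
    # reverse: true on pk3_clustering means newest first.
    top = heapq.nlargest(limit, collected, key=lambda row: row["pk3_clustering"])
    return len(collected), top

# Hypothetical data: 150 partial lists of up to 100 rows each.
rng = random.Random(0)
partials = [
    [{"pk1": rng.randrange(10**6), "pk3_clustering": rng.randrange(10**9)}
     for _ in range(rng.randint(90, 100))]
    for _ in range(150)
]

collected_count, top_rows = post_process(partials, limit=100)
print(f"post-process {collected_count} collected rows to {len(top_rows)} rows")
```

A merge like this is cheap for a few thousand rows, so if the real step works differently, the cost must come from somewhere other than the merge itself.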