Stratio / cassandra-lucene-index

Lucene based secondary indexes for Cassandra
Apache License 2.0

Insertion speed drops while index files are being merged #354

Open ArgDS opened 7 years ago

ArgDS commented 7 years ago

I have four tables, three of which have a Lucene index. After writing 4 million records to each table, problems started with the merging of the Lucene index. The Lucene merge threads take all the CPU time, and Cassandra's write speed drops to a few requests per second. I have a cluster of five nodes. Cassandra version: 3.9. Plugin version: 3.9.6. My node configuration:

Data and indexes are on different drives. The replication factor for the keyspace is 3. General settings for indexes:

The index is partitioned by column, and the partitions are located on a separate disk. Each index has 128 partitions. I expected only a slight drop in write speed with the Lucene index, to several thousand records per second, but I get an average of about 50-70 records per second.

Which index or Cassandra settings can I tune to raise the write speed to several thousand records per second when using your plugin?

jpgilaberte commented 7 years ago

Hi @ArgDS,

Thank you for your interest in the project. I'll try to answer you.

Force segments merge:

The JMX interface allows you to force a complete merge of the index segments. This is a very heavy operation, similar to a C* compaction, that can significantly improve performance. Although this operation is not mandatory at all, you should consider using it if your system has off-peak hours that can be used for optimization tasks. The ideal scenario is to have the whole index in a single segment.

https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.14/doc/documentation.rst#force-segments-merge
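As a rough illustration of what triggering such a merge over JMX could look like from Java, here is a minimal sketch. It assumes the index MBean is registered under a name of the form `com.stratio.cassandra.lucene:type=Lucene,keyspace=...,table=...,index=...` and exposes a `forceMerge(int maxNumSegments, boolean doWait)` operation; please check both against the linked documentation for your plugin version. `IndexOps`, `FakeIndex`, and the `demo_*` names are hypothetical stand-ins so that the example runs without a Cassandra node.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class ForceMergeSketch {

    /** Hypothetical stand-in for the index MBean; only the operation used here. */
    public interface IndexOps {
        void forceMerge(int maxNumSegments, boolean doWait);
    }

    /** Fake index that records the last request instead of merging anything. */
    public static class FakeIndex implements IndexOps {
        public int lastMaxNumSegments = -1;
        public boolean lastDoWait = false;

        @Override
        public void forceMerge(int maxNumSegments, boolean doWait) {
            lastMaxNumSegments = maxNumSegments;
            lastDoWait = doWait;
        }
    }

    /** Builds the MBean name. The naming pattern is an assumption, not confirmed here. */
    public static ObjectName indexName(String keyspace, String table, String index)
            throws Exception {
        return new ObjectName("com.stratio.cassandra.lucene:type=Lucene"
                + ",keyspace=" + keyspace + ",table=" + table + ",index=" + index);
    }

    /** Invokes the (assumed) forceMerge operation through any JMX connection. */
    public static void forceMerge(MBeanServerConnection conn, ObjectName name,
                                  int maxNumSegments, boolean doWait) throws Exception {
        conn.invoke(name, "forceMerge",
                new Object[] {maxNumSegments, doWait},
                new String[] {"int", "boolean"});
    }

    public static void main(String[] args) throws Exception {
        // Local demo: register the fake MBean in this JVM and call it via JMX.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = indexName("demo_ks", "demo_table", "demo_idx");
        FakeIndex fake = new FakeIndex();
        server.registerMBean(new StandardMBean(fake, IndexOps.class), name);
        try {
            // One segment is the ideal case; doWait=true blocks until the merge ends.
            forceMerge(server, name, 1, true);
            System.out.println("maxNumSegments=" + fake.lastMaxNumSegments
                    + ", doWait=" + fake.lastDoWait);
        } finally {
            server.unregisterMBean(name);
        }
    }
}
```

Against a real node you would pass a remote `MBeanServerConnection` instead of the local server, for example one obtained with `JMXConnectorFactory.connect(...)` on the node's JMX port (7199 by default in Cassandra). The same operation can also be invoked by hand from a JMX console such as `jconsole`, which may be simpler for a one-off merge in off-peak hours.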

Hope this helps. Regards