instaclustr / cassandra-lucene-index

Lucene based secondary indexes for Cassandra
Apache License 2.0

Cassandra 5.0 confusion #30

Open SetoKaiba opened 4 weeks ago

SetoKaiba commented 4 weeks ago

For applications requiring advanced indexing features, such as full-text search or geospatial queries, users can consider external integrations, such as OpenSearch®, that offer numerous full-text search and advanced analysis features.

This plugin will not be upgraded for 5.0, and I'm a bit confused. How does Cassandra integrate with OpenSearch? I couldn't find a Cassandra integration for OpenSearch. Thank you.

mo-ansari99 commented 2 weeks ago

Hey @SetoKaiba

In the absence of support for the Cassandra Lucene Index in Cassandra 5.0, OpenSearch is a viable alternative for advanced search capabilities like full-text search and geospatial support. You can find more details on OpenSearch's full-text search capabilities here.

Integration of Cassandra with OpenSearch

Although there isn't a direct plugin for integrating Cassandra with OpenSearch, you can achieve this integration through a data pipeline tool or a custom ETL process. Here are some common approaches:

Apache Kafka: Use Kafka to capture changes in Cassandra and index the relevant data into OpenSearch.

Cassandra CDC: Utilize Cassandra's Change Data Capture (CDC) feature to capture data changes. Implement a CDC consumer that processes these changes and indexes them into OpenSearch.

Dual writes: Modify your application logic to write each change to both Cassandra and OpenSearch so that the two data stores stay in sync.
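As a rough illustration of the Kafka and CDC approaches above, the consumer's core job is to turn each captured change into an OpenSearch indexing or delete action. The sketch below renders a batch of change events as the newline-delimited JSON body that the OpenSearch `_bulk` endpoint accepts. The event shape (`op`, `key`, `row`) and the function name `to_bulk_body` are assumptions made for this example, not something Cassandra CDC or Kafka emits in this form; a real consumer would first map its own records into a shape like this.

```python
import json
from typing import Any, Iterable, Mapping


def to_bulk_body(events: Iterable[Mapping[str, Any]], index: str) -> str:
    """Render change events as an OpenSearch _bulk request body (NDJSON).

    Assumed (hypothetical) event shape:
        {"op": "upsert" | "delete", "key": <primary key>, "row": {...}}
    """
    lines = []
    for event in events:
        # Use the Cassandra primary key as the document id so that repeated
        # updates to the same row overwrite the same OpenSearch document.
        doc_id = str(event["key"])
        if event["op"] == "delete":
            lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
        else:
            # An "index" action is followed by the document source on the next line.
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps(event["row"]))
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string could then be sent as the body of a POST to `/_bulk` with an OpenSearch client or plain HTTP. Keying documents by the row's primary key keeps replays idempotent, which matters because both Kafka and CDC consumers may see the same change more than once.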

Admittedly, these approaches are more complex than using the Cassandra Lucene Index. However, Storage-Attached Indexing (SAI) in Cassandra 5.0 brings significant search capabilities, and more are planned for subsequent phases of SAI.