spring-projects / spring-data-cassandra

Provides support to increase developer productivity in Java when using Apache Cassandra. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-cassandra/
Apache License 2.0

Add index creation to table creation [DATACASS-213] #383

Closed: spring-projects-issues closed this issue 7 years ago

spring-projects-issues commented 9 years ago

Mark Nelissen opened DATACASS-213 and commented

Although the 1.2 version contains the @Indexed annotation, when you let Spring Data Cassandra create the table structures, it does not create the necessary secondary indexes on the annotated fields/columns.

I use the following sample code in a custom class of my project to achieve this:

CassandraPersistentEntity<?> persistentEntity = mappingContext.getExistingPersistentEntity(entityType);
// Create a secondary index for every property annotated with @Indexed.
persistentEntity.doWithProperties((CassandraPersistentProperty persistentProperty) -> {
    if (persistentProperty.isIndexed()) {
        operations.execute(CreateIndexSpecification.createIndex()
                .tableName(persistentEntity.getTableName())
                .columnName(persistentProperty.getColumnName())
                .ifNotExists(true));
    }
});

I am able to do this because my project needs to control the creation of the database structure, which forced me to write my own class against the not-so-public CassandraAdminOperations API.

As a side note, it would be nice if CassandraAdminOperations could be instantiated through Spring XML configuration as easily as the regular CassandraOperations (<cassandra:template/>).


Affects: 1.2 GA (Fowler)

Issue Links:

Referenced from: pull request https://github.com/spring-projects/spring-data-cassandra/pull/111

4 votes, 9 watchers

spring-projects-issues commented 9 years ago

Artem Bilan commented

I consider this issue a bug, because we can't use the @Indexed feature when we want to filter data with a WHERE clause.
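
For illustration, here is a minimal plain-Java sketch of the two CQL statements involved: the filtering query and the secondary index it depends on. The `users` table, the `email` column, the index naming scheme, and the `IndexCqlSketch` class are all hypothetical; none of this is Spring Data Cassandra API.

```java
public class IndexCqlSketch {

    // Builds the CQL for a regular secondary index.
    // The "<table>_<column>_idx" naming scheme is an assumption made for this example.
    public static String createIndex(String table, String column) {
        String indexName = table + "_" + column + "_idx";
        return "CREATE INDEX IF NOT EXISTS " + indexName
                + " ON " + table + " (" + column + ")";
    }

    // Builds a query that filters on a single column using a bind marker.
    public static String selectBy(String table, String column) {
        return "SELECT * FROM " + table + " WHERE " + column + " = ?";
    }

    public static void main(String[] args) {
        System.out.println(createIndex("users", "email"));
        System.out.println(selectBy("users", "email"));
    }
}
```

If `email` is neither part of the primary key nor indexed, Cassandra rejects the second statement unless `ALLOW FILTERING` is appended, which is why the missing index creation makes @Indexed ineffective for this kind of query.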

spring-projects-issues commented 8 years ago

David Dossot commented

Agreed on the bug qualification. It would be better to remove org.springframework.data.cassandra.mapping.Indexed if it's useless.

spring-projects-issues commented 8 years ago

Mark Paluch commented

I think it makes sense to provide index creation from Spring Data Cassandra. We already support creating tables, but not the indexes declared with @Indexed. The annotation lacks support for index options, but I guess it's not an issue to extend it.

spring-projects-issues commented 8 years ago

David Webb commented

Indexes in Cassandra are not like indexes in an RDBMS. In general, if you need an index, one of three scenarios likely applies:

  1. You may be using Cassandra as a datastore when you don't really have a "big data" use case
  2. The table is not designed properly using the partition key(s) and clustering key(s)
  3. You simply need to create a new table that is partitioned and clustered so that it is tailored to your query (most likely)
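
The third scenario can be sketched as follows, in plain Java that only builds CQL strings. Instead of indexing `email` on a `users` table, a second table partitioned by `email` serves the lookup directly. The table name `users_by_email`, its columns, and the `QueryTableSketch` class are hypothetical examples, not taken from this issue.

```java
public class QueryTableSketch {

    // CQL for a table whose partition key matches the query predicate.
    // user_id is a clustering column so several users can share one email partition.
    public static String createQueryTable() {
        return "CREATE TABLE IF NOT EXISTS users_by_email ("
                + "email text, user_id uuid, name text, "
                + "PRIMARY KEY ((email), user_id))";
    }

    // The lookup restricts only the partition key, so no secondary index is needed.
    public static String lookupByEmail() {
        return "SELECT user_id, name FROM users_by_email WHERE email = ?";
    }

    public static void main(String[] args) {
        System.out.println(createQueryTable());
        System.out.println(lookupByEmail());
    }
}
```

The trade-off is that the application has to write to both tables, in exchange for reads that touch a single partition.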

The original authors left this out on purpose based on real-world experience with large C* tables.

While this may look like it makes sense on a small scale for prototyping and single-node C* clusters, it will likely not scale to clusters of hundreds of nodes.

Example: You add @Indexed to a column on a table with billions of rows, then start your application. The startup may take hours. Indexes on existing large tables should be created external to SDC*.

References:
https://docs.datastax.com/en/cql/3.1/cql/ddl/ddl_primary_index_c.html
https://docs.datastax.com/en/cql/3.1/cql/ddl/ddl_when_use_index_c.html

spring-projects-issues commented 7 years ago

Marc Pynaert commented

In my opinion, if Cassandra allows it, it should be supported by Spring Data Cassandra. The choice to use indexes wisely must in the end be left to the developer, just like everything else.

Also, note that: "In Cassandra 3.4 and later, a new implementation of secondary indexes, SSTable Attached Secondary Indexes (SASI) have greatly improved the performance of secondary indexes and should be used, if possible." (https://docs.datastax.com/en/cql/3.3/cql/cql_using/useSecondaryIndex.html)
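
As a rough illustration of what a SASI index looks like at the CQL level, the plain-Java sketch below builds a `CREATE CUSTOM INDEX` statement. The index, table, and column names and the `SasiIndexSketch` class are hypothetical; only the `org.apache.cassandra.index.sasi.SASIIndex` class name comes from Cassandra itself.

```java
public class SasiIndexSketch {

    // Fully qualified name of Cassandra's SASI index implementation (Cassandra 3.4+).
    public static final String SASI_CLASS = "org.apache.cassandra.index.sasi.SASIIndex";

    // Builds the CQL for a SASI index with default options.
    public static String createSasiIndex(String indexName, String table, String column) {
        return "CREATE CUSTOM INDEX IF NOT EXISTS " + indexName
                + " ON " + table + " (" + column + ")"
                + " USING '" + SASI_CLASS + "'";
    }

    public static void main(String[] args) {
        System.out.println(createSasiIndex("users_email_idx", "users", "email"));
    }
}
```

SASI also accepts index options through a `WITH OPTIONS = {...}` clause, which this sketch leaves out.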

spring-projects-issues commented 7 years ago

John Blum commented

Committed to master for the Spring Data Cassandra Kay RC1 release.