spring-projects / spring-data-cassandra

Provides support to increase developer productivity in Java when using Apache Cassandra. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-cassandra/
Apache License 2.0

SchemaAction: Support for adding/removing columns [DATACASS-187] #358

Open spring-projects-issues opened 9 years ago

spring-projects-issues commented 9 years ago

Jens Rantil opened DATACASS-187 and commented

Currently Spring Data Cassandra allows creation and recreation of keyspaces on startup (SchemaAction). Spring Data JPA also supports this, but at the column level. That is, on startup it issues

ALTER TABLE mytable ADD thenewcolumn TYPE

I'm not sure whether they support removal of columns, and I leave it up to the implementer whether to implement that in this ticket or not.



spring-projects-issues commented 8 years ago

John Blum commented

SD Cassandra does not currently support automatic addition/removal of columns, based on the current definition of the persistent entity associated with the table requiring an update, using a Cassandra ALTER TABLE CQL (DDL) statement. Part of the reason is that the existing code treats all table (CQL DDL) operations (i.e. "schema actions") based on the selected entities as a create.

However, it is not too difficult to imagine (or even implement for that matter) updating a table based on the current persistent entity (code) definition, especially given the admin-based template supports the [alterTable(..)](https://github.com/spring-projects/spring-data-cassandra/blob/1.4.1.RELEASE/spring-data-cassandra/src/main/java/org/springframework/data/cassandra/core/CassandraAdminTemplate.java#L74-L77) operation (and more specifically, here, though technically, here). It is really akin to this, and in particular, this.

All of this is to say, I think it is easily doable, though testing will be quite extensive.

Currently, you could/should be able to accomplish the same thing using a SchemaAction of [RECREATE](https://github.com/spring-projects/spring-data-cassandra/blob/1.4.1.RELEASE/spring-data-cassandra/src/main/java/org/springframework/data/cassandra/config/SchemaAction.java#L35-L38). However, this is not as convenient and will certainly have an impact on existing data.

It is actually this last point ("existing data") that poses a problem, along with several other considerations that must be weighed carefully...

  1. First, properly handling the presence of existing data (to prevent accidental data loss, especially in critical environments).

NOTE: the SD Cassandra SchemaAction feature does not discriminate based on context and it would be all too easy for a developer to forget to change the schema action using Spring Profiles.
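One way to reduce that risk would be to fail fast when a destructive schema action is requested outside an explicitly allowed profile. The following is only a minimal sketch in plain Java and not part of SD Cassandra: `SchemaActionGuard`, its local `Action` enum (a stand-in for `SchemaAction`), and the profile names are all hypothetical. In a Spring application, the active profiles would typically come from `Environment.getActiveProfiles()`.

```java
import java.util.Set;

/** Hypothetical guard that rejects destructive schema actions outside allowed profiles. */
final class SchemaActionGuard {

    /** Local stand-in for the schema actions discussed here; NOT the Spring Data enum. */
    enum Action { NONE, CREATE, RECREATE }

    private SchemaActionGuard() {}

    /**
     * Returns the requested action if it is safe for the active profiles.
     *
     * @param requested           the schema action the configuration asks for
     * @param activeProfiles      the profiles currently active (assumed input)
     * @param destructiveProfiles the profiles in which RECREATE is acceptable (assumed input)
     * @throws IllegalStateException if RECREATE is requested while any active profile
     *                               is not explicitly allowed, or while no profile is active
     */
    static Action resolve(Action requested, Set<String> activeProfiles, Set<String> destructiveProfiles) {
        // Non-destructive actions pass through untouched.
        if (requested != Action.RECREATE) {
            return requested;
        }
        // Every active profile must be allow-listed; "no profile at all" is treated as unsafe.
        boolean allowed = !activeProfiles.isEmpty() && destructiveProfiles.containsAll(activeProfiles);
        if (!allowed) {
            throw new IllegalStateException(
                    "RECREATE is not permitted for active profiles " + activeProfiles);
        }
        return requested;
    }
}
```

Throwing rather than silently downgrading to a non-destructive action is a deliberate choice in this sketch: a forgotten profile then shows up as a startup failure instead of as missing tables or lost data.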

  2. While removals are easier to implement, additions (remove/re-add) and alterations (changes to existing fields) require more careful attention.

Consider a persistent entity type (e.g. Person) where a new field (i.e. "persistent property"; e.g. birthDate) is added...

@Table("People")
class Person {

  @NotNull
  String firstName;

  @NotNull
  String lastName;

  @NotNull
  Timestamp birthDate;
}

That would roughly translate to the following CQL DDL (assuming Cassandra had column constraints much like an RDBMS)...

ALTER TABLE applicationKeyspace.people ADD birth_date TIMESTAMP NOT NULL

Of course, in an RDBMS, constraints can be disabled while modifying the structure of a table, or defaults can be provided. But, this may not be practical in all cases.

  3. In addition, changing the type of an existing field/property of a persistent entity is not trivial and may require some sort of conversion for existing data. Note also that the CQL [ALTER TYPE](https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlRefAlterType.html) statement modifies user-defined types; changing the type of a table column, where Cassandra permits it at all, goes through ALTER TABLE instead.

  4. Field name changes based on the persistent property must also be handled properly, though this is easy to accomplish in CQL DDL...

ALTER TABLE table_name RENAME old_column_name TO new_column_name

But, to accomplish all of this, the existing table definition meta-data needs to be acquired and compared with the persistent entity to determine the changes.
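To make the comparison step concrete, here is a minimal sketch in plain Java, under stated assumptions. It takes two maps of column name to CQL type, one describing the existing table and one derived from the persistent entity, and produces the `ALTER TABLE` statements needed to reconcile them. `TableDiffSketch` and `alterStatements` are hypothetical names, not SD Cassandra API; how the two maps are populated (table meta-data on one side, entity mapping on the other) is deliberately left out, and identifiers are assumed to be validated already.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: diffs two column maps (name -> CQL type) into ALTER TABLE statements. */
final class TableDiffSketch {

    private TableDiffSketch() {}

    /**
     * @param table     table name, optionally keyspace-qualified (assumed already validated)
     * @param existing  columns currently present in the table
     * @param desired   columns derived from the persistent entity
     * @param allowDrop whether columns missing from the entity may be dropped
     * @return the ALTER TABLE statements, additions first, then drops
     * @throws IllegalStateException if a column exists with a different type
     */
    static List<String> alterStatements(String table, Map<String, String> existing,
            Map<String, String> desired, boolean allowDrop) {

        List<String> statements = new ArrayList<>();

        for (Map.Entry<String, String> column : desired.entrySet()) {
            String currentType = existing.get(column.getKey());
            if (currentType == null) {
                // Present on the entity but not in the table: add it.
                statements.add("ALTER TABLE " + table + " ADD "
                        + column.getKey() + " " + column.getValue());
            } else if (!currentType.equalsIgnoreCase(column.getValue())) {
                // Type changes may require data conversion, so refuse instead of guessing.
                throw new IllegalStateException("Column " + column.getKey() + " is "
                        + currentType + " but the entity expects " + column.getValue());
            }
        }

        if (allowDrop) {
            for (String name : existing.keySet()) {
                // Present in the table but no longer on the entity: drop it (data loss!).
                if (!desired.containsKey(name)) {
                    statements.add("ALTER TABLE " + table + " DROP " + name);
                }
            }
        }
        return statements;
    }
}
```

Note that a rename cannot be inferred from such a diff: it looks exactly like one dropped column plus one added column, which would discard the existing values. That is one reason drops are opt-in in this sketch and type mismatches are rejected outright.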

At any rate, it will take a bit more thought to design a reliable and elegant/maintainable solution. Simpler will be better initially, and we can round this capability out as we explore the feature further.