spring-projects / spring-data-cassandra

Provides support to increase developer productivity in Java when using Apache Cassandra. Uses familiar Spring concepts such as a template classes for core API usage and lightweight repository style data access.
https://spring.io/projects/spring-data-cassandra/
Apache License 2.0
381 stars 311 forks source link

Allow keyspace customization for CassandraPersistentEntity #921

Closed spring-projects-issues closed 3 months ago

spring-projects-issues commented 4 years ago

lmcdasi opened DATACASS-751 and commented

In the data stax 4.5 manual it is indicated that creating session/keyspace (https://docs.datastax.com/en/developer/java-driver/4.5/manual/core/) is an anti-pattern:

"// Anti-pattern: creating two sessions doubles the number of TCP connections opened by the driver"

You have a solution that uses the AbstractRoutingSessionFactory that just implement that anti-pattern.

But the data stax driver provides actually two different solutions to avoid having that antipattern.

  1. use the setKeypace when creating a Statement - but this is OK only if the version supported if > V4. Which would not be available for all.
  2. use the built-in QueryBuilder queries that have the keyspace as a parameter. Example: looking at the QueryBuilder.insertInto - it could be changed to use the keyspace since the insertInto signature allows for a a keyspace & a table name. The Statement builder will then build the query by having the keyspace + "." + tablename.

 

The impacted classes are CassandraTemplate & StatementFactory. I attach example code of above where I test the insertTo successfully.

At the same time, while the framework allows me in CassandraConfig to specify a bean to create a cassandraTemplate that can be different than the default CassandraTemplate it does not allow me to re-use the existing framework classes in order to implement my self a template & statement factory where I can specify the keyspace without being forced to copy them out of the framework.

The keyspace can be kept in a ThreadLocal - using the same principle used for AbstractRoutingSessionFactory .

If we cannot have a template as such in the framework, can we at least have the possibility to implement a CassandraTemplace without being forced to copy a lot of code in order to access the existing framework classes ? The template requires quite a few classes that are private.

This assumes that the keyspace schema are the same.

Thank you


Affects: 3.0 RC1 (Neumann)

Attachments:

1 votes, 4 watchers

spring-projects-issues commented 4 years ago

Mark Paluch commented

A key principle in Spring's connection handling, regardless of whether it's JDBC or Cassandra, is to not mutate the connection state, especially regarding the database/keyspace. Each mutation requires tracking to revert the state into its original state once an operation has completed. Additionally, setKeyspace requires a server roundtrip. Therefore, we refrain from using setKeyspace.

Note that keyspaces may be distributed across different Cassandra clusters, hence it's not possible to implement keyspace switching in such case via the built-in features.

Recently, we upgraded to the Cassandra 4 driver and extended QueryOptions to consider the DriverExecutionProfile. Keyspace selection could be implemented in a similar approach. QueryOptions and CassandraAccessor could get associated with a keyspace to provide support for various programming models so that the keyspace gets set on each Statement.

Applications may implement SessionFactory and issue a setKeyspace invocation themselves before returning a Session

spring-projects-issues commented 4 years ago

lmcdasi commented

There are models where the keyspaces will have the same schema. And my objective is to have one Session (or maybe more) that I can re-use for multiple keyspaces. At the same time I cannot use a version > V4 thus setKeyspace is not useful for me.

Also, I want to be able to use @Repository.

 

Thus the query must be of form : 'insert into keyspace.tablename .....'  which I can execute via any session.

I do not see how QueryOptions / DriverExecutionProfile can reset the keyspace on the session or can pre-build this type of query. Do you mind giving more hints ?

It is the StatementBuilder that pre-builds the query by calling:

StatementBuilder\ builder = getStatementFactory().insert(entityToUse, options, source.getPersistentEntity(), tableName);

which translates to:

StatementBuilder\ builder = StatementBuilder.of(QueryBuilder.insertInto(tableName).....

BUT the QueryBuilder do have the exact same methods with a keyspace !

/* Starts an INSERT query for a qualified table. / @NonNull public static InsertInto insertInto( @Nullable CqlIdentifier keyspace, @NonNull CqlIdentifier table)

{ return new DefaultInsert(keyspace, table); }

Which applies to all queries.

 

 

 

spring-projects-issues commented 4 years ago

lmcdasi commented

If we read the data stax manual: https://docs.datastax.com/en/developer/java-driver/4.5/manual/core/

"By default, a session isn’t tied to any specific keyspace. You’ll need to prefix table names in your queries. You’ll need to prefix table names in your queries: ...."

So again, I understand that in order to use @Repository it may make sense to tight a keyspace to a cqlsession - but that comes with a loss of functionality.

 

spring-projects-issues commented 4 years ago

Mark Paluch commented

Looking at the QueryBuilder API, we could introduce an additional getKeyspace method on CassandraPersistentEntity to allow keyspace customization and pass the keyspace on to QueryBuilder. Feel free to submit a pull request

spring-projects-issues commented 4 years ago

Tomasz Lelek commented

Hello Mark Paluch, do you need help with this ticket?

I can work on it and submit PR. Just to be clear on what was decided:

  1. CassandraPersistentEntity should have new methods - setKeyspace() and getKeyspace()
  2. It should work similarly to existing getTableName() and callers that are using getTableName() should now use getKeyspace() as well, i.e:
SimpleStatement insertQuery = getStatementFactory().insert(entity, options, persistentEntity, persistentEntity.getTableName()).build();

will become

SimpleStatement insertQuery = getStatementFactory().insert(entity, options, persistentEntity, 
persistenEntity.getKeyspace(), persistentEntity.getTableName()).build();

propagating it to the QueryBuilder.insertInto, selectFrom, etc calls. Similar changes for all usages.

If yes, then I have two additional questions:

  1. Where is the main place in code where setKeyspace should be invoked? Is this need to be propagated from the clients, then to Mapping and set in the CassandraMappingContext#processMappingOverrides?
  2. What about the backward compatibility of adding a new setKeyspace method to the CassandraPersistentEntity interface?  From what I see, the CassandraPersistentEntity is a public interface, and adding a new method declaration may break clients. Should we provide a default method that falls back to not taking keyspace into account?
spring-projects-issues commented 4 years ago

Mark Paluch commented

We haven't decided on an approach yet. Introducing new table coordinate elements is always subject to research and the actual implementation happens on the way to getting it done.

We have multiple aspects that play into this ticket:

Let's tackle the simple things first. We should consider the keyspace information optional as it wasn't there previously. CassandraPersistentEntity is an internal interface that isn't intended to be implemented code other than Spring Data so we're safe to change it. We don't have any means to configure the keyspace on the actual entity class as @Table accepts only the table name. We've learned that pinning the schema name/keyspace name to entities isn't the best thing to do as those may change in production and folks are locked in without a chance to update the keyspace name afterwards.

For Spring Data JDBC, were solved the schema name resolution through the NamingStrategy which gets called with the PersistentEntity object. I'd advise following the same pattern as NamingStrategy can be configured by users on CassandraMappingContext and they can resolve keyspace names for the actual entity. The step to e.g. letting applications implement their own annotations and look up application-specific annotations is a small one and a suggested pattern to customize Spring/Spring Data behavior.

Does that help to come up with a first draft?

spring-projects-issues commented 4 years ago

Tomasz Lelek commented

Regarding the:

For Spring Data JDBC, were solved the schema name resolution through the NamingStrategy which gets called with the PersistentEntity object. I'd advise following the same pattern as NamingStrategy can be configured by users on CassandraMappingContext and they can resolve keyspace names for the actual entity. 

So are you suggesting creating a new interface (similar to NamingStrategy) with only one property? (I.e. KeyspaceName)? Or do you prefer to add the additional keyspace field to the existing NamingStrategy?

Also, regarding the:

The step to e.g. letting applications implement their own annotations and look up application-specific annotations is a small one and a suggested pattern to customize Spring/Spring Data behavior. 

I am not sure why do you refer to annotations here: is this a part that should not be implemented in the SDC, but only end-users can use annotations as one of the mechanisms to create a proper KeyspaceName and set it on the CassandraMappingContext? (It will work similarly to CassandraMappingContext#setNamingStrategy)

 

spring-projects-issues commented 4 years ago

Mark Paluch commented

We introduced with version 3.0 a NamingStrategy interface that we should extend. Adding a default interface method retains backward compatibility, as NamingStrategy can be implemented by application code. By default, we would return Optional.<CqlIdentifier>empty().

I mentioned annotations as extension point for users. Indeed, it is not related to the code changes, however it makes sense to consider how users would go about keyspace customizations. One way would be a static definition for all entities. Other implementations may leverage a mapping. Since we pass in CassandraPersistentEntity, the NamingStrategy.getKeyspaceName() would have also access to the underlying entity class and an application could leverage own annotations as an additional mechanism for keyspace configuration

mp911de commented 3 months ago

Some good things take a bit longer until we get there. We introduced options to configure the keyspace for @Table and @UserDefinedType to be applied for schema actions and data interaction. Introducing TableCoordinate (see #179) is a too broad brush in terms of the API as the coordinates could be either the name or name and keyspace. We would break almost all signatures accepting the table name (or have to create overrides cluttering the API).

For now, we want to allow specific entities to be qualified for a specific keyspace. By configuring a KeyspaceProvider through StatementFactory we also allow further customization options.

Special honors go to @tomekl007 and @mipo256 for their contributions.