Add a QueryPostProcessor interface allowing users to receive a callback with the query generated from a Repository query method [DATACMNS-1268]

John Blum opened DATACMNS-1268 and commented

In Spring Data GemFire, users have requested (and have even submitted a PR for) the ability to "customize" a query that is generated by the Repository abstraction, i.e. by declaring interface query methods that follow the convention and generate data store specific queries (e.g. OQL in GemFire) from the Repository query method signature.

However, the utility of such a feature extends beyond the capabilities and context of just SD GemFire/Geode; it would in fact be useful and beneficial for any data store module extending and supporting the Spring Data Commons Repository abstraction.

For example, defining implementations of the QueryPostProcessor interface and declaring/registering them in the Spring context as proper beans, a user could...

1) Log all application Repository queries derived from query method signatures (both generated and manual, e.g. using a @Query annotation or a "named query" defined in an appropriate application properties file as specified by SD) in a technology-agnostic manner.

2) Gather metrics or record additional auditing information about the executed queries in an application at runtime.

While frameworks like Hibernate allow for the "actual" query to be logged or output, a "logging" QueryPostProcessor implementation would offer a wider range of control. For instance, it could tie into Spring Boot's Actuator to provide additional lower-level metrics that are not possible to acquire from Actuator itself.

3) Implement internal, query language extensions (e.g. Query Hints, Tracing or in SDG's case "Query Imports", etc) in the data store specific, SD modules (e.g. Spring Data GemFire).

4) Modify or return a new query in place of the generated query. Related, but slightly different, would be to add optimizations, limitations or other constraints to the query at runtime, perhaps only in certain contexts selected using Spring profiles.

5) This would also enable developers who are not in control of a library that provides application Repository interfaces for a certain domain to affect the queries that those Repository interfaces define. For example, if I have an "Ordering System" library with Products, Orders and LineItems, then as an application developer using this library, I may want to adjust the queries in the OrderRepository, such as setting a LIMIT or specifying other criteria to further qualify a query, where that has not been provided by the owners of the library.

Anyway, I am sure we can think of many other use cases for such a feature.
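As a rough illustration of use cases 1) and 2), the following sketch counts how often the query of each query method passes through a processor. It does not use the real interface: the nested QueryPostProcessor is a simplified, hypothetical stand-in that identifies the query method by name (a String) instead of a Spring Data QueryMethod, so that the example compiles without Spring on the classpath. QueryMetricsSketch, CountingQueryPostProcessor and countFor are likewise names invented for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class QueryMetricsSketch {

    // Hypothetical, simplified stand-in for the proposed interface:
    // the query method is identified by name rather than by a QueryMethod.
    @FunctionalInterface
    interface QueryPostProcessor<QUERY> {
        QUERY postProcess(String queryMethodName, QUERY query, Object... arguments);
    }

    // Counts callbacks per query method and hands the query back unchanged.
    static class CountingQueryPostProcessor implements QueryPostProcessor<String> {

        private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

        @Override
        public String postProcess(String queryMethodName, String query, Object... arguments) {
            counts.computeIfAbsent(queryMethodName, name -> new LongAdder()).increment();
            return query;
        }

        // Number of callbacks seen so far for the given query method (0 if none).
        long countFor(String queryMethodName) {
            LongAdder adder = counts.get(queryMethodName);
            return adder == null ? 0L : adder.sum();
        }
    }
}
```

A real implementation would presumably publish these counts to whatever metrics facility the application uses; that part is deliberately left out here.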

Currently the interface is defined like so...

package ...;

import ...;

@FunctionalInterface
@SuppressWarnings("unused")
public interface QueryPostProcessor<T extends Repository, QUERY> extends Ordered {

    Object[] EMPTY_ARGUMENTS = new Object[0];

    /**
     * Defines the {@link Integer order} of this {@link QueryPostProcessor} relative to
     * other {@link QueryPostProcessor QueryPostProcessors} in a sort.
     *
     * Defaults to the {@link Ordered#LOWEST_PRECEDENCE}.
     *
     * @return an {@link Integer} value specifying the {@link Integer order} of this {@link QueryPostProcessor}
     * relative to other {@link QueryPostProcessor QueryPostProcessors} in a sort.
     * @see org.springframework.core.Ordered#getOrder()
     */
    @Override
    default int getOrder() {
        return Ordered.LOWEST_PRECEDENCE;
    }

    /**
     * Callback method invoked by the Spring Data (SD) {@link Repository} framework to allow the user to process
     * the given {@link QUERY query} and (possibly) return a new or modified version of the {@link QUERY query}.
     *
     * This callback is invoked for {@literal queries} generated from the SD {@link Repository} {@link QueryMethod}
     * signature as well as {@literal queries} specified and defined in {@link NamedQueries}.
     *
     * @param queryMethod {@link QueryMethod} to which the {@link QUERY query} belongs.
     * @param query {@link QUERY query} to process.
     * @return a new or modified version of the same {@link QUERY query}.
     * @see org.springframework.data.repository.query.QueryMethod
     * @see #postProcess(QueryMethod, Object, Object...)
     */
    default QUERY postProcess(@NonNull QueryMethod queryMethod, QUERY query) {
        return postProcess(queryMethod, query, EMPTY_ARGUMENTS);
    }

    /**
     * Callback used to post process the given {@link QUERY query} and return possibly a new or modified version
     * of the {@link QUERY query}.
     *
     * This callback is invoked for {@literal queries} generated from the SD {@link Repository} {@link QueryMethod}
     * signature as well as {@literal queries} specified and defined in {@link NamedQueries}.
     *
     * @param queryMethod {@link QueryMethod} to which the {@link QUERY query} belongs.
     * @param query {@link QUERY query} to process.
     * @param arguments array of {@link Object Objects} containing the arguments to the {@link QUERY query} parameters.
     * @return a new or modified version of the same {@link QUERY query}.
     * @see org.springframework.data.repository.query.QueryMethod
     */
    QUERY postProcess(@NonNull QueryMethod queryMethod, QUERY query, Object... arguments);

    /**
     * Builder method used to compose, or combine, this {@link QueryPostProcessor}
     * with the given {@link QueryPostProcessor}.
     *
     * This {@link QueryPostProcessor} will come before the given {@link QueryPostProcessor} in the processing chain.
     *
     * @param queryPostProcessor {@link QueryPostProcessor} to compose with this {@link QueryPostProcessor}.
     * @return a composed {@link QueryPostProcessor} consisting of this {@link QueryPostProcessor}
     * followed by the given {@link QueryPostProcessor}.  Returns this {@link QueryPostProcessor}
     * if the given {@link QueryPostProcessor} is {@literal null}.
     * @see #processAfter(QueryPostProcessor)
     */
    @NonNull
    default QueryPostProcessor<?, QUERY> processBefore(@Nullable QueryPostProcessor<?, QUERY> queryPostProcessor) {
        return queryPostProcessor == null ? this : (queryMethod, query, arguments) ->
            queryPostProcessor.postProcess(queryMethod, this.postProcess(queryMethod, query, arguments), arguments);
    }

    /**
     * Builder method used to compose, or combine, this {@link QueryPostProcessor}
     * with the given {@link QueryPostProcessor}.
     *
     * This {@link QueryPostProcessor} will come after the given {@link QueryPostProcessor} in the processing chain.
     *
     * @param queryPostProcessor {@link QueryPostProcessor} to compose with this {@link QueryPostProcessor}.
     * @return a composed {@link QueryPostProcessor} consisting of the given {@link QueryPostProcessor}
     * followed by this {@link QueryPostProcessor}.  Returns this {@link QueryPostProcessor}
     * if the given {@link QueryPostProcessor} is {@literal null}.
     * @see #processBefore(QueryPostProcessor)
     */
    @NonNull
    default QueryPostProcessor<?, QUERY> processAfter(@Nullable QueryPostProcessor<?, QUERY> queryPostProcessor) {
        return queryPostProcessor == null ? this : (queryMethod, query, arguments) ->
            this.postProcess(queryMethod, queryPostProcessor.postProcess(queryMethod, query, arguments), arguments);
    }
}
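To make the ordering semantics of processBefore and processAfter concrete, here is a minimal sketch. It uses a reduced, hypothetical copy of the interface (the QueryMethod parameter, the Repository type parameter and Ordered are omitted so that it compiles without Spring), and the two marker processors A and B exist only for this example.

```java
class CompositionSketch {

    // Reduced stand-in for the interface above; only the composition logic is kept.
    @FunctionalInterface
    interface QueryPostProcessor<QUERY> {

        QUERY postProcess(QUERY query, Object... arguments);

        // This processor runs first; its result is passed to the given processor.
        default QueryPostProcessor<QUERY> processBefore(QueryPostProcessor<QUERY> next) {
            return next == null ? this : (query, arguments) ->
                next.postProcess(this.postProcess(query, arguments), arguments);
        }

        // The given processor runs first; its result is passed to this processor.
        default QueryPostProcessor<QUERY> processAfter(QueryPostProcessor<QUERY> previous) {
            return previous == null ? this : (query, arguments) ->
                this.postProcess(previous.postProcess(query, arguments), arguments);
        }
    }

    // Marker processors that append a tag, so the order of application is visible.
    static final QueryPostProcessor<String> A = (query, arguments) -> query + " [a]";
    static final QueryPostProcessor<String> B = (query, arguments) -> query + " [b]";

    static String aBeforeB(String query) {
        return A.processBefore(B).postProcess(query);
    }

    static String aAfterB(String query) {
        return A.processAfter(B).postProcess(query);
    }
}
```

With these definitions, aBeforeB("q") yields "q [a] [b]" and aAfterB("q") yields "q [b] [a]", and passing null to either composition method simply returns the original processor.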

Actual Spring Data GemFire Javadoc for the QueryPostProcessor interface is available here. Actual implementation, here.

The interface provides several useful functions...

1) First, and obviously, the interface's primary purpose is to serve as a contract between developers implementing QueryPostProcessors and Spring Data Commons' Repository abstraction and infrastructure, which invokes the callback, thereby allowing the developer to further inspect and possibly act on the Repository query method "query".

2) Since a user can define more than 1 QueryPostProcessor, then all the defined, declared and registered QueryPostProcessor implementations must be ordered in some manner, i.e. their precedence specified. This is accomplished with the extension of the core Spring Framework's org.springframework.core.Ordered interface.

3) Additionally, a user may want to supply more than 1 QueryPostProcessor implementation, which is useful to keep certain query processing concerns separate as well as to create a processing pipeline that processes 1 or more queries in an orderly fashion. For this, the interface provides both the processBefore(:QueryPostProcessor) and processAfter(:QueryPostProcessor) composition methods. These methods are not unlike the java.util.function.Function.andThen(:Function) and java.util.function.Function.compose(:Function) composition methods, respectively (NOTE: java.util.function.Predicate defines similar methods), and enable users to "compose" QueryPostProcessors programmatically (or inside a Spring FactoryBean, perhaps) using the Composite Design Pattern.

4) The QueryPostProcessor is a @FunctionalInterface and therefore can be conveniently used in Lambda expressions.

5) The postProcess(:QueryMethod, :QUERY, arguments:Object[]) method allows developers to apply QueryPostProcessors at whatever granularity they like. For example, a developer can apply QueryPostProcessors to specific QueryMethods on the application Repository interface.

6) QueryPostProcessors are typed to a Repository interface extension.

This will be used by Spring Data Commons (and is currently used by Spring Data GemFire/Geode) to "register" QueryPostProcessors with certain application Repositories.

For example, as a developer, I may want a general purpose LoggingQueryPostProcessor that logs all Repository query method queries across all my application Repositories. I might then define this LoggingQueryPostProcessor as...

class LoggingQueryPostProcessor implements QueryPostProcessor<Repository, String> {

  private final Logger logger = Logger.getLogger("queryLoggerName");

  @Override
  public int getOrder() {
    return 1;
  }

  @Override
  public String postProcess(QueryMethod queryMethod, String query, Object... arguments) {
    this.logger.info(String.format("Executing query [%s] with arguments [%s]", query, Arrays.toString(arguments)));
    // Logging only; hand the query back unchanged.
    return query;
  }
}

Additionally, I may want to define 1 or more QueryPostProcessors that are specific to the CustomerRepository...

class CustomerQueryPostProcessor implements QueryPostProcessor<CustomerRepository, String> {

  @Override
  public int getOrder() {
    return 0;
  }

  public String postProcess(QueryMethod queryMethod, String query, Object... arguments) {
    ...
  }
}

The CustomerQueryPostProcessor would only be registered with, and post process queries from, the CustomerRepository. This is how it is currently implemented in Spring Data GemFire/Geode, which takes advantage of the existing QueryCreationListener callback to handle QueryPostProcessor implementation registrations.

A few final closing thoughts on QueryPostProcessor interface...

1) Spring Data GemFire's current implementation of "Query Post Processing" behavior is to invoke the callback each and every time the query is executed and specifically, just before the query is executed.

2) The actual type of the query is generic to accommodate different representations of the data store specific query. For example, while there is a "compiled" form of a GemFire OQL query, represented by org.apache.geode.cache.query.Query (which is analogous to the JDBC java.sql.PreparedStatement), it is typically more common to manipulate the query in its raw form as a java.lang.String. But then, the QueryPostProcessor interface definition does not really care since it treats the query generically, thus leaving it up to the individual stores, or maybe even individual developers of the store, how best to handle and "process" the query.

3) The QueryPostProcessor would not be applied to standard CRUD methods, i.e. methods supported by a base implementation of Repository, which is typically provided by the specific Spring Data module (e.g. in SDG that would be the o.s.d.g.repository.support.SimpleGemfireRepository class), nor would it process any "custom" Repository implementations.

4) It would in fact still receive a callback for "manual" queries, or rather queries defined using the @Query annotation or a "named" query defined in a store specific properties file. Again, this is beneficial in cases where the user may have acquired a library with pre-defined Repository (e.g. OrderRepository) interfaces for which they have no control.

5) ???

Additional (implementation) details and examples will be provided in comments below.

Feedback welcomed.

Thank you!

Reference URL: https://jira.spring.io/browse/SGF-713

John Blum commented

By way of example, and to gain a better understanding of how the QueryPostProcessor, or rather "query post processing" logic, is implemented and gets applied/used at the individual store-level, I will share details of how I implemented this functionality/behavior in Spring Data GemFire/Geode.

ARCHITECTURE/DESIGN

First, you have to understand how the Spring Data Commons Repository abstraction and core infrastructure works across all SD modules for specific data stores. It begins in o.s.d.repository.core.support.RepositoryFactoryBeanSupport when it requests a Repository proxy implementation be created from, and based on, the user's application-defined Repository interface extension (e.g. like a CustomerRepository, which might extend CrudRepository<Customer, Long>). This call proceeds to create the Repository proxy implementation of the application-defined Repository interface.

Part of Repository proxy creation is to register the QueryExecutorMethodInterceptor, which identifies the actual "query methods" and constructs store-specific RepositoryQueries for them using the store-provided o.s.d.repository.query.QueryLookupStrategy. Query methods are all the methods that are backed neither by the base class (e.g. o.s.d.g.repository.support.SimpleGemfireRepository, which implements the standard CRUD and basic query data access operations, i.e. findAll(), findOne(), etc) nor by custom, user-provided Repository implementations.

This is where it gets interesting and specific to the data store, and where individual SD modules would typically inject query post processing capabilities, as I have done in SDG. Here is the implementation of SDG's QueryLookupStrategy. As you can see, SDG provides 2 implementations of o.s.d.repository.query.RepositoryQuery. The first is the o.s.d.g.repository.query.StringBasedGemfireRepositoryQuery, used here and here to handle both forms of "manual" queries, i.e. those defined with the @Query annotation or OQL queries resolved from a "named" query defined in a SD module specific properties file.

The other implementation is the o.s.d.g.repository.query.PartTreeGemfireRepositoryQuery used here which handles all query methods using the convention and thus constitute the "generated" GemFire OQL queries from the query method signature. Note, that internally, o.s.d.g.repository.query.PartTreeGemfireRepositoryQuery delegates to a o.s.d.g.repository.query.StringBasedGemfireRepositoryQuery.

This certainly made the callback easier, as did the fact that both PartTreeGemfireRepositoryQuery and StringBasedGemfireRepositoryQuery extend the abstract o.s.d.g.repository.query.GemfireRepositoryQuery class. This is the class to which I supplied the QueryPostProcessors meant to handle queries for query methods in that particular Repository, and it is also the class that provides the methods to register and access the QueryPostProcessors.

USE

As 1 example of how I applied and used the concepts of QueryPostProcessors in SDG itself, several releases ago (1.5?) I implemented certain "OQL" query language extensions on Repository query methods using annotations. I introduced the @Limit, @Hint, @Import and @Trace annotations.

While LIMIT and TRACE have relatively simple syntax and could have been incorporated into the query method signature parsing logic, HINT and IMPORT are another story.

No matter, my previous implementation of the "handling logic" for these annotations (before I introduced QueryPostProcessors) looked like this and was applied like so.

Now, with QueryPostProcessors, I have introduced this, implemented here and applied here.
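To give an idea of what annotation-driven post processing can look like, here is a small sketch for the LIMIT case only. It is not SDG's implementation: the @Limit annotation, the CustomerRepository interface and the applyLimit/process helpers are all hypothetical stand-ins defined inside the example, the query method is represented by a plain java.lang.reflect.Method, and the check for an existing LIMIT clause is deliberately naive.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class AnnotationSketch {

    // Hypothetical stand-in for a @Limit annotation on Repository query methods.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Limit {
        int value();
    }

    // Hypothetical application Repository interface.
    interface CustomerRepository {

        @Limit(5)
        Object findByLastName(String lastName);

        Object findByFirstName(String firstName);
    }

    // Appends a LIMIT clause if the query method is annotated with @Limit
    // and the query does not already appear to contain one (naive check).
    static String applyLimit(Method queryMethod, String query) {
        Limit limit = queryMethod.getAnnotation(Limit.class);
        if (limit == null || query.toUpperCase().contains(" LIMIT ")) {
            return query;
        }
        return query + " LIMIT " + limit.value();
    }

    // Convenience wrapper that looks up the query method by name.
    static String process(String methodName, String query) {
        try {
            Method method = CustomerRepository.class.getMethod(methodName, String.class);
            return applyLimit(method, query);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("Unknown query method: " + methodName, e);
        }
    }
}
```

In this sketch, only findByLastName has its query extended with "LIMIT 5"; queries for unannotated methods, or queries that already carry a LIMIT clause, are returned as they are.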

So, how do these QueryPostProcessors get picked up and registered with the individual application Repositories?

REGISTRATION

Since the o.s.d.g.repository.support.GemfireRepositoryFactoryBean is provided a reference to the Spring ApplicationContext, it was simple and natural to let developers define QueryPostProcessors as beans in the Spring context like any other beans, and discriminately register them with Repositories based on the generic type signature/arguments declared in the QueryPostProcessor implementations.

All that logic is implemented here.

The registration occurs here.
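The idea of matching processors to Repositories by their declared type argument can be sketched with plain Java reflection. This is only an approximation under stated assumptions, not the SDG code linked above (which may well resolve generics differently, for example with Spring's own type utilities): Repository, CustomerRepository, OrderRepository, the reduced QueryPostProcessor and both processor classes are hypothetical stand-ins, and only type arguments declared directly on the implementing class are resolved.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

class RegistrationSketch {

    interface Repository {}
    interface CustomerRepository extends Repository {}
    interface OrderRepository extends Repository {}

    // Reduced stand-in: keeps the Repository type parameter and the order.
    interface QueryPostProcessor<T extends Repository, QUERY> {
        QUERY postProcess(QUERY query, Object... arguments);
        default int getOrder() { return Integer.MAX_VALUE; } // lowest precedence
    }

    static class LoggingProcessor implements QueryPostProcessor<Repository, String> {
        public String postProcess(String query, Object... arguments) { return query; }
        public int getOrder() { return 1; }
    }

    static class CustomerProcessor implements QueryPostProcessor<CustomerRepository, String> {
        public String postProcess(String query, Object... arguments) { return query; }
        public int getOrder() { return 0; }
    }

    // Resolves the Repository type argument declared directly on the processor class;
    // falls back to Repository (applies to all) when it cannot be resolved.
    static Class<?> repositoryTypeOf(QueryPostProcessor<?, ?> processor) {
        for (Type type : processor.getClass().getGenericInterfaces()) {
            if (type instanceof ParameterizedType) {
                ParameterizedType parameterized = (ParameterizedType) type;
                Type first = parameterized.getActualTypeArguments()[0];
                if (parameterized.getRawType() == QueryPostProcessor.class && first instanceof Class) {
                    return (Class<?>) first;
                }
            }
        }
        return Repository.class;
    }

    // Selects the processors applicable to the Repository interface, sorted by order.
    static List<QueryPostProcessor<?, ?>> processorsFor(Class<?> repositoryInterface,
            List<QueryPostProcessor<?, ?>> candidates) {
        List<QueryPostProcessor<?, ?>> matches = new ArrayList<>();
        for (QueryPostProcessor<?, ?> candidate : candidates) {
            if (repositoryTypeOf(candidate).isAssignableFrom(repositoryInterface)) {
                matches.add(candidate);
            }
        }
        matches.sort((a, b) -> Integer.compare(a.getOrder(), b.getOrder()));
        return matches;
    }
}
```

Given both processors as candidates, the CustomerRepository would receive the CustomerProcessor followed by the LoggingProcessor (lower order values sort first, as with Spring's Ordered), while the OrderRepository would receive only the LoggingProcessor.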

EXAMPLE

The following is an example test class I wrote for SDG.

DOCUMENTATION

I wrote the following documentation on this feature in Spring Data Geode/GemFire's Reference Guide here
