SpineEventEngine / mc-java

Model Compiler for Java
Apache License 2.0

[Draft] Named Queries #68

Open armiol opened 3 years ago

armiol commented 3 years ago

Pathway for the UI

Inspired by the best practices of Domain-driven design, Spine encourages developers to build user interfaces on top of Projections. Under well-known conditions, each Projection instance is built asynchronously from the stream of domain Events. When the time comes, it is available for fast querying, skipping all the numerous JOINs and DISTINCTs.

However, in data-heavy applications, users deal with an increasing amount of displayed information. What was simple and comprehensible at first may require additional filtering and summarisation down the road. Therefore, as time passes, the client-side code of some apps may become overcomplicated.

The typical scenario for a certain UI element is as follows:

  1. A view element is built on the client side. It displays the data of a single Projection or a list of Projections by querying the server and transparently rendering the results.

  2. Someday more complexity is introduced to the view element. More complex Projections have to be built on the server side to keep the data available within a single query. While keeping the Projections up to date eats more CPU time, the data can still be fetched from the server within a single query.

  3. The amount of data increases even more. To display it conveniently in the UI, the client adds even more criteria when fetching the data from the server, maybe even introducing a faceted search or some grouping. Under these circumstances it becomes inefficient to build Projections for each combination of the parameters displayed in the UI. Therefore, the client code starts to send several queries and combine their results on the go.

In this scenario, the code at steps 1 and 2 is clearly testable: the state of Projections is tested via BlackBox, and the client-side code is tested for proper UI rendering and interactions.

However, when moving to step 3, things become different. The business logic of building the UI is now spread between the client-side and server-side code. It is no longer possible to test a single scenario without involving both sides in the test suite. Such integration tests are significantly more difficult to run and maintain. Also, the client-side code (especially the code in JS) may not be as strongly typed as the server-side code, and is thus more prone to errors.

A typical workaround here would be to create a server-side layer between the Bounded Context and the client-server transport. In this way, the business logic stays on the server side. However, there are issues with that, too.

Named Query

New use cases require better tooling. In the scope of this issue, a new concept is introduced: Named Queries.

They are designed to achieve the following goals:

  1. Provide a language extension for complex read-side views exposed by a Bounded Context.
  2. Code-generate the "boring" building blocks instead of implementing the same things over and over on the client side.
  3. Make building of the views testable via the BlackBox.
  4. Serve Named Queries via the existing QueryService.

How to use it

Declaration

Named Queries are declared as messages in Protobuf. Similar to commands, we introduce a convention: files with names ending in queries.proto are treated as containing the definitions of Named Queries:


// com/acme/backlog_queries.proto:

// Queries for Issues whose creation date is in the date range.
// 
// The resulting set of issues is grouped by the milestone to which each issue belongs.
message IssuesPerRange {

    // The first nested `message` is interpreted as the type of the inbound parameter.
    message Param {

        // The start of the date range, inclusive.
        LocalDate start = 1;

        // The end of the date range, exclusive.
        LocalDate end = 2;
    }

    // The second nested `message` is counted as a type of the query result. 
    message Result {
        // option (async) = true;       // option to tell the results are fed asynchronously.
        // option (repeated) = true;    // tells there may be many results of this type.

        repeated IssuesOfMilestone issues = 1;
    }
}
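
The `Param` message above describes a half-open date range: `start` is inclusive and `end` is exclusive. A minimal sketch of that semantics follows. The class `DateRangeSketch` is hypothetical and uses `java.time.LocalDate` as a stand-in for the `LocalDate` message type; it is not a part of the proposed API.

```java
import java.time.LocalDate;

/** Hypothetical stand-in for `IssuesPerRange.Param`, for illustration only. */
final class DateRangeSketch {

    private final LocalDate start; // inclusive
    private final LocalDate end;   // exclusive

    DateRangeSketch(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("`end` must not precede `start`.");
        }
        this.start = start;
        this.end = end;
    }

    /** Tells whether the given creation date falls into `[start, end)`. */
    boolean contains(LocalDate whenCreated) {
        return !whenCreated.isBefore(start) && whenCreated.isBefore(end);
    }
}
```

With this reading, a range from January 1 to February 1 covers all of January and nothing of February, so adjacent ranges never overlap.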

Code generation

The framework's code generation processes it into an abstract query handler:

// `Param` and `Result` below stand for `IssuesPerRange.Param` and `IssuesPerRange.Result`.
public abstract class IssuesPerRangeQueryHandlerBase extends QueryHandler {

    /**
     * Executes the query, returning a single {@code Result} by the given {@code Param}.
     */
    public abstract Result perform(Param parameter, QueryContext context);

    ///// Other possibilities (alternatives to the signature above):

    // With the `option (repeated) = true;`
    public abstract Iterator<Result> perform(Param parameter, QueryContext context);

    // With the `option (async) = true;`
    public abstract void perform(Param parameter, AsyncResult<Result> observer, QueryContext context);

    // With the `option (async) = true; option (repeated) = true;`
    public abstract void perform(Param parameter, StreamObserver<Result> observer, QueryContext context);
}


Users of the framework can then extend the IssuesPerRangeQueryHandlerBase, filling in the perform(..) methods with the actual query processing:

final class IssuesPerRangeQueryHandler extends IssuesPerRangeQueryHandlerBase {

    @Override
    public Result perform(Param parameter, QueryContext context) {
        // ... the actual query processing, which produces the `result`.
        return result;
    }
}

From the conceptual perspective, such a handler is a Domain Service on the Query side of an application.

Executing intermediate EntityQuery

Spine also introduces a QueryHandler, which is the base type for all handlers of Named Queries. Its API allows executing Entity Queries, so that a concrete query handler can combine the output of intermediate Entity Queries into the final result:

public abstract class QueryHandler {    

    // ...

    /**
     * Executes the given Entity Query in scope of the enclosing Bounded Context 
     * and returns the iterator over the results.
     *
     * @param <S> the type of the entity state which is being queried
     */
    protected final <S extends EntityState<?>> Iterator<S> execute(EntityQuery<?, S, ?> query) {...} 

}

// Generated by Spine Compiler
public abstract class IssuesPerRangeQueryHandlerBase extends QueryHandler {..}

final class IssuesPerRangeQueryHandler extends IssuesPerRangeQueryHandlerBase {

    @Override
    public Result perform(Param parameter, QueryContext context) {
        IssueView.Query query =
                IssueView.query()
                         .whenCreated().isGreaterOrEqualTo(parameter.getStart())
                         .whenCreated().isLessThan(parameter.getEnd())
                         .build();

        Iterator<IssueView> iterator = execute(query);              

        Result result = groupByMilestones(iterator);                        
        return result;
    }
}
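
The handler above leaves `groupByMilestones(..)` undefined. One way it might look is sketched below, under stated assumptions: `IssueView` and `IssuesOfMilestone` are replaced with hypothetical plain Java stand-ins, since their real fields are not given here, and the method returns the list of groups rather than the `Result` message which would wrap it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupByMilestoneSketch {

    /** Hypothetical stand-in for the `IssueView` projection state. */
    static final class IssueView {
        final String id;
        final String milestone;

        IssueView(String id, String milestone) {
            this.id = id;
            this.milestone = milestone;
        }
    }

    /** Hypothetical stand-in for the `IssuesOfMilestone` message. */
    static final class IssuesOfMilestone {
        final String milestone;
        final List<IssueView> issues = new ArrayList<>();

        IssuesOfMilestone(String milestone) {
            this.milestone = milestone;
        }
    }

    /** Groups the issues by milestone, keeping the order in which milestones are first seen. */
    static List<IssuesOfMilestone> groupByMilestones(Iterator<IssueView> issues) {
        Map<String, IssuesOfMilestone> groups = new LinkedHashMap<>();
        while (issues.hasNext()) {
            IssueView issue = issues.next();
            groups.computeIfAbsent(issue.milestone, IssuesOfMilestone::new)
                  .issues
                  .add(issue);
        }
        return new ArrayList<>(groups.values());
    }
}
```

Because the grouping is a pure function over an iterator, it can be unit-tested without any client-side code, which is the testability benefit this proposal is after.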

Registration in BoundedContext

Instances of Named Query handlers should be registered in the respective BoundedContext:

final class IssuesPerRangeQueryHandler extends IssuesPerRangeQueryHandlerBase {..}

// ...

QueryHandler issuesPerRangeHandler = new IssuesPerRangeQueryHandler();
BoundedContext
    .singleTenant("Issues")
    // ...
    .register(issuesPerRangeHandler);
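
How a Bounded Context would keep the registered handlers and route an incoming Named Query to one of them is not specified in this draft. The sketch below shows one possible approach: a registry keyed by the type of the query parameter. The class `NamedQueryDispatcherSketch` and its methods are hypothetical, and handlers are modelled as plain functions rather than `QueryHandler` instances.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** A hypothetical registry routing a query parameter to the handler registered for its type. */
final class NamedQueryDispatcherSketch {

    private final Map<Class<?>, Function<?, ?>> handlers = new HashMap<>();

    /** Registers the handler for the given parameter type, rejecting duplicates. */
    <P> void register(Class<P> paramType, Function<P, ?> handler) {
        if (handlers.putIfAbsent(paramType, handler) != null) {
            throw new IllegalStateException(
                    "A handler is already registered for `" + paramType.getName() + "`.");
        }
    }

    /** Finds the handler by the runtime type of the parameter and executes it. */
    @SuppressWarnings("unchecked") // `register` ties each handler to its parameter type.
    Object read(Object param) {
        Function<Object, ?> handler = (Function<Object, ?>) handlers.get(param.getClass());
        if (handler == null) {
            throw new IllegalArgumentException(
                    "No handler is registered for `" + param.getClass().getName() + "`.");
        }
        return handler.apply(param);
    }
}
```

Rejecting a second handler for the same parameter type keeps the routing unambiguous: each Named Query has exactly one place where its result is built.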

Exposure via QueryService

TODO: discuss this matter one more time.

At the moment we have a single endpoint in the QueryService:

// A service for querying the read-side from clients.
service QueryService {

    // Reads a certain data from the read-side by setting the criteria via Query.
    rpc Read(Query) returns (QueryResponse);
}

It's really difficult to re-use the current Query and QueryResponse types, as the resulting values may not be Entities. Therefore, we'll probably have to introduce one more endpoint:

// A service for querying the read-side from clients.
service QueryService {

    // ... — this one we have already.
    rpc Read(Query) returns (QueryResponse);

    // A newly introduced endpoint.
    rpc Read(NamedQuery) returns (NamedQueryResponse);
}

Still to discuss:

armiol commented 3 years ago

@dmitrykuzmin @dmdashenkov @yuri-sergiichuk PTAL at the draft of the new feature for 2.x we have discussed with @alexander-yevsyukov.

dmdashenkov commented 3 years ago

Looks cool. I'd use Protobuf services instead of messages:

service IssuesPerRange {
    rpc Between(TimeSpan) returns (stream IssuesOfMilestone);
}

message TimeSpan {
    LocalDate start = 1;
    LocalDate end = 2;
}

The generated Java code would look like this:

final class IssuesPerRangeQueryHandler extends IssuesPerRangeQueryHandlerBase {

    IssuesPerRangeQueryHandler(QueryService queryService) {
        super(queryService); // Inject `QueryService` to execute the queries.
    }

    @Override
    public void perform(Param parameter, QueryContext context, StreamObserver<IssueView> resultObserver) {
        // Execute the query.
    }
}

This way we avoid most of the hidden knowledge about the way we declare such queries. No conventions for first message declaration, no message types to wrap collections, etc.

Also, this may lead us to expose named queries as separate endpoints, be it gRPC or HTTP, which might not be the worst idea.

yuri-sergiichuk commented 3 years ago

@armiol @dmdashenkov great ideas!

The only suggestion I'd add is to allow developers to use some friendlier approach than writing StreamObservers.

From my experience, we end up writing 2-3 more-or-less similar custom StreamObserver abstractions in every project. I don't want to write any examples right away, but maybe we should consider at least checking smth like RxJava or Reactor?

armiol commented 3 years ago

@dmdashenkov

The idea is to have an extension to the Ubiquitous Language, so that a Named Query is a part of a Bounded Context. I think a message defined inside a context looks much more like a query than a service, which kind of lies outside of the context. However, I understand your "hidden knowledge" concern.

We also don't want to give the impression that a user-defined gRPC service will be exposed from some Bounded Context, as it will not. We want to be in control of the data structures travelling back and forth: any request should come with the actor context and other attributes. Any response should also have a "compartment" for carrying the failure report. That's the reason behind the intention to incorporate the handling of such queries into our QueryService. In this way, we keep the API strict and uniform for our client libraries on all platforms, as they are able to take care of supplying the context for the request and handling the response in a human-friendly way.

Recently, the Chromium team declared an intent to remove support for HTTP/2 Server Push, as nobody uses it. If accepted, that would ultimately put an end to our hopes of interacting with the browser JS via gRPC someday. Also, it means we'll have to define (or code-generate) something like Servlets for any gRPC service we use. To me, it's not really convenient for Spine users to define a "servlet" per Named Query they have. That's another point in favour of the decision not to let users define their own gRPC services.

While we are at it, after reading about the Server Push EOL I feel we might want to migrate off gRPC for our front-end interactions altogether. Therefore, I'd keep the front-facing API as brief as possible, so that any changes we make to it (e.g. to QueryService) are local to the framework internals.

armiol commented 3 years ago

@yuri-sergiichuk we might have our own "StreamObserver"s eventually. However, while we have gRPC at our side, I don't see any reason not to use it. Also, that is a low-level abstraction. In our client libraries we already have (JavaScript) or might have other means to deal with the query and subscription results.

As to the server-side API, I agree we could façade it someday. But I think any alternative type will mostly repeat the essence of StreamObserver.
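
To illustrate the kind of thin abstraction discussed in this exchange, here is a minimal sketch of an observer which collects all streamed results into a single future. It is an assumption for illustration only, not an existing Spine API. To stay self-contained, it declares a local `ResultObserver` interface mirroring the three callbacks of gRPC's `StreamObserver` (`onNext`, `onError`, `onCompleted`) instead of depending on gRPC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CollectingObserverSketch {

    /** Local stand-in mirroring the three callbacks of gRPC's `StreamObserver`. */
    interface ResultObserver<T> {
        void onNext(T value);
        void onError(Throwable t);
        void onCompleted();
    }

    /** Collects all streamed results and exposes them as a single future. */
    static final class Collecting<T> implements ResultObserver<T> {

        private final List<T> received = new ArrayList<>();
        private final CompletableFuture<List<T>> future = new CompletableFuture<>();

        @Override
        public synchronized void onNext(T value) {
            received.add(value);
        }

        @Override
        public void onError(Throwable t) {
            future.completeExceptionally(t);
        }

        @Override
        public synchronized void onCompleted() {
            // Hand out a copy, so that later calls cannot alter the completed result.
            future.complete(new ArrayList<>(received));
        }

        /** Returns the future which completes once the stream is completed or fails. */
        CompletableFuture<List<T>> results() {
            return future;
        }
    }
}
```

The adapter adds little beyond the three callbacks themselves, which is in line with the point that an alternative type would mostly repeat the essence of `StreamObserver`.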

dmdashenkov commented 3 years ago

My suggestion regarding a separate endpoint does not imply that we should "stick out" named queries as is. Rather, I thought of a separate, generated layer of gRPC services/servlets, which would still allow us to keep the gate while taking over the burden of packing and unpacking intermediate types, such as NamedQuery and NamedQueryResponse. This is a part of my long-lasting desire to generate query endpoints per projection state, for the same reason: reducing the responsibilities of client libraries, whether our own or, eventually, developed by users. I think this is a thing to consider and maybe discuss. I'm not set on finalizing this, at least not any time soon.