eclipse / microprofile-context-propagation

Apache License 2.0
34 stars 23 forks source link

Revisit discussion of transaction context propagation #124

Closed njr-11 closed 5 years ago

njr-11 commented 5 years ago

We don't currently propagate transaction context by default. Instead, we clear (suspend) it, which allows tasks/actions to run their own transactions.

This discussion was brought up again on last week's MicroProfile Concurrency call. This position was questioned and there is a push from at least one member of the spec group (@FroMage) to make transaction context propagation the default. I'm opening this issue to formally represent the discussion and document its resolution.

I can provide my own position on the matter, which I'll document here. Those with the opposite view should feel free to append to this comment with the explanation (I can't adequately represent the opposing position, and in any case would be biased),

Why we shouldn't propagate transaction context by default:

Propagation of transactions is an advanced pattern. To use it, you need to be aware of the various limitations and capabilities of the involved resources and the transaction manager. You need to be willing to read about and follow best practices for transaction propagation just to write safe code. If you are willing to do these things, you should be willing to write code that explicitly asks for transaction context propagation to be turned on specifically where you need it. In fact, I expect that such users will prefer to be explicit about when they want to use transaction context propagation, and they actually wouldn't want it enabled by default across all usage of completion stages within the application because that would add both unnecessary risk and extra overhead.

The argument in favor of propagating transaction context by default:

(Please add here)

If no agreement can be reached

If no agreement can be reached then a possible fallback plan, which sacrifices usability of the CDI config annotations is the following:

Current defaults (between both ThreadContextConfig and ManagedExecutorConfig),

String[] cleared() default { ThreadContext.TRANSACTION };
String[] propagated() default { ThreadContext.ALL_REMAINING };
String[] unchanged() default {};
...
int maxAsync() default -1;
int maxQueued() default -1;

Would be changed to,

String[] cleared() default { ThreadContext.DEFAULT_CLEARED };       // means TRANSACTION unless MP Config says otherwise
String[] propagated() default { ThreadContext.DEFAULT_PROPAGATED }; // means ALL_REMAINING unless MP Config says otherwise
String[] unchanged() default { ThreadContext.DEFAULT_UNCHANGED };   // means empty set unless MP Config says otherwise
...
int maxAsync() default ManagedExecutor.DEFAULT_MAX_ASYNC;   // means effectively unlimited unless MP Config says otherwise
int maxQueued() default ManagedExecutor.DEFAULT_MAX_QUEUED; // means effectively unlimited unless MP Config says otherwise

After having done the above, we would make it possible for the user to override via MicroProfile config as:

ManagedExecutor/cleared=
ThreadContext/cleared=

The above would override the default for both the config annotations as well as the builders.

emmanuelbernard commented 5 years ago
  • Data integrity. There are all sorts of ways to go wrong with propagated transaction context, including the possibility that you end up running operations under a transaction that you didn't intend to, or causing premature commit/rollback. Transaction context propagation is dangerous if you don't know exactly what you are doing. It is for advanced usage patterns, so it shouldn't need to be default.

That argument you re making is invalidating the accepted usages in the current business component propagation model whether it be EJBs or CDI based Tx portable extensions. The accepted model is that at a given entry point (a bean method), transaction is started and the transaction scope is expected to be the same for subsequent (bean) method calls unless explicitly altered by a user request. Note that this model is the same for JPA context propagation.

For async context propagation you can reasonably equate a callback (or equivalent asunc primitive) to a subsequent bean method call in the blocking world.

For me that is a massive argument in favor of propagating transactions.

emmanuelbernard commented 5 years ago
  • Limited applicability. Many transaction manager and transactional resource implementations won't be able to propagate transaction context all (the very idea is scary to our own transaction manager team and they have strongly warned against doing it). Others will only be able to support it for XA-capable resources and not for one-phase resources, and even then, some types of XA-capable resources will support it whereas others will not. Even when it's the case where the XA-capable resources involved and the transaction manager do support transaction context propagation, some of them will end up limiting support to operations being performed in serial, rather than allowing the transaction to actually be used in parallel, even though the CompletionStage model certainly allows for and encourages parallel use.

I'll let my tx colleague to detail on the implementation aspect but note that whether it's the default or not, the Tx implementation will have to decide whether they allow context propagation or not. So that's a general yes / no on the platform. I don't see it being the default as fundamentally changing that.

Also, let's try and separate serial or parallel work from the notion of concurrency. There are perfectly acceptable concurrency models not requiring any parallelism as you hint as a possible "workaround" of the Tx impl.

emmanuelbernard commented 5 years ago
  • Defaults ought to reflect core usage patterns. By enabling transaction context propagation by default, applications that don't have any interest in propagating transactions but happen to make use of completion stages for non-transactional operations within a transaction will be unable to do so, because the transaction manager will need to reject propagation of the transaction. The application will then need to go out of its way to turn off transaction context propagation in order to work properly.

The argument I make replying on your data integrity point is applying here. You are claiming that the @Transactional model of Java EE is not a core usage pattern. I disagree ☺️

emmanuelbernard commented 5 years ago

Propagation of transactions is an advanced pattern. To use it, you need to be aware of the various limitations and capabilities of the involved resources and the transaction manager. You need to be willing to read about and follow best practices for transaction propagation just to write safe code. If you are willing to do these things, you should be willing to write code that explicitly asks for transaction context propagation to be turned on specifically where you need it. In fact, I expect that such users will prefer to be explicit about when they want to use transaction context propagation, and they actually wouldn't want it enabled by default across all usage of completion stages within the application because that would add both unnecessary risk and extra overhead.

Are you considering thread bound Tx a model of transaction propagation? If yes then, I disagree with you. Many many developers use Tx propagation by boundary demarcation in both the Java EE world and the Spring world. There are accepted patterns to disable tx propagation in this model.

aguibert commented 5 years ago

This issue was discussed for the entirety of this week's hangout, and ultimately it seems like we have 2 conflicting usage patterns:

A) Traditional Executor-based access, for using the CompletionStage API. Here IBM is arguing that transaction propagation is bad because we can potentially have 2 threads accessing the same transaction at the same time -- a dangerous scenario as @njr-11 has pointed out. B) Reactive usage with sequential access across various stages(which may or may not run on different threads, but never two threads at once). Here RedHat is arguing that since sequential access is such a dominant use case we should let this be the default option.

I think both sides have made valid points, and it really comes down to different dominant usage patterns for different programming models (Executor-based parallelism vs. Reactive-based serial stages).

Proposed solution: Since Executor-based will use ManagedExecutors primarily, and Reactive-based will use ThreadContext primarily, simply use different defaults for the two APIs, namely:

@FroMage would this be an agreeable compromise?

gavinking commented 5 years ago

@aguibert yes, right, @FroMage and I chatted about this on the phone a couple of hours ago, and what I suggested to him was that transaction propagation should be considered a property of the executor (i.e. not of the CompletionStage).

Indeed, having to make each stage in a chain of callbacks explicitly declare that contexts are propagated at each step would be extremely verbose and very unnatural for the user.

So I think the spec should simply say that there are some executors that propagate these contexts, and some that don't. It wouldn't even be necessary for the spec to say anything about which executors do and which don't. (Note that there might be further scenarios not so far contemplated here.)

FroMage commented 5 years ago

As I told Gavin already on the phone, this seems like a good idea because it satisfies both use-cases. That you both came up with it independently seems further proof that it's a good idea. So I say we should try it out.

hutchig commented 5 years ago

I have worked on the MP Reactive (Streams Operators Messages) specs and am currently implementing the context handling in the OpenLiberty implementation.

We (the mp reactive workgroup) intentionly did not tie down the context propogation in the spec:

"Similarly, reactive streams objects that are generated by APIs in this specification will behave according to the Reactive Streams specification but aspects of their behaviour beyond that, such as handling contexts, buffering policy, fan-out and so on are implementation specific. Users should consult the relevant documentation for their environment."

and we verbally chatted about the fact that implementations would attempt to not contradict the approach of the MP Concurrency spec in this area.

However, I just want to stare clearly that when @aguibert in https://github.com/eclipse/microprofile-concurrency/issues/124#issuecomment-466083595 states "B) Reactive usage with sequential access across various stages(which may or may not run on different threads, but never two threads at once). Here RedHat is arguing that since sequential access is such a dominant use case we should let this be the default option."

That we intentionally do NOT specify that transaction context needs or should be propogated in MicroProfile implementations of Reactive Streams. A particular implementation is free to do this but it is not part of the specification.

For me, the term "Reactive" - means a particular form of pub-sub interface with ticket based back-pressure as per http://www.reactive-streams.org/ . It is something more than asynchronous work with a CompletableFuture callback so when I see something like "Reactive will do XYZ" in a spec groups working issues I immediately relate it to the MicroProfile Reactive Streams/Messaging behaviour.

Users may do too, so please be careful not to side-specify the expected behaviour of MP Reactive Streams/Messages in the concurrency spec. Transaction propogation there is vendor specific and not specified.

(I have a much less strong opinion on whether ThreadContect context closures capture all contexts by default. Typically these will NOT be used in MP Reactive Streams/Messaging application programs as the underlying system will propogate contexts. Or if not the user is going to do it it wpould be via a custom ReactiveStreamsEngine which would be using a managed executor. Asynchronicty is not programmed at the level of from within a single stream stage to the next stage and a previous element (to capture context from) is not involved in directly passing control to the following one at all with the recommended approach of using Builders or higher level APIs.)

hutchig commented 5 years ago

re tcx propogation for ThreadContext For me, from reading the Concurrency spec, it comes across as CDI, CompletableFuture and MP Config features on top of

javax.enterprise.concurrent:
   ContexService::createContextualProxy*      [MP ThreadContext]
   ManagedExecutor                                         [MP ManagedExecutor]

I think it might be good spec work, and certainly diplomacy, when there are two opinions about propogating txn context that are hard to resolve, to suggest that part of the spec (executors) goes one way (does not propogate txn context by default) and part of the spec (closures/ThreadContext) goes the other way, (does capture all the context including txn by default). And this does solve two current problems we have:

1 - some existing libraries will not work well with txn's spreading to multiple threads and work being done concurrently 2 - MP Concurrency working group is working hard to reach a concensus on txn propogation

But what is it that most benefits users once this spec is 'old'?

Consistent behaviour does have lots of benefits and I would have expected ThreadContect to behave like ContexService::createContextualProxy* particularly if the end goal is to integrate function into JEE so as not to confuse users long term.

Although not definitive (help?), when I looked at how this behaved I found: https://www.javacodegeeks.com/2014/07/java-ee-concurrency-api-tutorial.html "although context objects can begin, commit, or roll back transactions, these objects cannot enlist in parent component transactions."

So personally I would argue that ThreadContext should behave in the same way as ContexService::createContextualProxy regardless of the practical problems in implementation.

If this does not do txn by default and a user wants to propogate ALL the contexts in ThreadContext could we square this circle by using the MP Config integration feature?

FaultTolerance has done something like this aready (Ver2.1 Section 12.1) see override parameters globally

This would address the concerns of both sides - to enable the same succinct code for those that want to propogate everything included transactions but not tripping up users/libraries by doing it without any opt in or behaving differently to createContextualProxy @FroMage ? @gavinking ? @aguibert ?

Also if some vendors will not be able to support txn propogation in all scenarios do we need to make doing so an optional part of the spec? (Or some other spec mechanism).

FroMage commented 5 years ago

Well, we're probably going to rewrite certain sections of the spec to tie it less to EE-Concurrency, so that aspect is not necessarily problematic.

As for the MP-Config override, we do agree it's a desirable feature to let users change the default, but we find it orthogonal to the particular issue of the default mandated by the spec, because our implementation would use that mechanism to override the spec default to the default we expect our users to want, which would defeat the purpose of the spec mandating a default.

hutchig commented 5 years ago

As I mentioned Fault Tolerance, I should also chip in that the FT spec, for its @Asynchronous annotation, leaves txn propogation unspecified and IBM's implementation does NOT propogate transaction context onto any async thread used to run an @Asynchrous method as that would create two threads running under one txn at one time which we avoid. I don't think this is for any pricipled reason, we just have a large ecosystem of things we work with and our experience is that some of them (and user code) can be problematic when a txn appears on two threads in the one process at once. Not doing this also allows for some optimizations wrt locking/sync.

gavinking commented 5 years ago

As I mentioned Fault Tolerance, I should also chip in that the FT spec, for its @Asynchronous annotation, leaves txn propogation unspecified and IBM's implementation does NOT propogate transaction context onto any async thread used to run a @Asynchrous method

I'm not sure if your @Asynchronous annotation here is javax.ejb.Asynchronous, but if so, then it's quite clear that tx contexts are not propagated over async method calls. I don't think anyone would have any other expectation for that case.

gavinking commented 5 years ago

I'm not sure if your @Asynchronous annotation here is javax.ejb.Asynchronous

Oh no, my bad, looks like you're talking about:

https://github.com/eclipse/microprofile-fault-tolerance/blob/master/api/src/main/java/org/eclipse/microprofile/faulttolerance/Asynchronous.java

gavinking commented 5 years ago

Still, seems to me that Fault Tolerance's @Asynchronous methods are semantically similar, and have essentially the same programming model, as EJB @Asynchronous methods, so I would at least naively have expected the same semantics for transaction propagation (i.e. not propagated). Perhaps I'm missing something, not being familiar with what's going on in Fault Tolerance.

njr-11 commented 5 years ago

Based on a number of comments here, it sounds like the compromise proposed by @aguibert to have different defaults for ManagedExecutor and ThreadContext has a good amount of support behind it. While I still prefer to keep the default that we already have in the spec for both (cleared), I am willing to agree to this compromise as a way forward with the spec. I will start looking into the spec updates needed for this compromise so that we have them prepared, but in the meantime, I would certainly encourage continued discussion of this issue. We will plan to make the final decision on the next MicroProfile Concurrency call, and hopefully be able to take the spec forward to its first release candidate after that point.

hutchig commented 5 years ago

Having read the above and chatted with @njr-11 about this as a default config setting, I can support this compromise in the default config settings as long as Txn propogation support is not mandatory behaviour to comply with the spec or pass the tck nor implied as the specified behaviour of MicroProfile Reactive Streams.

gavinking commented 5 years ago

FTR, I just opened https://github.com/eclipse/microprofile-fault-tolerance/issues/415. It occurs to me that the @Asynchronous annotation probably belongs in MP-Concurrency, not in MP-Fault Tolerance.

I hope that some work is being done here to nail down a relationship between @Asynchronous and ManagedExecutor. It seems to me that whatever the relationship is, the semantics should be defined here, not there.

cescoffier commented 5 years ago

First, I would like to stress out the need of MicroProfile Concurrency for the reactive part of MicroProfile.

To mention a few discussions we had in the reactive hangout:

About the transaction issue, I'm in favor of the proposal made in https://github.com/eclipse/microprofile-concurrency/issues/124#issuecomment-466083595. More specifically, lots of MicroProfile specifications are making assumptions on the execution model that make them unsuitable in reactive applications. Having MP concurrency considering Traditional and Reactive would show clear intention to embrace reactive. Also, as said above, supporting propagation is a feature that most reactive application developers are expecting.

Finally about the MicroProfile Reactive Streams Operators. The specification does not handle context propagation at that point. As mentioned above, it's more a: "let's wait until MP concurrency makes it work ;-)." At that very moment, the support for propagation would be implementation specific I would say. We didn't discuss it recently (as we are focusing on reactive messaging), but it might be something to add. Now whether or not tx should be propagated, again, the proposal made above looks good to me.

Emily-Jiang commented 5 years ago

I finally caught up with the whole conversation. Can we step back and look at the use case?

  1. Can someone explain under what circumstances, transaction context needs to be propagated and what circumstances not?
  2. What are the advantages and disadvantages for propagating transaction context?
njr-11 commented 5 years ago

@Emily-Jiang if it helps, here are some examples that I wrote up for myself with some usage patterns on how transaction context propagation might look on the end user side, although this doesn't go into detail on the reasons for doing so,

Transaction context propagated, only used serially, no blocking

tx.begin();
try {
    // add dependent stages to incomplete stage1 (must ensure stage1 cannot run, otherwise 2 threads will be in the transaction at once)
    stage2 = stage1.thenApply...
    stage3 = ...
    stage4 = ...
    stage4.handle((failure, result) -> {
        try {
            if (tx.getStatus() == Status.STATUS_MARKED_ROLLBACK || failure != null)
                tx.rollback();
            else
                tx.commit();
        } catch (Exception x) {
            throw new CompletionException(x);
        }
    });
} finally {
    ... vendor-specific API used to obtain the transaction manager
    tranManager.suspend();
}
...
stage1.complete(value);

Transaction in parallel on different threads, with blocking on main thread

tx.begin();
try {
    ... add dependent stages
    ... do other transactional stuff on main thread
    ... wait for all stages with transactional work to finish (join/get operations)
} finally {
    if (tx.getStatus() == Status.STATUS_MARKED_ROLLBACK)
        tx.rollback();
    else
        tx.commit();
}

transaction in parallel on 2 threads, without blocking

tx.begin();
try {
    ... add dependent stages stage2 and stage3
} finally {
    ... vendor-specific API used to obtain the transaction manager
    tranManager.suspend();
}
stage2.thenAcceptBoth(stage3, (result2, result3) -> {
    try {
        tx.commit();
    } catch (Exception x) {
        throw new CompletionException(x);
    }
}).exceptionally(failure -> {
    try {
        tx.rollback();
    } catch (Exception x) {
        throw new CompletionException(x);
    }
    throw new CompletionException(failure);
});

The above gets even more complicated if more than 2 threads are involved.

njr-11 commented 5 years ago

@FroMage As a todo from last week's MP Concurrency meeting, I spoke with our top level architects about the possibility of switching the configuration defaults so that the default is to propagate transaction context rather than clearing it. Rather than waiting for the Thursday meeting, I'm sharing the response now because it impacts how the spec can proceed, and I want to give others as much time as possible to react to it. The response from our architects is that we cannot continue our support for this specification or its inclusion in MicroProfile if it is changed to make the optional behavior of propagating transactions the default, nor can we support a change to the spec (at least in version 1.0) that would make propagation of transactions a requirement. We are only to proceed with a version 1.0 where all defaults reflect mandatory behavior that is present in both of the known implementations.

kwsutter commented 5 years ago

As one of the unnamed architects in @njr-11's post, I thought I would give a little more background on the discussion... Coming from a Java EE perspective, it's difficult to accept the idea of introducing default values that are not consistent with the current Java EE expectations. We want MicroProfile to be acceptable to current Java EE users. Introducing default values for transaction propagation that would not allow current implementations to work "out of the box" seems wrong. We're not arguing to not allow the configuration for transaction propagation, just don't make it the default.

Why couldn't this approach be suitable for the Concurrency 1.0 release? Hopefully, Jakarta EE will be on more solid ground shortly and this work could more easily merge with Jakarta EE Concurrency. And, then the topic of transaction propagation being the default could properly be vetted out.

We also discussed the compromise that was previously posted. Although this would help with the more common ManagedExecutor scenario, leaving transaction propagation as the default for the ThreadContext would introduce another inconsistency that doesn't seem palatable.

Bottom line is that we don't agree with changing the defaults to propagate transaction context. At least not with this 1.0 release. Let's make MP Concurrency as acceptable as possible across as many implementations as possible.

FroMage commented 5 years ago

The response from our architects is that we cannot continue our support for this specification or its inclusion in MicroProfile if it is changed to make the optional behavior of propagating transactions the default, nor can we support a change to the spec (at least in version 1.0) that would make propagation of transactions a requirement.

Propagation of transactions would only become a requirement if supported by the thread context provider for it. You've made it clear that your implementation is going to be incomplete since it will have a thread context provider for it that will not support propagation. We don't think the spec should be limited by the status of one implementation, and rather that the implementations should take the time to implement the spec entirely. One alternative is to ship your implementation with no thread context provider for transactions.

We are only to proceed with a version 1.0 where all defaults reflect mandatory behavior that is present in both of the known implementations.

Our implementation will support transaction propagation very shortly. Again, we don't think we should limit the spec in such a fundamental way due to temporary shortcomings of particular implementations.

Note that I didn't want to bring this up last week because I thought we could come to an agreement, and the compromise seemed good, but I also got a clear mandate last week to state that Red Hat will not be able to agree to this spec if we cannot support transaction propagation by default, or at the very least, stop specifying that behaviour and make it implementation-dependent.

FroMage commented 5 years ago

Although this would help with the more common ManagedExecutor scenario, leaving transaction propagation as the default for the ThreadContext would introduce another inconsistency that doesn't seem palatable.

Our view is that in the context of reactive async programming, the ManagedExecutor use-case is really the least useful of the two, and users are going to mostly use ThreadContext, unless we can retrofit our worker pools as ManagedExecutors, in which case we'll definitly want transaction propagation for it since all request pipeline executions will be serialised.

Our opinion about this is that EE Concurrency attacked the problem from the point of view of parallelism within a request, and thus specified a behaviour that made most reactive programming use cases unworkable, because users expect propagation of every context that was initially part of the request context. At the time EE Concurrency was made, it could clearly not anticipate the reactive programming patterns that appeared later and became the norm, so I'm not criticising it for that choice.

Now, however, it would be irresponsible to ignore those reactive async programming use-cases, especially in MP with the trend in support for them, and also given JAX-RS/Servlet support for it already years ago. We should not limit the usefulness of MP-Concurrency on past choices of EE-Concurrency that may no longer be the dominant use-case.

FroMage commented 5 years ago

@Emily-Jiang Here is the example of what typical use-cases would look like, blocking:

    @Inject 
    Vertx vertx;
    @Inject
    EntityManager em;

    @Path("context")
    @GET
    public String contextBlocking(@Context UriInfo uriInfo) {
        WebClient client = WebClient.create(io.vertx.reactivex.core.Vertx.newInstance(vertx));
        // blocking HTTP GET
        HttpResponse<String> response = client.get(80, "google.com", "/")
                .as(BodyCodec.string())
                .rxSend().blockingGet();
        // this works
        uriInfo.getAbsolutePath();
        // this works because we have CDI and transaction
        Person person = new Person();
        person.name = "stef";
        em.persist(person);
        return "OK "+response.body().length();
    }

And this is how you would turn the request async, but it won't work due to contexts not being propagated:

    @Path("context-single")
    @GET
    public Single<String> contextSingle(@Context UriInfo uriInfo) {
        WebClient client = WebClient.create(io.vertx.reactivex.core.Vertx.newInstance(vertx));
        // async HTTP GET, turns the request asynchronous
        return client.get(80, "google.com", "/")
                .as(BodyCodec.string())
                .rxSend().map(response -> {
                    // this fails because no RESTEasy context
                    uriInfo.getAbsolutePath();
                    // this fails because no CDI context, but would fail because no transaction otherwise
                    Person person = new Person();
                    person.name = "stef";
                    em.persist(person);
                    return "OK "+response.body().length();
                });
    }

And here is how you would fix the previous example with context propagation:

    @Inject
    ThreadContext threadContext;

    @Path("context-cs")
    @GET
    public CompletionStage<String> contextCompletionStage(@Context UriInfo uriInfo) {
        WebClient client = WebClient.create(io.vertx.reactivex.core.Vertx.newInstance(vertx));
        // async HTTP GET, turns the request asynchronous
        Single<HttpResponse<String>> responseSingle = client.get(80, "google.com", "/")
                .as(BodyCodec.string())
                .rxSend();
        // this particular call will be unnecessary really soon
        CompletionStage<HttpResponse<String>> cs = toCompletionStage(responseSingle);
        // make it capture the context
        return threadContext.withContextCapture(cs)
            .thenApply(response -> {
                    // this works now
                    uriInfo.getAbsolutePath();
                    // this should work too
                    Person person = new Person();
                    person.name = "stef";
                    em.persist(person);
                    return "OK "+response.body().length();
                });
    }
FroMage commented 5 years ago

At this point, my personal opinion is that the best way to proceed is to drop the requirement to single out transaction contexts from the spec and allow implementations to not propagate certain supported contexts by default.

cescoffier commented 5 years ago

As @FroMage just said, it seems the only way to get something released.

Note that it's not the first spec defining behavior not compatible with the reactive paradigm. A larger discussion needs to happen to clearly state where MicroProfile is positioning itself in term of reactive.

manovotn commented 5 years ago

I also agree with Stephane and Clement. As both parties expressed their reluctance to accept the other option, we should start looking into how we evict the defaults from the spec and let the implementations define them.

kwsutter commented 5 years ago

I'm sure you all realize this, but if the default propagations are not specified and each impl defines their own defaults, then applications will not be portable between MP Concurrency implementations. That is something that we have pushed for since day one of this project. We have used this model of portable applications for many demos at conferences -- showing the same microservice/application running on various implementations. And, sure we could still code it "correctly" to have it run on multiple impls, but not with the defaults (if this new compromise is accepted).

FroMage commented 5 years ago

Well, given the ability to override defaults with config files, we could warn users to specify their transaction propagation default in a config file, which would override whatever the implementation default to.

gavinking commented 5 years ago

OK, so after reading over this latest discussion, I'm left confused as to what is unacceptable about the "compromise" we previously thought we had reached, where transaction propagation was an aspect of the executor which was orchestrating the callbacks to CompletionStages, rather than trying to define a single set of defaults at the level of the CompletionStage API.

Really, CompletionStage—a Java SE API—is much too general for there to be one correct thing at that level.

We know that:

  1. There is a usecase for ManagedExecutors which don't propagate tx contexts between CompletionStages. This is the typical case where you have a client which forks off one or more "jobs" running in parallel. The expectation is that a transaction is complete when the job "finishes", but that you might want to do something after the job(s) finish(es).
  2. There is also a usecase for chains of CompletionStages within a single tx and single logical thread of execution. Here we have pooled worker threads that are allocated to to do work at each step of the chain, but these threads never run in parallel for a given stream.

It's also been pointed out by Sanne that there is a third usecase for ManagedExecutors which do propagate a client tx context. It seems that there is sort of an impression out there that transactions are single-threaded but that's simply not the case. The JTA spec allows multithreaded transactions (i.e. for more than one thread to be associated with a given tx context) and our transaction manager Narayana apparently supports that. And indeed, one can see how this could make a lot of sense in, for example, a distributed transaction that does work against multiple remote resources. However, this is admittedly an advanced usecase, and I don't argue that it should be the default behavior for ManagedExecutors. (I do think it should be allowed, however.) And more importantly this is not the usecase we're hung up on here in this discussion.

The question here is what should be the default behavior for reactive streams, and I don't think anybody has really disputed that transactions should be propagated in this case. What we're dealing with here is very much a single logical thread that has been broken up into CompletionStages purely as a performance optimization to release stack space while the logical thread is idle.

I simply don't see how the default behavior for reactive streams can't be distinct from the default behavior for explicit calls to a ManagedExecutor. The user is not even interacting with the same APIs!

manovotn commented 5 years ago

I'm sure you all realize this, but if the default propagations are not specified and each impl defines their own defaults, then applications will not be portable between MP Concurrency implementations.

So long as the user overrides defaults or declares the options instead of using defaults, it will be portable. You only sacrifice portability when the user relies on implementation defaults which requires them to be aware of what implementation they run and how it behaves there in the first place.

Plus you can state in the spec that the implementations define defaults, but if users want their own or if they wish their app to be portable they need to provide a config for defaults. That way you have a spec-mandated way for portability. I know it's not the prettiest thing in the world, but would work...

I would obviously be in favor of full portability as well, but from the several meetings and this discussion there doesn't seem to be any option that would be acceptable for all.

gavinking commented 5 years ago

@kwsutter

Coming from a Java EE perspective, it's difficult to accept the idea of introducing default values that are not consistent with the current Java EE expectations.

My understanding is that nobody has argued for this.

There is no current Java EE expectation for transaction context propagation in reactive streams, as far as I am aware, and therefore we must define that expectation.

gavinking commented 5 years ago

@kwsutter

I'm sure you all realize this, but if the default propagations are not specified and each impl defines their own defaults, then applications will not be portable between MP Concurrency implementations.

I agree that we should shoot for portable behavior. I don't think the spec should leave this undefined.

gavinking commented 5 years ago

@Emily-Jiang

I finally caught up with the whole conversation. Can we step back and look at the use case?

  1. Can someone explain under what circumstances, transaction context needs to be propagated and what circumstances not?

I think, in a nutshell, the default should be:

  1. tx context is propagated within a single logical thread of execution.
  2. tx context is not propagated to a forked logical thread of execution.

That is to say—by default—you don't have two concurrent threads processing the same transaction.

Note that I'm using the term logical thread here, to distinguish it from a JVM Thread, since they are simply not the same thing.

However, as a caveat to this, there do exist usecases for having a "multithreaded transaction", and JTA allows that. In my view, this should be something you have to explicitly request if you want it.

  1. What are the advantages and disadvantages for propagating transaction context?

If you don't propagate the transaction context between CompletionStages that form a single logical unit of work, then you lose the ACID properties of that unit of work.

FroMage commented 5 years ago

Another possible compromise could be to allow for the existence of ThreadContextProvider that only support clearing contexts. This way the implementations that don't support a certain kind of context propagation could guarantee their users that those contexts would never be propagated at all, and it could be an error thrown if the user explicitly asks for this context to be propagated.

This way we can make the default to propagate every supported context, and clear every non-supported context.

gavinking commented 5 years ago

A further clarification, because I perceive a possible misunderstanding here.

All we need to say is that executors that manage the processing of reactive streams propagate transaction contexts across the various stages of a given stream.

We're not asking for a different default behavior for a ManagedExecutor explicitly constructed/configured by user code. A ManagedExecutor (if any) that processes reactive streams is something constructed implicitly by the infrastructure.

njr-11 commented 5 years ago

@gavinking does a ManagedExecutor that is constructed implicitly by the infrastructure really need a default at all? It can specify whatever values it wants when it build its ManagedExecutor instance.

ManagedExecutor.builder().propagated(ThreadContext.ALL_REMAINING).cleared(/*none*/).build();

The defaults we should be discussing in this issue are for instances built by the user in the case where they do not specify propagated/cleared.

gavinking commented 5 years ago

constructed implicitly by the infrastructure really need a default at all? It can specify whatever values it wants

Well the spec needs to say what those values are because it's part of the container environment.

aguibert commented 5 years ago

@FroMage regarding this comment:

We don't think the spec should be limited by the status of one implementation, and rather that the implementations should take the time to implement the spec entirely. One alternative is to ship your implementation with no thread context provider for transactions.

I don't want Liberty/IBM's opposition to Tx propagation to be seen as us trying to get out of work, or complaining that "it's too hard to implement". I agree, such an argument would not be a valid way to oppose function in a spec. Rather, we are saying that our Tx experts and architects have clearly expressed and pointed out that Tx propagation is fundamentally dangerous in a multi-threaded environment. We should "put the guard rails up" for our users, and not define defaults that allow users to shoot themselves in the foot.

njr-11 commented 5 years ago

constructed implicitly by the infrastructure really need a default at all? It can specify whatever values it wants

Well the spec needs to say what those values are because it's part of the container environment.

I agree that a spec needs to define what they are (the spec that constructs the instance, not even MP Concurrency here). But there is no reason those would need to be defined to match MP Concurrency's default config. Any other spec can build upon MP Concurrency and define its own ManagedExecutor or ThreadContext instances, and in doing so each such specification in turn owns the definition of how to configure them.

gavinking commented 5 years ago

But there is no reason those would need to be defined to match MP Concurrency's default config.

If by "MP Concurrency's default config" you mean the default config of a user-constructed/declared ManagedExecutor then yes, I agree, that's the point I was trying to make.

Any other spec can build upon MP Concurrency and define its own ManagedExecutor or ThreadContext instances, and in doing so each such specification in turn owns the definition of how to configure them.

Right, as long as MP Concurrency contemplates that there do exist ManagedExecutors that do propagate transaction contexts, it seems perfectly fine to me.

As I said right up at the top: as long as transaction propagation (or not) is an aspect of an executor, then I think we're good.

njr-11 commented 5 years ago

Right, as long as MP Concurrency contemplates that there do exist ManagedExecutors that do propagate transaction contexts, it seems perfectly fine to me.

Absolutely. MP Concurrency fully allows you to configure transaction context to be propagated and treats it first class alongside the other thread context types that are defined and called out within the spec. Granted, some implementations might not support transaction context propagation, either at all or under certain circumstances (for example, propagating a transaction to 2 threads at once), in which case they are obligated to raise an error if the unsupportable behavior is attempted.

As I said right up at the top: as long as transaction propagation (or not) is an aspect of an executor, then I think we're good.

Yes, it is very much an aspect of the managed executor, and the ability to configure it at that level reflects this. And just as it is configurable for individual ManagedExecutor instances, transaction propagation as well as other types of context propagation, are similarly configurable for individual ThreadContext instances.

If I understand right, then I think the MP Concurrency spec as written meets all of these needs. Having elaborated on this discussion, are other participants from RedHat also in agreement? (@FroMage ) or are there still concerns.

FroMage commented 5 years ago

Rather, we are saying that our Tx experts and architects have clearly expressed and pointed out that Tx propagation is fundamentally dangerous in a multi-threaded environment. We should "put the guard rails up" for our users, and not define defaults that allow users to shoot themselves in the foot.

And our Tx/Reactive experts and architects point out that in a reactive use-case most users will not want to spawn threads at all, since that's against the entire point. We will probably even provide a ManagedExecutor that abstracts over the existing request worker pool and schedules async units of work on top of the same thread which spawned them, once that thread is free again, thus guaranteeing sequential execution (the same model as node.js and Vert.x).

So we essentially disagree on the most common use-case we will serve.

FroMage commented 5 years ago

If I understand right, then I think the MP Concurrency spec as written meets all of these needs. Having elaborated on this discussion, are other participants from RedHat also in agreement? (@FroMage ) or are there still concerns.

Not at all. For ManagedExecutor the current default is acceptable since we don't expect users will create their own, but will instead look up existing ones provided by the container with the right default.

For ThreadContext, which we believe will be the most common use-case, we think most users will just use @Inject ThreadContext context and get the wrong default.

njr-11 commented 5 years ago

For ThreadContext, which we believe will be the most common use-case, we think most users will just use @Inject ThreadContext context and get the wrong default.

Suppose we stripped the defaults from ThreadContext altogether, and in doing so, forced the user to specify values for propagated/cleared/unspecified in order to have a valid ThreadContextConfig annotation or builder? We could do this for the 1.0 release, with the flexibility to later agree on the defaults and add in a future version. This would lessen usability for now, but might be worth it if it ends the dispute over what the default must be.

FroMage commented 5 years ago

We will probably even provide

Between this and the fact that IBM's implementation does not yet support transaction propagation, we're very uncomfortable with finalising a spec when both our implementations are clearly not ready yet, nor released, and real users have not had the time to validate our assumptions.

FroMage commented 5 years ago

Suppose we stripped the defaults from ThreadContext altogether, and in doing so, forced the user to specify values for propagated/cleared/unspecified in order to have a valid ThreadContextConfig annotation or builder? We could do this for the 1.0 release, with the flexibility to later agree on the defaults and add in a future version. This would lessen usability for now, but might be worth it if it ends the dispute over what the default must be.

This would not improve the situation for the use case of users who want transaction propagation with the current defaults. This would only make it worse for people who don't want transaction propagation, which we believe will be much less common.

njr-11 commented 5 years ago

Suppose we stripped the defaults from ThreadContext altogether, and in doing so, forced the user to specify values for propagated/cleared/unspecified in order to have a valid ThreadContextConfig annotation or builder? We could do this for the 1.0 release, with the flexibility to later agree on the defaults and add in a future version. This would lessen usability for now, but might be worth it if it ends the dispute over what the default must be.

This would not improve the situation for the use case of users who want transaction propagation with the current defaults. This would only make it worse for people who don't want transaction propagation, which we believe will be much less common.

Except to avoid the situation you brought up where the user ends up with defaults they didn't expect, it would not improve either scenario at all, and in fact makes them both less usable. But that isn't the point. The point is to find something that could be an acceptable compromise to both parties that cannot agree on a default.

Between this and the fact that IBM's implementation does not yet support transaction propagation, we're very uncomfortable with finalising a spec when both our implementations are clearly not ready yet, nor released, and real users have not had the time to validate our assumptions.

To help in getting that sort of feedback from users, we are putting out a beta of our MP Concurrency implementation. However, I don't expect users to be willing to do anything with it until we can get a release candidate of the spec out there for them to start coding to.

gavinking commented 5 years ago

Folks, excuse my ignorance here, but why precisely is ThreadContext an injected thing?

Why can't I get one by writing ManagedExecutor.getThreadContext() or ReactiveStreams.getThreadContext() or whatever?

What I mean is, you have some singleton instances hanging around, each reflective of a different scenario.