eclipse / microprofile-lra

microprofile-lra
Apache License 2.0
99 stars 30 forks source link

Add a TCK test which ensures the compensate is called only when participant's business method finishes #301

Closed ochaloup closed 4 years ago

ochaloup commented 4 years ago

It would be nice to add a TCK tests which verifies behaviour of compensation for participant which hasn't finished business method processing at time when LRA is cancelled. Meaning, the LRA participant still processes some business actions and while the LRA implementation calls the compensate method on it.

The test scenario would be following:

xstefank commented 4 years ago

+1 good idea.

mmusgrov commented 4 years ago

That's a spec change. How will the part of the implementation that issues the LRA callbacks know when the method has finished - the part of the implementation running locally with the service would have to start a conversation with the LRA lifecycle manager at the end of the method which adds overhead (which is additional to the overhead at the start of the method call). Plus, that conversation would need to be built into the interop spec making it more complex.

It should be the responsibility of the business logic to coordinate with its business compensation logic.

rdebusscher commented 4 years ago

For me that is a detail for the developer, not a problem for the LRA implementation.

ochaloup commented 4 years ago

@mmusgrov @rdebusscher I don't agree with you ;-) I think the developer should be defended by the LRA implementation from the issues of the eventual consistency as far as it can be.

If there is no assurance from the LRA implementation then me as the developer has to write mostly any @LRA handling with this in my mind. Let's me sketch an example (this is not a java code just a way to express my point of view)

@Path("/business-method")
@LRA
public void doSomething() {
  // do some remote call to a third-party API
  // insert to dababase an order with status "IN PROGRESS"
}

@Compensate
@Path("/compensate")
public void compensateWork() {
  // update the dababase order  record with status "CANCELED"
}

I assume this is quite usual type of code that could be coded with LRA. When the API call takes a bit longer then compensate could be called before the insert of "IN PROGRESS" state is executed. For coding this correctly I need to consider the business call has not happened yet and being able to throw some REST error (like 5xx) in case. I considered that complicated and error prone.

@mmusgrov the LRA implementation does not need to know anything about the method being finished. It's the LRA client part which may check if the business method was already finished. It it was then compensate can be invoked, if it was not then it asks the LRA implementation to retry in a while later.

ochaloup commented 4 years ago

@rdebusscher as I'm thinking about this further I assume I haven't considered the case that the compensate method could be used for interruption of the business method when the LRA was cancelled. Have you meant that?

ochaloup commented 4 years ago

@mmusgrov ok, I can see your comment now in gitter about two different JVMs where one runs the business logic and the other runs the compensation logic. I understand your point now.

That's right then there is need a change of the spec in some way which is not desired.

ochaloup commented 4 years ago

@mmusgrov @rdebusscher thank you for the discussions and your points. It helped me to clarify my statement. I assume my arguments are not convenient.

@mmusgrov @rdebusscher @xstefank then I would ask the same question but with a different scenario. Would be ok from your perspective to add a TCK testcase where it's checked that Compensate method could be called, and has to be permitted to be invoked, at any time. It means even at time when the business method is currently in progress?

I assume this could be considered as implementation specific. On the other hand e.g. for the case that I want to interrupt the business method execution immediatelly when LRA was cancelled, then such an assurance is appropriate to be provided by spec. What do you think?

mmusgrov commented 4 years ago

Your last point was interesting. But I will address each one in turn:

I think the developer should be defended by the LRA implementation from the issues of the eventual consistency as far as it can be.

If there is no assurance from the LRA implementation then me as the developer has to write mostly any @lra handling with this in my mind. Let's me sketch an example

We have had this example in a previous thread - the developer would need to synchronise the work done in the doSomething and compensateWork methods.

I assume this is quite usual type of code that could be coded with LRA. When the API call takes a bit longer then compensate could be called before the insert of "IN PROGRESS" state is executed.

If the Compensate method determines that the business method is still in progress it returns 202 Accepted.

For coding this correctly I need to consider the business call has not happened yet and being able to throw some REST error (like 5xx) in case. I considered that complicated and error prone.

If the Compensate method determines that the business method is in progress it returns 202 Accepted.

@mmusgrov the LRA implementation does not need to know anything about the method being finished. It's the LRA client part which may check if the business method was already finished. It it was then compensate can be invoked, if it was not then it asks the LRA implementation to retry in a while later.

The @Compensate method is normally an endpoint which is called directly so you would need to install JAX-RS filters and have them all coordinate with each other.

If you overcame that hurdle then you could immediately return 202 Accepted which would be quite nice and your proposal would make sense.

rdebusscher commented 4 years ago

Would be ok from your perspective to add a TCK testcase where it's checked that Compensate method could be called, and has to be permitted to be invoked, at any time. It means even at time when the business method is currently in progress?

Spec says it can be any time after the LRA is Closed / Canceled, so spec wise nothing need to change. But such a TCK test will be fragile as it depends on exact timings.

ochaloup commented 4 years ago

But such a TCK test will be fragile as it depends on exact timings.

@rdebusscher no, I think it won't be. There could be used java synchronization utilities (like count down latch) and it will be checking the exact timing without need of using any timeout stuff.

ochaloup commented 4 years ago

The @Compensate method is normally an endpoint which is called directly so you would need to install JAX-RS filters and have them all coordinate with each other.

@mmusgrov I would just to clarify this point otherwise I see and agree. the @Compensate would need to install the JAX-RS filter. But the coordination would not mean too much overhead I assume. There could be used some ConcurrentHashMap and the synchronization check would be quite fast.

ochaloup commented 4 years ago

@xstefank @rdebusscher @mmusgrov for this issue I understand the original behaviour was wrongly stated. I have a question if having the TCK test which puts the expectation that compensate will be called anytime during business method processing would be considered as a good expectation from the LRA processing perspective. Such expectation put on the LRA implementation makes possible for users running long LRA business actions counting with fact that by cancelling of LRA they could interrupt the processing as the @Compensate will be called. I understand the @rdebusscher 's concerns about timing issues but I think the test could be written without any timeout reliably.

My question is if you think such expectation is correct. If so then TCK test makes sense. Thanks for your points.

ochaloup commented 4 years ago

I'm closing this issue as the original scenario and the issue title contradict with what should be tested. I created a new issue here: https://github.com/eclipse/microprofile-lra/issues/305 which reflects the scenario based on the discussion here. (The implementation details may be discussed on the PR when it's created.)