Add transaction support

alexandertokarev commented 9 months ago

For server implementations that are using a transactional database to store the data all data operations (retrieving and modifying resource) should be executed within a transaction

Some operations can be made transactional in implementations, for example one can implement UserHandler and annotate some methods with spring @Transactional annotation:

@Transactional
public User createResource(User resource, Context context) {
}

@Transactional(readOnly=true)
public PartialListResponse<User> listResources(..) {
}

But if we annotate the updateResource method:

 @Transactional
 public User updateResource(User resource, Context context)

it would not be correct behaviour.

While PATCH/PUT request processing the ResourceEndpointHandler first calls ResourceHandler.getResource method to obtain previous resource version, then performs ETag validation, then applies patch operation against previous resource version and then calls ResourceHandler.updateResource All these steps should be executed within the same transaction. Also first call of ResourceHandler.getResource method may want to lock the row to prevent access from parallel transactions, so the getResource method should be aware of its call purpose: whether the result will be used for further update operation or for sending to the client output

In SQL the desired behaviour looks like as follows:

 BEGIN TRANSACTION 

 SELECT ... FROM users WHERE id = 'some_id' FOR UPDATE -- call getResource method and lock the row
 -- ETagHandler.validateVersion
 -- patchRequestHandler.handlePatchRequest
 UPDATE users set ... WHERE id = 'some_id' -- call updateResource method

 COMMIT

Consumers doBeforeExecution/doAfterExecution are not fully siutable for this purpose. Now we can not begin transaction in doBeforeExecution consumer and commit/rollback in doAfterExecution consumer because of the following reasons:

while bulk request processing the doBeforeExecution called before each operation but the doAfterExecution called only once at the end
Can not lock row using SELECT FOR UPDATE as the getResource method knows nothing about call purpose
Exceptions thrown from doAfterExecution consumer are not transferred to ScimErrorResponse
Can not mark a transaction as read-only because doBeforeExecution consumer provides no info about method

Will provide a pull request

alexandertokarev commented 9 months ago

Have no push permissions, so attached a patch file: Add_transaction_support.patch

Captain-P-Goldfish commented 9 months ago

Hi,

why not simply adding the @Transactional annotation to the resourceEndpoint call? That was my approach up until now and it I haven't got any problems until now:

@Transactional
@RequestMapping(value = "/**", method = {RequestMethod.POST, RequestMethod.GET, RequestMethod.PUT,
                                           RequestMethod.PATCH,
                                           RequestMethod.DELETE}, produces = HttpHeader.SCIM_CONTENT_TYPE)
  public @ResponseBody String handleScimRequest(HttpServletRequest request,
                                                HttpServletResponse response,
                                                @RequestBody(required = false) String requestBody)
  {
    Map<String, String> httpHeaders = getHttpHeaders(request);
    String query = request.getQueryString() == null ? "" : "?" + request.getQueryString();

    ScimAuthentication scimAuthentication = new ScimAuthentication();

    ScimResponse scimResponse = resourceEndpoint.handleRequest(request.getRequestURL().toString() + query,
                                                               HttpMethod.valueOf(request.getMethod()),
                                                               requestBody,
                                                               httpHeaders,
                                                               new Context(scimAuthentication));
    response.setContentType(HttpHeader.SCIM_CONTENT_TYPE);
    scimResponse.getHttpHeaders().forEach(response::setHeader);
    response.setStatus(scimResponse.getHttpStatus());
    return scimResponse.toPrettyString();
  }

this works great with ETags and BulkRequests. I use JPA with JTA transactions here and the behaviour is just as expected. I assume you made the ResourceHandler itself a bean and you are trying to manage the transactions within the different methods?

Did you try my approach? Why is it not working in your case?

Captain-P-Goldfish commented 9 months ago

I do not want any misunderstandings here. I understand why you need getForUpdate-method and the idea is a pretty good one. Did not think of it myself. But I am currently still doubting the benefit of the added Transactionmanager.

alexandertokarev commented 9 months ago

Hi,

why not simply adding the @transactional annotation to the resourceEndpoint call?

I think this approach have some problems with rollback conditions. By default the transaction will roll back on RuntimeException and Error. But resourceEndpoint usually does not throw exceptions, it returns an error response instead, please see https://github.com/Captain-P-Goldfish/SCIM-SDK/blob/917110ca202ceb6575d89ce2bbea138b1f6efa28/scim-sdk-server/src/main/java/de/captaingoldfish/scim/sdk/server/endpoints/ResourceEndpoint.java#L196 So the transaction will always be commited even though DataAccessException is thrown by ResourceHandler

But this is not a big problem, we can be using TransactionTemplate and roll back transactions on getiing an ErrorResponse from resourceEndpoint

this works great with ETags and BulkRequests.

So all operations in a bulk request executed in the same transaction. This means either all the operations are successfull or all the operations are failed. Is it always a desired behaviour? Consider response contains the result of all processed operations, some operations have "status": "200" but the last one has "status": "400" In my opinion if we roll back the whole transaction we roll back changes in all operations in the bulk and "status": "200" for these operations is not a correct answer.

Also it would be great if we can mark transactions for getting resources as read only, allowing for corresponding optimizations at runtime. At the resourceEndpoint level we don't have info about the method, we have to parse url (listResources can be called via POST method for example)

I tried to give some flexibility to the end user by providing an appropriate TransactionManager from ResourceHandler. But may be there are some another good solutions, I only doubt the transaction markup at the controller level.

Captain-P-Goldfish commented 9 months ago

Okay lets go this through in detail:

The transaction will only rollback on RuntimeException if not catched before the method annotated with @Transactional because the transaction is handled in the surrounding spring-bean-wrapper.
- So annotating the main-method that calls the resourceEndpoint will not rollback on RuntimeException because these are catched in the ResourceEndpointHandler
- Only exception to this rule is if the entityManager throws a corresponding exception. This would mark the transaction immediately as rollback. (Point for you here)
On BulkRequests:
- If the "failOnErrors" attribute is not specified or the service provider has not reached the error limit defined by the client, the service provider will continue to process all operations.
  
  This behaviour is respected by the ResourceEndpoint
- Operations should fail before they are written to database. (Exception to this rule is of course concurrent modifications)

The rollback-only behaviour is indeed a problem but I think you might break the rule for BulkRequests if you use it like this. Lets say you have a BulkRequest with failOnErrors=2. When you get the third error the whole transaction should be rolled back instead only part of it.

I will think a bit more over it today but in the end I guess there is also no harm in adding it. But I will probably rename the classes. These are more interceptors than transactionHandlers.

alexandertokarev commented 9 months ago

Good evening,

Lets say you have a BulkRequest with failOnErrors=2. When you get the third error the whole transaction should be rolled back instead only part of it.

I'm not sure that we should roll back the whole transaction. Also, I think we should not execute a bulk request in one transaction as well.

failOnErrors An integer specifying the number of errors that the service provider will accept before the operation is terminated and an error response is returned.

But I think this doesn't mean we should roll back previous successful operations in the bulk. In my opinion in case failOnErrors is reached we should skip processing further operations and return an error bulk response (status >= 400 ) that contains the result of all processed operations (successful and failed).

The service provider MUST continue performing as many changes as possible and disregard partial failures

The service provider response MUST include the result of all processed operations.

The status attribute includes information about the success or failure of one operation within the bulk job.

I believe we should execute each operation in the bulk in a separate transaction, let me explain why:

Let's say we store data in three tables:

users
groups
users_to_groups

We have the following implementation of UserHandler.updateResource method:

public User updateResource(User user, Context context) {
    userDao.saveUser(user);
    userToGroupDao.deleteAllGroupsForUser(user.getId()); // clear previous users_to_groups records
    userToGroupDao.linkGroupsToUser(user.getId(), user.getGroups()); // insert records into users_to_groups table
}

We receive a bulk request with failOnErrors=10 and two PATCH operations for user1 and user2. This request will never reach failOnErrors. Let's say first operation was successful, but the second one failed as userToGroupDao.linkGroupsToUser method tried to insert non-existent groupId and violated foreign key constraint.

If we execute this bulk request in one transaction how should we end the transaction: commit or rollback?

if rollback we lost changes for user1 if commit we leave user2 in inconsistent state: user updated, previous users_to_groups records deleted but new users_to_groups records are not inserted

Captain-P-Goldfish commented 9 months ago

I have adjusted the commit and made the the transactionManager an interceptor. The readOnly boolean is no longer given as parameter to the method getInterceptor previously getTransactionManager. Instead the EndpoinType enum is given as parameter that should give a little bit more of control here.

Captain-P-Goldfish commented 9 months ago

I have added your commit unmodified. Like this you will be able to see the changes in the commit after.

alexandertokarev commented 9 months ago

This solution is better than the one I suggested. Interceptor can be used not only for transaction management, but also for other purposes, such as auditing. Great work, thanks a lot!

Captain-P-Goldfish / SCIM-SDK

Add transaction support #592