Closed de-jcup closed 4 years ago
Whenever you have something like this:
package com.daimler.sechub.domainA;

class ServiceX{

    @IsSendingAsynchronousEvent(EVENT_ID1)
    @Transactional
    public void myServiceMethod(){
        repositoryA.changePersistedData();
        ...
        eventBus.sendEvent(EVENT_ID1);
    }
}

package com.daimler.sechub.domainA;

class ServiceY{

    @IsReceivingAsynchronousEvent(EVENT_ID1)
    public void receiveEvent1(){
        Data d = repositoryA.loadPotentialChangedPersistedData();
        doSomethingDependendingOn(d);
    }
}
This can lead to race conditions!
Reason: The transaction in ServiceX has not yet finished when the asynchronous event is triggered and starts to run. If the event is handled fast enough in ServiceY, the transaction in ServiceX may not have been committed yet, and ServiceY loads the old state.
This problem is not limited to delete. It affects any situation where the event is processed in the same domain in which it was sent, because the data is not carried inside the event but is reloaded from the same database/table, which allows race conditions.
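The ordering problem can be reproduced without Spring or a database. The following is only a minimal sketch under stated assumptions: `StaleReadDemo`, `Tx` and the `committed` map are hypothetical stand-ins (not SecHub code) for a transaction that makes its writes visible only at commit, and the "fast" asynchronous handler is modelled as a direct call so the outcome is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleReadDemo {

    /** Hypothetical stand-in for a database table: holds committed values only. */
    static final Map<String, String> committed = new HashMap<>();

    /** Hypothetical stand-in for a transaction: buffers writes until commit(). */
    static class Tx {
        private final Map<String, String> pending = new HashMap<>();

        void write(String key, String value) {
            pending.put(key, value);
        }

        void commit() {
            committed.putAll(pending);
        }
    }

    /** Like ServiceY: reloads state from the "database" instead of reading it from the event. */
    static String receiveEvent() {
        return committed.get("project");
    }

    /** Like the transactional ServiceX: the event is handled before the commit happens. */
    static String sendEventInsideTransaction() {
        committed.put("project", "old");        // state before the service call
        Tx tx = new Tx();
        tx.write("project", "new");             // repositoryA.changePersistedData()
        String seenByReceiver = receiveEvent(); // event handled "fast enough"
        tx.commit();                            // commit only happens at method exit
        return seenByReceiver;
    }

    public static void main(String[] args) {
        System.out.println("receiver saw: " + sendEventInsideTransaction());
        System.out.println("committed afterwards: " + committed.get("project"));
    }
}
```

In this sketch the receiver sees `old` although `new` is committed right afterwards, which is the same pattern as the failing integration test.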
For this reason, I have to revise my previous assessment and create a new milestone + release 0.13.1 that contains the hotfix.
package com.daimler.sechub.domainA;

class ServiceX{

    @Autowired
    ServiceTransactionalX serviceTransactional;

    @IsSendingAsynchronousEvent(EVENT_ID1)
    public void myServiceMethod(){
        // the transaction is committed when this call returns
        serviceTransactional.myTransactionalServiceMethod();
        // so the event is only sent after the changes are persisted
        eventBus.sendEvent(EVENT_ID1);
    }
}

package com.daimler.sechub.domainA;

class ServiceTransactionalX{

    @Transactional
    public void myTransactionalServiceMethod(){
        repositoryA.changePersistedData();
    }
}

package com.daimler.sechub.domainA;

class ServiceY{

    @IsReceivingAsynchronousEvent(EVENT_ID1)
    public void receiveEvent1(){
        Data d = repositoryA.loadPotentialChangedPersistedData();
        doSomethingDependendingOn(d);
    }
}
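Why the split helps can be sketched the same way, again without Spring. This is an illustration under assumptions, not the actual hotfix: `CommitBeforeEventDemo`, `Tx` and the `committed` map are hypothetical stand-ins, and the event handler is modelled as a direct call. The point is only that the transactional call has already committed before the event is sent.

```java
import java.util.HashMap;
import java.util.Map;

public class CommitBeforeEventDemo {

    /** Hypothetical stand-in for a database table: holds committed values only. */
    static final Map<String, String> committed = new HashMap<>();

    /** Hypothetical stand-in for a transaction: buffers writes until commit(). */
    static class Tx {
        private final Map<String, String> pending = new HashMap<>();

        void write(String key, String value) {
            pending.put(key, value);
        }

        void commit() {
            committed.putAll(pending);
        }
    }

    /** Like ServiceTransactionalX: the transaction begins and commits inside this call. */
    static void myTransactionalServiceMethod() {
        Tx tx = new Tx();
        tx.write("project", "new"); // repositoryA.changePersistedData()
        tx.commit();                // committed before the method returns
    }

    /** Like ServiceY: reloads state from the "database" instead of reading it from the event. */
    static String receiveEvent() {
        return committed.get("project");
    }

    /** Like the reworked ServiceX: not transactional itself, sends the event after the commit. */
    static String myServiceMethod() {
        committed.put("project", "old"); // state before the service call
        myTransactionalServiceMethod();
        return receiveEvent();           // event handled after the commit
    }

    public static void main(String[] args) {
        System.out.println("receiver saw: " + myServiceMethod());
    }
}
```

Here the receiver sees `new`, no matter how fast the event is handled, because nothing is sent while the change is still uncommitted.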
Description
ProjectDeleteScenario3IntTest#super_admin_deletes_project__deletes_also_access_entries_other_domains_and_user_rolecalculation_is_done (line 58) fails in the integration tests.
This happens very often on GitHub Actions, but very seldom on Jenkins or on local machines.
At first I thought it was just a flaky test: the event took too long to be handled and the test simply did not wait long enough. But looking into the logs, something seems to be wrong (maybe a race condition) at project delete:
Local (test working):
GitHub Actions build (test failed):
TODO
At the moment this seems to happen only on project delete actions. The role "owner" is not really used at the moment and has no influence on the user or administrator role. So the impact is currently small, which means no hotfix release; the fix will just go into the next release.
But we must analyse why this happens.