camunda / camunda-bpm-platform

Flexible framework for workflow and decision automation with BPMN and DMN. Integration with Quarkus, Spring, Spring Boot, CDI.
https://camunda.com/
Apache License 2.0
4.11k stars 1.55k forks source link

Re-caching of BPMNs creates duplicate Candidate Starter identity Links and Identity Link Logs #3120

Closed StephenOTT closed 1 year ago

StephenOTT commented 1 year ago

Environment (Required on creation)

Quarkus / Spring, etc ( core engine functionality)

Tested on a 7.13 engine and a 7.18 engine

Description (Required on creation; please attach any relevant screenshots, stacktraces, log files, etc. to the ticket)

Identity Links (ACT_RU_IDENTITYLINK) and Identity Link Logs (ACT_HI_IDENTITYLINK) are inserted into the db when definitions are being re-loaded into the cache.

By default the cache is 1000 definitions (per the docs). But if you have more than 1000 definitions, the cache will evict old definitions for the newly required definitions (per the docs) https://github.com/camunda/camunda-docs-manual/blob/f1b56f62cd9afd50c43d1057d66bb867d2689ccc/content/user-guide/process-engine/deployment-cache.md

When already deployed BPMNs are re-cached (they were evicted from the cache and now are required again for some query (such as List Tasks command which does a execution.getProcessDefinition().getKey() type of action) the re-caching action generates duplicate candidate starter identity links and link logs.

Steps to reproduce (Required on creation)

  1. Set the cache capacity to 1:
@ApplicationScoped
public class CustomCacheCapacityConfig extends QuarkusProcessEngineConfiguration {
    public CustomCacheCapacityConfig() {
        setCacheCapacity(1);
    }
}
  1. Create two deployments on startup. Your BPMN would have Starter Users. Example: assume your user is stephen, This will generate 2 identity links for the Candidate user, 1 link for each BPMN.
@ApplicationScoped
public class Deployments {

    @Inject
    RepositoryService repositoryService;

    public void createDeployment1(@Observes CamundaEngineStartupEvent event) {
        repositoryService.createDeployment()
                .name("deployment-1")
                .enableDuplicateFiltering(true)
                .addClasspathResource("com/github/stephenott/myBpmn.bpmn")
                .deploy();

        repositoryService.createDeployment()
                .name("deployment-2")
                .enableDuplicateFiltering(true)
                .addClasspathResource("com/github/stephenott/myBpmn2.bpmn")
                .deploy();
    }

}

Take note on the order of the deployments. MyBpmn is deployed first. When MyBpmn2 is deployed the MyBpmn would be dropped from the cache.

  1. Now Start Instance on the MyBpmn. The engine will re-cache MyBpmn and drop MyBpmn2 and generate additional Identity Links and Link Logs for the newly cached MyBpmn

  2. Now Start Instance on the MyBpmn2. The engine will re-cache MyBpmn2 and drop MyBpmn and generate additional Identity Links and Link Logs for the new cached MyBpmn2.

Observed Behavior (Required on creation)

The RU Deployments table still only has 2 Deployments. The RU Definitions table still only has 2 Definitions.

The Identity Link Table ends up looking like this:

image

The Link Log table looks like this:

image

If you change the cache capacity to 2 so both BPMNs are cached, and rerun the scenario above, only 2 identity links and link logs are generated:

image

Expected behavior (Required on creation)

Expected behaviour is Identity Links and link logs are ONLY created on an actual deployment. Re-caching should not be generating new identity links and link logs.

The Identity Link and Identity Link Log tables should have only 2 rows in the example above.

This has not been tested on other scenarios such as DMNs, Task Candidates, Assignees, etc.

Root Cause (Required on prioritization)

The identity links for candidate users or groups are added always no matter if the process definitions were already persisted or not: https://github.com/camunda/camunda-bpm-platform/blob/master/engine/src/main/java/org/camunda/bpm/engine/impl/bpmn/deployer/BpmnDeployer.java#L147-L148 The other cases (decision definitions and tasks) look fine on first glance, please double check.

Solution Ideas

Hints

Links

Breakdown

- [ ] https://github.com/camunda/camunda-bpm-platform/pull/3224
yanavasileva commented 1 year ago

Hi @StephenOTT,

Thank you for reaching out to us with this. How did you find about this situation, what is your "normal" load of definitions?

I will have a look at the report and get back to you. Please note that there might be some delay in the processing due to other team's responsibilities. Thank you for your understanding.

Best, Yana

StephenOTT commented 1 year ago

@yanavasileva this re-inserting/duplicating of candidate starters becomes apparent when working in a large number of engine instances (containers) env that each manage their own cache (no distributed cache) and there are more definition deployments then there is space in the cache.

Example: The cache is default to 1000. But you have 1500 definitions.

StephenOTT commented 1 year ago

@yanavasileva any movement on this? Can you confirm this is a bug?

yanavasileva commented 1 year ago

@StephenOTT I was able to confirm the bug and updated relevant information in the description. Now I will send the report for a decision. Are you interested in providing pull request on the topic?

yanavasileva commented 1 year ago
Test case

```java package org.camunda.bpm.engine.test.api.cfg; import static org.assertj.core.api.Assertions.assertThat; import org.camunda.bpm.engine.ManagementService; import org.camunda.bpm.engine.RepositoryService; import org.camunda.bpm.engine.RuntimeService; import org.camunda.bpm.engine.TaskService; import org.camunda.bpm.engine.impl.cfg.ProcessEngineConfigurationImpl; import org.camunda.bpm.engine.runtime.ProcessInstance; import org.camunda.bpm.engine.test.util.ProcessEngineBootstrapRule; import org.camunda.bpm.engine.test.util.ProcessEngineTestRule; import org.camunda.bpm.engine.test.util.ProvidedProcessEngineRule; import org.camunda.bpm.model.bpmn.Bpmn; import org.camunda.bpm.model.bpmn.BpmnModelInstance; import org.junit.Before; import org.junit.ClassRule; import org.junit.Rule; import org.junit.Test; import org.junit.rules.RuleChain; public class DeploymentCacheCfgTest { @ClassRule public static ProcessEngineBootstrapRule cacheFactoryBootstrapRule = new ProcessEngineBootstrapRule(configuration -> { // apply configuration options here configuration.setCacheCapacity(1); }); protected ProvidedProcessEngineRule cacheFactoryEngineRule = new ProvidedProcessEngineRule(cacheFactoryBootstrapRule); protected ProcessEngineTestRule testRule = new ProcessEngineTestRule(cacheFactoryEngineRule); @Rule public RuleChain ruleChain = RuleChain.outerRule(cacheFactoryEngineRule).around(testRule); RepositoryService repositoryService; ProcessEngineConfigurationImpl processEngineConfiguration; RuntimeService runtimeService; TaskService taskService; ManagementService managementService; public static final BpmnModelInstance ONE_TASK_PROCESS = Bpmn.createExecutableProcess("Process") .startEvent("startEvent") .userTask().name("User Task") .endEvent("endEvent") .done(); public static final BpmnModelInstance ONE_TASK_PROCESS2 = Bpmn.createExecutableProcess("Process2") .startEvent("startEvent") .userTask().name("User Task") .endEvent("endEvent") .done(); @Before public void initialize() { repositoryService = cacheFactoryEngineRule.getRepositoryService(); processEngineConfiguration = cacheFactoryEngineRule.getProcessEngineConfiguration(); runtimeService = cacheFactoryEngineRule.getRuntimeService(); } @Test public void shouldNotAddIdentityLinksAfterRecache() { org.camunda.bpm.model.bpmn.instance.Process model = ONE_TASK_PROCESS.getModelElementById("Process"); model.setCamundaCandidateStarterUsers("demo"); model = ONE_TASK_PROCESS2.getModelElementById("Process2"); model.setCamundaCandidateStarterUsers("demo2"); repositoryService.createDeployment() .name("deployment-1") .enableDuplicateFiltering(true) .addModelInstance("process.bpmn", ONE_TASK_PROCESS) .deploy(); repositoryService.createDeployment() .name("deployment-2") .enableDuplicateFiltering(true) .addModelInstance("process2.bpmn", ONE_TASK_PROCESS2) .deploy(); // when ProcessInstance processInstance1 = runtimeService.startProcessInstanceByKey("Process"); ProcessInstance processInstance2 = runtimeService.startProcessInstanceByKey("Process2"); // then assertThat(repositoryService.getIdentityLinksForProcessDefinition(processInstance1.getProcessDefinitionId()).size()) .isEqualTo(1); assertThat(repositoryService.getIdentityLinksForProcessDefinition(processInstance2.getProcessDefinitionId()).size()) .isEqualTo(1); } } ```

StephenOTT commented 1 year ago

@yanavasileva would a fix be to implement similar logic as the Start Message subscriptions? There is logic in the parser that checks if the subscription already exists and if yes, then do not create another subscription (which basically means a a SELECT on each re-cache/parse) or should identity links only be generated on original deployment?

from our initial review, there seems to be a few different variations on a fix depending on the when actions are supposed to occur

Is there a defined spec on when a process definition MUST be in the cache? Example: In order to get a Definition Start Form, it must be pulled from the cache (per the start from CMD).

But we originally found this issue when querying for user tasks: "give me my user tasks that i am assigned to / am a candidate user/group". This seems to also trigger the need for the process definition to be cached (as there is a cmd somewhere that is going ".getProcessDefinition".

yanavasileva commented 1 year ago

Hi @StephenOTT,

The fix would be to insert the identity links only if the deployment is new. In the described use case, the identity links do not change after the re-caching so no further action is needed. Let us know if you are interested in providing a fix as pull request.

Is there a defined spec on when a process definition MUST be in the cache?

The closes thing I can find as a definition is the docs:

All process definitions are cached (after they have been parsed) to avoid polling the database every time a process definition is needed and because process definition data doesn’t change. This reduces the latency of referencing the process definitions and thus improves the performance of the system.

Source: https://docs.camunda.org/manual/7.18/user-guide/process-engine/deployment-cache/

Whenever we can avoid querying form the database, we use the cache. Same goes for the start form example, the form cannot change since the deployment was cached. In contrary, the identity links do change after the deployment so they need to be fetched from the database. Disclaimer, there might be places in the engine that above is not followed of course.

Please let me know in case of any questions.

Best regards, Yana

yanavasileva commented 1 year ago

The fix will be shipped with the upcoming 7.19.0 minor release scheduled for next month.