holunda-io / camunda-bpm-taskpool

Library for pooling user tasks and process related business objects.
https://www.holunda.io/camunda-bpm-taskpool/
Apache License 2.0

Filtering by DataEntry payload does not work with the JPA view #942

Closed S-Tim closed 3 months ago

S-Tim commented 8 months ago

Steps to reproduce

Note: the described behavior only considers the JPA view for now. I have not investigated how the other views behave yet.

Expected behaviour

According to the docs, filters that target the payload do not have a prefix (regardless of whether this concerns task payload or data entry payload). So one would expect that a filter like foo=bar returns only tasks that have a payload property with name foo and value bar, either in the task payload or in the payload of correlated data entries. Another question that comes to mind is whether the filters should consider the data entry payload in all queries. I would think that would be the expected behavior, but currently the data entry payload is only considered in queries that also return DataEntries (although in a wrong way).

Actual behaviour

Currently, if the query does not return a TasksWithDataEntriesQueryResult (for example the TasksForUserQuery), the data entries and their payload are not considered at all. I would think that even if I don't care about the data entries themselves (meaning I only want tasks), I would still expect the filters to work on both payloads.
In queries where data entries are considered, for example TasksWithDataEntriesForUserQuery, the filtering behavior is also not correct. In those cases, when a filter like foo=bar is provided, the tasks are first filtered by whether their task payload satisfies this condition. Then the data entries of the tasks that fulfill the predicate are loaded and filtered by whether their payload satisfies the predicate as well.

Example:

  - DataEntry D1 with payload foo=bar
  - DataEntry D2 with payload bar=baz
  - Task T1 with empty task payload and correlation to DataEntry D1
  - Task T2 with task payload foo=bar and correlation to DataEntry D1 and DataEntry D2

Query TasksWithDataEntries with filter foo=bar:

  1. Tasks are filtered by foo=bar, leaving only T2
  2. T2's data entries are filtered by foo=bar, leaving only D1
  3. Result would be T2 with DataEntry D1

I would expect to see T1(dataEntries[D1]), T2(dataEntries[D1, D2]). I find it confusing to filter the DataEntry list of the tasks. I would only filter the list of tasks (also looking at the data entry payload) and then return the tasks with all their correlated data entries.
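
To make the difference concrete, here is a minimal in-memory sketch of the two filtering strategies applied to the example above. It does not use the library's classes: `Task`, `DataEntry`, `Filter`, `currentBehaviour`, `expectedBehaviour`, `render` and `sampleTasks` are hypothetical names made up for illustration, and the filter is reduced to a single equality check.

```kotlin
// Simplified stand-ins for illustration only; these are not the library's entities.
data class DataEntry(val id: String, val payload: Map<String, String>)
data class Task(val id: String, val payload: Map<String, String>, val dataEntries: List<DataEntry>)

// A single equality filter such as foo=bar.
data class Filter(val name: String, val value: String) {
    fun matches(payload: Map<String, String>): Boolean = payload[name] == value
}

// Behaviour as described in this issue: tasks are filtered by their own payload,
// then the data entries of the remaining tasks are filtered by the same predicate.
fun currentBehaviour(tasks: List<Task>, filter: Filter): List<Task> =
    tasks
        .filter { filter.matches(it.payload) }
        .map { task -> task.copy(dataEntries = task.dataEntries.filter { filter.matches(it.payload) }) }

// Behaviour I would expect: a task is kept if its own payload or the payload of any
// correlated data entry matches, and it is returned with all of its data entries.
fun expectedBehaviour(tasks: List<Task>, filter: Filter): List<Task> =
    tasks.filter { task ->
        filter.matches(task.payload) || task.dataEntries.any { filter.matches(it.payload) }
    }

// Renders a result in the notation used in this issue, e.g. T2(dataEntries[D1, D2]).
fun render(tasks: List<Task>): String =
    tasks.joinToString(", ") { task ->
        val entryIds = task.dataEntries.joinToString(", ") { it.id }
        "${task.id}(dataEntries[$entryIds])"
    }

// The example data from this issue.
fun sampleTasks(): List<Task> {
    val d1 = DataEntry("D1", mapOf("foo" to "bar"))
    val d2 = DataEntry("D2", mapOf("bar" to "baz"))
    return listOf(
        Task("T1", emptyMap(), listOf(d1)),
        Task("T2", mapOf("foo" to "bar"), listOf(d1, d2))
    )
}

fun main() {
    val filter = Filter("foo", "bar")
    println(render(currentBehaviour(sampleTasks(), filter)))
    println(render(expectedBehaviour(sampleTasks(), filter)))
}
```

The only difference between the two functions is where the data entry payload is evaluated: as part of the task predicate, or as a second filter on the already reduced result.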

Solution approaches

In general, I think the cleanest solution would be to separate the filters for the different payloads, for example a task.payload. prefix for filters on the task payload and a data.payload. prefix for filters on the data entry payload. This would also massively benefit the performance in the JPA view because the join would be smaller, especially if there are many payload attributes. This would of course be a breaking change for the filtering API.

A less breaking but imo slightly less clean approach would be to only specify a new prefix for data entry payload filters, for example dataPayload.*, and leave everything else the same. This would leave much of the API untouched and, at least for the JPA view, would not really alter the current filtering behavior.
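
As a rough sketch of what the prefix approach could look like on the parsing side: the snippet below routes a raw name=value filter either to the task payload or to the data entry payload. `Target`, `PayloadFilter`, `parseFilter` and `DATA_PAYLOAD_PREFIX` are hypothetical names, not part of the existing API, and only plain equality filters are handled.

```kotlin
// Hypothetical filter targets; the names are illustrative only.
enum class Target { TASK_PAYLOAD, DATA_ENTRY_PAYLOAD }

data class PayloadFilter(val target: Target, val name: String, val value: String)

// Prefix from the proposal above. Anything without it keeps today's meaning
// and is treated as a task payload filter.
const val DATA_PAYLOAD_PREFIX = "dataPayload."

fun parseFilter(raw: String): PayloadFilter {
    // Split on the first '=' only, so values may themselves contain '='.
    val parts = raw.split("=", limit = 2)
    require(parts.size == 2) { "Expected name=value but got '$raw'" }
    val (key, value) = parts
    return if (key.startsWith(DATA_PAYLOAD_PREFIX)) {
        PayloadFilter(Target.DATA_ENTRY_PAYLOAD, key.removePrefix(DATA_PAYLOAD_PREFIX), value)
    } else {
        PayloadFilter(Target.TASK_PAYLOAD, key, value)
    }
}
```

With this split, the JPA view would only need to join the data entry payload attributes when at least one filter actually targets them.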

A third option would be to actually implement the filters as described in the docs. This would require changing the specification building in criteria.toTaskSpecification() and fun hasTaskPayloadAttribute(name: String, values: List<String>): Specification<TaskEntity>. Here one would have to join not only the task attributes, as is done right now, but also the plf_task_correlations and then the plf_data_entry_payload_attributes tables, and evaluate the predicate on the result. This would keep the API the same, but the performance would suffer a lot, at least in the JPA view.
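
The join path for this third option can be illustrated without JPA by modelling the three tables as flat row lists. This is only a sketch of the logic, not the actual Specification: the row classes and their columns (`taskId`, `entryId`, `path`, `value`) are assumptions and will not match the real schema, and "value is one of `values`" is an assumed reading of the `values` parameter.

```kotlin
// Hypothetical row shapes; the real tables have different and additional columns.
data class TaskAttributeRow(val taskId: String, val path: String, val value: String)
// Stands in for plf_task_correlations.
data class TaskCorrelationRow(val taskId: String, val entryId: String)
// Stands in for plf_data_entry_payload_attributes.
data class DataEntryAttributeRow(val entryId: String, val path: String, val value: String)

fun matchingTaskIds(
    taskAttributes: List<TaskAttributeRow>,
    correlations: List<TaskCorrelationRow>,
    dataEntryAttributes: List<DataEntryAttributeRow>,
    name: String,
    values: List<String>
): Set<String> {
    // Branch 1: the task payload attributes, which is what is joined right now.
    val byTaskPayload = taskAttributes
        .filter { it.path == name && it.value in values }
        .map { it.taskId }

    // Branch 2: two additional hops, task -> correlation -> data entry attribute.
    val matchingEntryIds = dataEntryAttributes
        .filter { it.path == name && it.value in values }
        .map { it.entryId }
        .toSet()
    val byDataEntryPayload = correlations
        .filter { it.entryId in matchingEntryIds }
        .map { it.taskId }

    // A task qualifies if either branch matches.
    return (byTaskPayload + byDataEntryPayload).toSet()
}
```

In the database, the second branch is what makes the query expensive, since every payload filter then touches the correlation table and the data entry attribute table in addition to the task attributes.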

zambrovski commented 4 months ago

We prefer a non-breaking version of just fixing the described feature (third option first)