Closed S-Tim closed 11 months ago
After some more investigation, it turns out to be wrong that only one join is needed: one join per attribute is needed. That means that for the filters
customer=ABC
customer=DEF
customer=GHI
foo=bar
two joins are needed, one for customer and one for foo. The DB optimizes these multiple joins because the criteria are AND-composed and can therefore be short-circuited.
In the current implementation, a join is also performed for every OR on the value of an attribute. To evaluate this OR, all the joins have to be performed, which causes massive performance hits. This could be mitigated by joining only once per attribute and then using OR or IN in the query.
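The mitigation described above could look roughly like the following. This is a minimal sketch, not the project's code: the class `PayloadFilterGrouping`, its methods, and the column names `task_id`, `attribute_name` and `attribute_value` are assumptions made for illustration. It groups `name=value` filters by attribute name and emits one join per attribute, with all values of that attribute in a single IN list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names and column names are assumptions, not the real schema.
final class PayloadFilterGrouping {

    /** Groups "name=value" filters by attribute name, keeping first-seen order. */
    static Map<String, List<String>> groupByAttribute(List<String> filters) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String filter : filters) {
            int idx = filter.indexOf('=');
            if (idx <= 0) {
                throw new IllegalArgumentException("Expected name=value but got: " + filter);
            }
            grouped.computeIfAbsent(filter.substring(0, idx), k -> new ArrayList<>())
                   .add(filter.substring(idx + 1));
        }
        return grouped;
    }

    /** Renders one join per attribute; values of the same attribute share an IN list. */
    static String toSql(Map<String, List<String>> grouped) {
        StringBuilder sql = new StringBuilder("select distinct t.* from plf_task t");
        int i = 0;
        for (List<String> values : grouped.values()) {
            String alias = "a" + i++;
            // One "?" per value; the name and values are meant to be bound, not inlined.
            String placeholders = String.join(", ", Collections.nCopies(values.size(), "?"));
            sql.append(" join plf_task_payload_attributes ").append(alias)
               .append(" on ").append(alias).append(".task_id = t.id")
               .append(" and ").append(alias).append(".attribute_name = ?")
               .append(" and ").append(alias).append(".attribute_value in (")
               .append(placeholders).append(")");
        }
        return sql.toString();
    }
}
```

For the four filters from the example, this yields two joins: one for customer with three placeholders in its IN list, and one for foo with a single placeholder.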
Steps to reproduce
Expected behaviour
In order to filter based on payload attributes, one join between the plf_task and plf_task_payload_attributes tables has to be performed. On this join result, all the filter criteria for payload attributes can be evaluated.
Actual behaviour
Each filter criterion is transformed into a specification by itself,
and for each of these criteria a join is performed.
This results in the following pattern in the generated SQL:
In the above example, two filter criteria on the task payload were given, which results in consecutive joins of the task and payload tables:
((plf_task join plf_task_payload_attributes) join plf_task_payload_attributes)
This means that the resulting joined table grows exponentially with the number of filters, even though only one single join of these tables would be necessary.
One way to eliminate this would be to build the specification from the whole list of criteria on the payload and join only once there. The question is what the signature of that method would look like. We wouldn't really want to determine the composition logic in the method itself, so we would have to find a way to represent the composition in the parameters to make the conversion to the JPA specification more generic.
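One possible answer to the signature question is to pass the composition in as a small criteria tree. The following is only a sketch under stated assumptions: `PayloadCriterion`, `AttributeEquals`, `Composite` and `PayloadQuerySketch` are hypothetical names, the column names are guesses, and it renders plain SQL text rather than a JPA specification so that it stays self-contained. The converter does not know the composition logic; it assigns one join alias per distinct attribute name and lets the tree render its own AND/OR structure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical criteria tree: the AND/OR composition is passed in as data. */
interface PayloadCriterion {
    /** Registers a join alias for every attribute name used below this node. */
    void collectAttributes(Map<String, String> aliases);

    /** Renders the predicate and appends its bind values to params in order. */
    String render(Map<String, String> aliases, List<Object> params);
}

final class AttributeEquals implements PayloadCriterion {
    private final String name;
    private final String value;

    AttributeEquals(String name, String value) { this.name = name; this.value = value; }

    public void collectAttributes(Map<String, String> aliases) {
        // Repeated attribute names reuse the alias assigned on first sight.
        if (!aliases.containsKey(name)) aliases.put(name, "a" + aliases.size());
    }

    public String render(Map<String, String> aliases, List<Object> params) {
        params.add(value);
        return aliases.get(name) + ".attribute_value = ?";
    }
}

final class Composite implements PayloadCriterion {
    private final String operator; // "and" or "or"
    private final List<PayloadCriterion> children;

    Composite(String operator, List<PayloadCriterion> children) {
        if (children.isEmpty()) throw new IllegalArgumentException("no children");
        this.operator = operator;
        this.children = children;
    }

    public void collectAttributes(Map<String, String> aliases) {
        for (PayloadCriterion c : children) c.collectAttributes(aliases);
    }

    public String render(Map<String, String> aliases, List<Object> params) {
        List<String> parts = new ArrayList<>();
        for (PayloadCriterion c : children) parts.add(c.render(aliases, params));
        return "(" + String.join(" " + operator + " ", parts) + ")";
    }
}

final class PayloadQuerySketch {
    /** One join per distinct attribute; the tree itself supplies the where clause. */
    static String toSql(PayloadCriterion root, List<Object> params) {
        Map<String, String> aliases = new LinkedHashMap<>();
        root.collectAttributes(aliases);
        StringBuilder sql = new StringBuilder("select distinct t.* from plf_task t");
        for (Map.Entry<String, String> e : aliases.entrySet()) {
            String a = e.getValue();
            // Left join so that an OR across attributes keeps tasks lacking one of them.
            sql.append(" left join plf_task_payload_attributes ").append(a)
               .append(" on ").append(a).append(".task_id = t.id and ")
               .append(a).append(".attribute_name = ?");
            params.add(e.getKey());
        }
        return sql.append(" where ").append(root.render(aliases, params)).toString();
    }
}
```

With a tree for (customer=ABC or customer=DEF or customer=GHI) and foo=bar, this sketch produces two joins regardless of how many values are ORed per attribute. A JPA version could follow the same shape, with each node returning a predicate instead of a string, but that mapping is not shown here.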