holunda-io / camunda-bpm-taskpool

Library for pooling user tasks and process related business objects.
https://www.holunda.io/camunda-bpm-taskpool/
Apache License 2.0

JPA Data View: DataEntryEntity has too many eager to-many associations #827

Closed lbilger closed 1 year ago

lbilger commented 1 year ago

Steps to reproduce

Expected behaviour

Data entry updates are always processed quickly.

Actual behaviour

When data entries are updated frequently, they accumulate many protocol entries (about 1700 in our case, to give a feeling for the order of magnitude at which we are experiencing this issue). Event processing then becomes very slow, and the event processor falls behind the others even under light load. Data entry updates take minutes to be processed by the JPA view.

Debugging this, I found that the protocol list of the DataEntryEntity had more than 20,000 entries after loading from the database, containing each protocol entry that actually existed 13 times. This is due to eager initialization of the payloadAttributes, authorizedPrincipals and protocol collections with fetch joins. The database returns the Cartesian product of all the joined tables, in our case with 1 authorizedPrincipal and 13 payloadAttributes. protocol is mapped as a List instead of a Set, so all those rows (13 * 1700 = 22,100) are added to the collection. This seems to make processing very slow, either because of the huge amount of data transferred from the database or because of the processing of the large list; I don't know which it is.
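To make the row multiplication concrete, here is a small plain-Java simulation using the numbers from this report. It does not touch Hibernate or a database; the class and method names are made up for illustration, and the loop only mimics what happens when every joined row contributes its protocol column to a List that keeps duplicates.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the taskpool library.
final class CartesianFetchDemo {

    /** Rows returned when one parent row is joined with three independent to-many collections. */
    static long joinedRowCount(int protocolEntries, int payloadAttributes, int authorizedPrincipals) {
        // An outer join still yields one row for an empty collection, hence max(1, n).
        return (long) Math.max(1, protocolEntries)
                * Math.max(1, payloadAttributes)
                * Math.max(1, authorizedPrincipals);
    }

    /** Collects the protocol id of every joined row into a List, duplicates included. */
    static List<Integer> protocolAsList(int protocolEntries, int payloadAttributes, int authorizedPrincipals) {
        List<Integer> protocol = new ArrayList<>();
        for (int p = 0; p < protocolEntries; p++) {
            for (int a = 0; a < payloadAttributes; a++) {
                for (int u = 0; u < authorizedPrincipals; u++) {
                    protocol.add(p); // same protocol entry, once per attribute/principal combination
                }
            }
        }
        return protocol;
    }

    public static void main(String[] args) {
        List<Integer> asList = protocolAsList(1700, 13, 1);
        Set<Integer> asSet = new LinkedHashSet<>(asList);
        System.out.println("rows transferred: " + joinedRowCount(1700, 13, 1)); // 22100
        System.out.println("List size: " + asList.size());                      // 22100
        System.out.println("Set size: " + asSet.size());                        // 1700
    }
}
```

Note that a Set would hide the duplicates in the resulting collection, but the number of rows transferred from the database stays the same.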

Adding the annotation @Fetch(FetchMode.SELECT) to protocol fixes the performance issue for us. Unfortunately, that's a vendor-specific annotation from Hibernate, but it seems we are locked to Spring and thereby Hibernate anyway.

I'm not sure, though, whether this degrades performance on the query side, when lots of DataEntries with short protocols are queried. We could consider adding the @BatchSize annotation as well to mitigate this. But maybe the fetch mode and fetch type defined on the entity are overridden by the Criteria/Specification query anyway?
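For illustration, the mapping change could look roughly like the sketch below. This is not the actual DataEntryEntity: the class name, the use of plain String elements, the jakarta.persistence namespace (javax.persistence on older stacks) and the batch size of 50 are all assumptions. It needs the JPA API and Hibernate on the classpath.

```java
import jakarta.persistence.ElementCollection;
import jakarta.persistence.Entity;
import jakarta.persistence.FetchType;
import jakarta.persistence.Id;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.hibernate.annotations.BatchSize;
import org.hibernate.annotations.Fetch;
import org.hibernate.annotations.FetchMode;

// Simplified, hypothetical stand-in for the real entity.
@Entity
public class DataEntrySketch {

    @Id
    String id;

    // Small collections: left with the default eager loading strategy.
    @ElementCollection(fetch = FetchType.EAGER)
    Set<String> authorizedPrincipals = new HashSet<>();

    @ElementCollection(fetch = FetchType.EAGER)
    Set<String> payloadAttributes = new HashSet<>();

    // Potentially long collection: still eager, but loaded with a separate
    // SELECT so its rows are not multiplied by the other joined collections.
    @ElementCollection(fetch = FetchType.EAGER)
    @Fetch(FetchMode.SELECT)
    // Hibernate-specific: initialize the protocols of up to 50 owners per
    // secondary select when many entities are loaded (the value is a guess).
    @BatchSize(size = 50)
    List<String> protocol = new ArrayList<>();
}
```

Whether these entity-level settings are honoured by the Criteria/Specification queries on the query side would still have to be checked against the generated SQL.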

lbilger commented 1 year ago

I also tried out changing the type of the protocol collection to a Set and leaving out the @Fetch annotation, but then the processor started falling behind again. So it seems to be the loading of so many rows that requires so much time.

Not directly related, but it might also be a good idea to have a setting to optionally truncate the protocol if it becomes very long.
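A truncation setting like this could be as simple as keeping only the newest N entries. The sketch below is plain Java with a hypothetical helper name and a hypothetical maxEntries parameter; it assumes the protocol is ordered oldest first and treats a non-positive limit as "truncation disabled".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the taskpool library.
final class ProtocolTruncation {

    /**
     * Returns a copy holding at most maxEntries elements, keeping the newest ones.
     * Assumes the input is ordered oldest first; maxEntries <= 0 disables truncation.
     */
    static <T> List<T> truncate(List<T> protocol, int maxEntries) {
        if (maxEntries <= 0 || protocol.size() <= maxEntries) {
            return new ArrayList<>(protocol);
        }
        int from = protocol.size() - maxEntries;
        return new ArrayList<>(protocol.subList(from, protocol.size()));
    }

    public static void main(String[] args) {
        List<String> protocol = List.of("created", "updated", "updated", "completed");
        System.out.println(truncate(protocol, 2)); // [updated, completed]
        System.out.println(truncate(protocol, 0)); // unchanged copy
    }
}
```

Where such a step would run (when the event is applied in the view, or as a periodic cleanup) is a separate design decision.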