kitodo / kitodo-production

Kitodo.Production is a workflow management tool for mass digitization and is part of the Kitodo Digital Library Suite.
http://www.kitodo.org/software/kitodoproduction/
GNU General Public License v3.0

Slow process template page access #5336

Open henning-gerhardt opened 2 years ago

henning-gerhardt commented 2 years ago

Describe the bug: The first access of the process template page is slow even though only 10 process templates are shown; I'd guess it takes about 30 seconds. I don't know whether the time depends on how many projects and (indirectly) processes are assigned to a process template, or whether something else is responsible for the slow display. If you open the process template page again later in the same login session, you get the results a lot faster. After logging out and logging in again, the first access is slow again.

To Reproduce Steps to reproduce the behavior:

  1. Create some process templates and assign them to processes
  2. Logout / Login
  3. Open the process template page
  4. Wait a long time to see all process templates

Expected behavior: Process templates should not take "ages" to be displayed.

Release 3.4.4-SNAPSHOT

BartChris commented 2 years ago

It seems that the function isTemplateUsed is called four times for every template displayed in templateList.xhtml, e.g.

https://github.com/kitodo/kitodo-production/blob/f58592bfe58d58bfc6aa72399574cbcfd3102e65/Kitodo/src/main/webapp/WEB-INF/templates/includes/projects/templateList.xhtml#L84

https://github.com/kitodo/kitodo-production/blob/f58592bfe58d58bfc6aa72399574cbcfd3102e65/Kitodo/src/main/webapp/WEB-INF/templates/includes/projects/templateList.xhtml#L107

That would probably mean 40 database (Elasticsearch?) queries in your case (4 calls for each of the 10 templates):

https://github.com/kitodo/kitodo-production/blob/f58592bfe58d58bfc6aa72399574cbcfd3102e65/Kitodo/src/main/java/org/kitodo/production/forms/TemplateForm.java#L414

henning-gerhardt commented 2 years ago

Thank you for your research, @BartChris. I can more or less confirm that the call in the TemplateForm class causes a lot of delay while displaying the result list. Maybe using ElasticSearch for this kind of query is not a good solution: first all search results (or at least up to a maximum of 10000) are returned and transformed into Java objects, only to then decide whether the returned list is empty or not. If ElasticSearch must be used, the result should be reused in the xhtml file instead of running the query over and over again. Reusing the result should be considered in any case, regardless of where the search is done.
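To illustrate the idea of reusing the result, here is a minimal sketch of a per-view cache. All names in it (TemplateUsageCache, TemplateUsageCacheDemo, the lookup predicate) are hypothetical and not taken from Kitodo.Production; the predicate merely stands in for whatever expensive search is executed today.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

/** Hypothetical cache that remembers the answer per template id (not Kitodo code). */
class TemplateUsageCache {
    private final Map<Integer, Boolean> usedByTemplateId = new HashMap<>();
    private final IntPredicate expensiveLookup;

    TemplateUsageCache(IntPredicate expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    /** Runs the expensive lookup only on the first call for a given template id. */
    boolean isTemplateUsed(int templateId) {
        return usedByTemplateId.computeIfAbsent(templateId, id -> expensiveLookup.test(id));
    }
}

class TemplateUsageCacheDemo {
    /** Simulates a list page where every row asks the same question several times. */
    static int countLookups(int templateCount, int callsPerTemplate) {
        AtomicInteger lookups = new AtomicInteger();
        // Assumption: the lambda stands in for the real search index or database query.
        TemplateUsageCache cache = new TemplateUsageCache(id -> {
            lookups.incrementAndGet();
            return id % 2 == 0;
        });
        for (int id = 1; id <= templateCount; id++) {
            for (int call = 0; call < callsPerTemplate; call++) {
                cache.isTemplateUsed(id);
            }
        }
        return lookups.get();
    }

    public static void main(String[] args) {
        System.out.println(countLookups(10, 4)); // prints 10
    }
}
```

With 10 templates and four calls per row, the expensive lookup runs 10 times instead of 40. Where such a cache would live (for example as a field of a view-scoped bean) and when it should be cleared are open design questions that this sketch does not answer.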

BartChris commented 2 years ago

I am wondering whether this flag ("is_used_by_processes") should simply be a property of the template, set to NULL for existing templates. It would then only have to be calculated once and stored in the database. Whenever a process is deleted, it could be checked whether the template is still referenced by a process, and the property could be changed if necessary. This would duplicate information that is already contained in the database, but it would speed up this use case.
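A rough sketch of how such a stored flag could be maintained, using in-memory maps as a stand-in for the template and process tables. The class and method names are hypothetical, and a missing map entry plays the role of the NULL value described above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of a stored "is_used_by_processes" flag (not Kitodo code). */
class TemplateUsageFlags {
    // processId -> templateId, standing in for the process table
    private final Map<Integer, Integer> templateOfProcess = new HashMap<>();
    // templateId -> stored flag; a missing entry means "not calculated yet" (NULL)
    private final Map<Integer, Boolean> usedFlag = new HashMap<>();

    /** Creating a process marks its template as used. */
    void addProcess(int processId, int templateId) {
        templateOfProcess.put(processId, templateId);
        usedFlag.put(templateId, Boolean.TRUE);
    }

    /** Deleting a process re-checks whether its template is still referenced. */
    void deleteProcess(int processId) {
        Integer templateId = templateOfProcess.remove(processId);
        if (templateId != null) {
            usedFlag.put(templateId, templateOfProcess.containsValue(templateId));
        }
    }

    /** Calculates the flag once if it is missing, then returns the stored value. */
    boolean isTemplateUsed(int templateId) {
        return usedFlag.computeIfAbsent(templateId, id -> templateOfProcess.containsValue(id));
    }

    public static void main(String[] args) {
        TemplateUsageFlags flags = new TemplateUsageFlags();
        flags.addProcess(1, 7);
        flags.addProcess(2, 7);
        flags.deleteProcess(1);
        System.out.println(flags.isTemplateUsed(7)); // prints true
        flags.deleteProcess(2);
        System.out.println(flags.isTemplateUsed(7)); // prints false
    }
}
```

The trade-off is the one already mentioned: the flag duplicates information, so every code path that creates, deletes, or reassigns a process has to keep it in sync.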

henning-gerhardt commented 2 years ago

If this information (whether a template is used by processes or not) is fetched from the database instead of ElasticSearch, you do not need such a field: it can be retrieved with a good database query (a left or right join, depending on which side you start the query from), which is also possible through Hibernate, the database layer in use.