leih-lokal / LeihLokalVerwaltung

Easy management of products, rentals and customers for your Library of Things ("Leihladen")
MIT License
7 stars 1 forks source link

Query performance analysis #521

Open muety opened 2 months ago

muety commented 2 months ago

Analyses

Users page search

When searching for a keyword (e.g. "bob") on the users page, three Mango queries are issued. The second and third only run after the first one has completed, but at least run in parallel then.

Use case: searching for a name

  1. query: find (at max 25) users that match the search string (~ 400 ms)
  2. query: count all users that match the search string to populate pagination control (~ 690 ms)
  3. query: count rentals for the users returned in query 1 to populate "Ausleihen gesamt" column (~ 2850 ms)

users_query1_example.json

users_query2_example.json

users_query3_example.json

Items page search

When searching for a keyword (e.g. "scanner") on the users page, three Mango queries are issued. The second and third only run after the first one has completed, but at least run in parallel then.

Use case: searching for an item name

  1. query: find (at max 25) items whose name, brand, item type or synonyms match the search string (~ 270 ms)
  2. query: count all items that match the criteria from query 1 to populate pagination controls (~ 330 ms)
  3. query: count rentals for the items returned in query 1 to populate the "Anzahl Ausleihen" column (~ 2400 ms)

Rentals page seach

When searching for a keyword (e.g. "bohrmaschine") on the rentals page, four Mango queries are issued. The second, third and fourth only run after the first one has completed, but at least run in parallel then.

Use case: searching for a rental by item name

  1. query: find (at max 25) rentals whose item name or customer name match the search string (~ 2000 ms)
  2. query: count all rentals that match the criteria from query 1 to populate pagination controls (~ 2000 ms)
  3. query: find the item highlight colors for the rentals returned by query 1 (~ 440 ms)
  4. query: find the customer highlight colors for the rentals returned by query 1 (~ 770 ms)

For ideas on optimizing query 1, see #516.

Optimizations

Option 1

At least in the case of users- and items search, query 3 makes up the majority of total request time. We might either get rid of it entirely by simply dropping those optional columns or making them "lazy-load" (e.g. "Click to see details ..."). Or we might optimize the query itself.

The major issue here is that CouchDB can't utilize indexes for $or- or $in queries. However, we might benefit from this "poor man's indexing" hack described here.

If we rewrite query 3 for users- and items page (they're analogous to each other) from an array of $ors to using $in and then placing additional filters for an ID range, querying might be sped up vastly in some cases (not all, though).

For example, the following query can be rewritten as shown to run ~ 6x as fast. In case only one single user (or item) matches, speed up can be even higher, i.e. from seconds to < 100 ms.

Before: query_opt1_prev.json

Screenshot from 2024-09-02 22-50-13

After: query_opt1_after.json

Screenshot from 2024-09-02 22-52-20

However, in a case where the result IDs are far apart (e.g. [1, 9999]), no speed up will be achieved at all.

Option 2

...

Option n

...

@skjerns @daniel17903 @ColofoniusRegenschein Jfyi

ColofoniusRegenschein commented 2 months ago

Hey @muety would it be possible to make a cronjob of the 3rd query for users and items that runs daily at late evening, as these infos are not changing more than once daily?

muety commented 2 months ago

Yes, totally possible, but imo more effort to realize than lazy-loading them on the fly or optimizing the query as described above.

ColofoniusRegenschein commented 2 months ago

Now that I'm thinking more about it. Having the number of rentals for each user isn't really a neccessary info. Maybe we could drop it - at least temporary? Having the number of rentals for each item is only used when deciding to remove items to sell them on the fleamarket.

We could easily remove it temporary to speed things up and work on it.

muety commented 2 months ago

Having the number of rentals for each item is only used when deciding to remove items to sell them on the fleamarket.

I think it also gives a bit of guidance when deciding about which of many articles (e.g. drills) to rent out to the user if they don't have a specific preference. High rental number has been sort of an indicator for good "quality" of the item for me.

I'll probably (temporarily) remove the column on the users page, but keep it on the items page and try to "lazy-load" it instead.

Before #514 we already used to have "lazy loading" of these columns, but in a very inefficient way though (one request per table cell). I refactored it to just be a single request, but at the cost of the request becoming a blocking one.