nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
27.44k stars 4.08k forks source link

[Bug]: Unified Query Allows Unwieldy SQL Statement With External Storages #38653

Open daveatpinellas opened 1 year ago

daveatpinellas commented 1 year ago

⚠️ This issue respects the following points: ⚠️

Bug description

If you have Nextcloud connected to multiple external storage devices, when you use the unified search feature it's generating a SQL statement with lots of OR statements and takes too long to complete. The CPU is pegged during this query and it continues even if you click focus out of the UI or navigate to another part of Nextcloud.

In some way probably there should be a throttle on allowing this many external storage connections in the SQL. Possibly it should break this into smaller pieces and run them in a loop to return the data.

Here is the statement that was generated, and this is taking in some cases 10 minutes to complete:

SELECT file.fileid, storage, path, path_hash, file.parent, file.name, mimetype, mimepart, size, mtime, storage_mtime, encrypted, etag, permissions, checksum, unencrypted_size FROM oc_filecache file LEFT JOIN oc_vcategory_to_object tagmap ON file.fileid = tagmap.objid LEFT JOIN oc_systemtag_object_mapping systemtagmap ON (file.fileid = systemtagmap.objectid) AND (systemtagmap.objecttype = 'files') LEFT JOIN oc_vcategory tag ON (tagmap.type = tag.type) AND (tagmap.categoryid = tag.id) AND (tag.type = 'files') AND (tag.uid = '#############') LEFT JOIN oc_systemtag systemtag ON (systemtag.id = systemtagmap.systemtagid) AND (systemtag.visibility = '1') WHERE ((tag.category  COLLATE utf8mb4_general_ci LIKE '%smith%') OR (systemtag.name  COLLATE utf8mb4_general_ci LIKE '%smith%')) AND (((storage = 12) AND ((path = 'files') OR (path LIKE 'files/%'))) OR (storage = 44) OR (storage = 47) OR (storage = 48) OR (storage = 50) OR (storage = 51) OR (storage = 54) OR (storage = 55) OR (storage = 62) OR (storage = 161) OR (storage = 205) OR (storage = 236) OR (storage = 236) OR (storage = 257) OR (storage = 294) OR (storage = 294) OR (storage = 341) OR (storage = 361)) ORDER BY mtime + '0' desc LIMIT 5

Steps to reproduce

  1. Log into an account with lots of external storage connections
  2. Click into the unified search box and enter a search
  3. Using top, watch the huge load on the server

Expected behavior

In some way, the sql statement should have a level of reasonableness that prohibits this many external storage folders to be part of the query. Or send them one after the other in a loop.

Installation method

None

Nextcloud Server version

26

Operating system

None

PHP engine version

None

Web server

None

Database engine version

None

Is this bug present after an update or on a fresh install?

None

Are you using the Nextcloud Server Encryption module?

None

What user-backends are you using?

Configuration report

No response

List of activated Apps

Enabled:
  - activity: 2.18.0
  - bruteforcesettings: 2.6.0
  - circles: 26.0.0
  - cloud_federation_api: 1.9.0
  - comments: 1.16.0
  - dav: 1.25.0
  - federatedfilesharing: 1.16.0
  - federation: 1.16.0
  - files: 1.21.1
  - files_accesscontrol: 1.16.0
  - files_automatedtagging: 1.16.1
  - files_external: 1.18.0
  - files_retention: 1.15.0
  - files_rightclick: 1.5.0
  - files_sharing: 1.18.0
  - files_trashbin: 1.16.0
  - files_versions: 1.19.1
  - impersonate: 1.13.1
  - logreader: 2.11.0
  - lookup_server_connector: 1.14.0
  - oauth2: 1.14.0
  - photos: 2.2.0
  - privacy: 1.10.0
  - provisioning_api: 1.16.0
  - related_resources: 1.1.0-alpha1
  - serverinfo: 1.16.0
  - settings: 1.8.0
  - sharebymail: 1.16.0
  - support: 1.9.0
  - systemtags: 1.16.0
  - theming: 2.1.1
  - twofactor_backupcodes: 1.15.0
  - user_ldap: 1.16.0
  - viewer: 1.10.0
  - workflowengine: 2.8.0

Nextcloud Signing status

No response

Nextcloud Logs

No response

Additional info

No response

szaimen commented 1 year ago

Cc @icewind1991

XueSheng-GIT commented 1 year ago

Similar to https://github.com/nextcloud/server/issues/35776. For each shared file, another OR statement is required and search really gets slow. Maybe also realated: https://github.com/nextcloud/server/pull/37061

kesselb commented 1 year ago

Do you see a better execution time when you combine the OR (storage = x) query parts to OR (storage IN (x, y, z)) ?

XueSheng-GIT commented 1 year ago

@kesselb How could that be done? The search query is generated by nextcloud and includes (in my case) around thousand storages.

kesselb commented 1 year ago

@XueSheng-GIT you take one query from the slow query log, modify it manually and execute it ;) If we see a improvement we can think about a patch.

cra-taketomi-yo commented 1 year ago

The phenomenon that the CPU utilization rises when searching with the unified search feature has also been confirmed in version 25.0.6.

XueSheng-GIT commented 1 year ago

@kesselb maybe you want to have a second look at https://github.com/nextcloud/server/pull/37061 again. Search execution time did improve tremendously.

pjft commented 1 year ago

Can you test whether this helps?

https://github.com/nextcloud/server/issues/38243#issuecomment-1552714039