nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
27.43k stars 4.08k forks

Very slow access (copy, ls, etc.) to files in folders with many files through WebDAV + proposed fix #8962

Closed Baltix closed 4 years ago

Baltix commented 6 years ago

Accessing files (copy, ls, etc.) in folders with thousands of files through WebDAV is very slow: 10-20 seconds or more for a simple ls command:

```
time ls /home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/RB00097504_1_Joico-Color-Endure-Conditioner-1000-ml.jpg
/home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/RB00097504_1_Joico-Color-Endure-Conditioner-1000-ml.jpg

real    0m18.519s
user    0m0.000s
sys     0m0.000s
```

(/home/sketis/docs.mini-maxi.lt/ is mounted via davfs (davfs2 package); the folder /home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/ contains ~9000 files of 100-600 KB each.)

I've tried lots of optimizations, including Redis, database buffer tuning, and so on, and almost nothing helps. But after 3 days of hard work I've found the main issue:

Every operation on any file in a folder with thousands of files over the WebDAV protocol generates thousands of queries (one or two per file in that folder):

```sql
SELECT * FROM "oc_properties" WHERE "userid" = $1 AND "propertypath" = $2 AND "propertyname" IN ($3)
```

I noticed this with the pg_activity tool. Each query takes about 2-3 milliseconds, but Nextcloud generates thousands of them for every operation :(

This is a bug, because I access (for example, copy) only one file, not all files in the folder. The operation takes twice as long when I try to access a file that doesn't exist:

```
time ls /home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/B00002
ls: cannot access '/home/sketis/docs.mini-maxi.lt/2_Nuotrauku_katalogas/imported/B00002': No such file or directory

real    0m45.274s
user    0m0.000s
sys     0m0.000s
```

I've activated the pg_stat_statements module in PostgreSQL and then found the main issue:

```sql
SELECT query, total_time, max_time, min_time, calls
FROM pg_stat_statements
ORDER BY calls DESC
FETCH FIRST 2 ROWS ONLY;
```

```
query                                                                                         | total_time | max_time | min_time | calls
----------------------------------------------------------------------------------------------+------------+----------+----------+-------
SELECT * FROM oc_properties WHERE "userid"=$1 AND "propertypath"=$2 AND "propertyname" in($3) |  107636.46 |     3.54 |     1.48 | 67532
```

67532 queries after just ~2 minutes! The oc_properties table contains ~12000 rows. I found that this table only holds information about the executable bit of each file, so it is not useful for the majority of users!
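
The pattern above is a classic N+1 query problem: one SELECT per file instead of one query per folder. As an illustration only (sqlite3 standing in for PostgreSQL, with a toy table mimicking oc_properties; none of this is Nextcloud's actual code), the difference looks like this:

```python
import sqlite3

# Toy stand-in for oc_properties; the schema and paths are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oc_properties "
    "(userid TEXT, propertypath TEXT, propertyname TEXT, propertyvalue TEXT)"
)
files = [f"files/user/folder/img_{i}.jpg" for i in range(9000)]
conn.executemany(
    "INSERT INTO oc_properties VALUES "
    "('user', ?, '{http://apache.org/dav/props/}executable', 'T')",
    [(p,) for p in files],
)

# N+1 pattern reported in this issue: one SELECT per file in the folder.
queries = 0
for path in files:
    conn.execute(
        "SELECT propertyvalue FROM oc_properties "
        "WHERE userid = ? AND propertypath = ? AND propertyname = ?",
        ("user", path, "{http://apache.org/dav/props/}executable"),
    ).fetchall()
    queries += 1
print(queries)  # 9000 round trips to list one directory

# Batched alternative: a single query fetching all properties under the folder.
rows = conn.execute(
    "SELECT propertypath, propertyvalue FROM oc_properties "
    "WHERE userid = ? AND propertypath LIKE ?",
    ("user", "files/user/folder/%"),
).fetchall()
print(len(rows))  # the same 9000 rows from one round trip
```

Even when each individual query takes only a few milliseconds, 9000 of them in a row dominate the wall-clock time, which matches the multi-second listings reported here.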

So there is one simple workaround: I increased query speed 10x by creating an index on the oc_properties table over both columns, userid and propertypath:

```sql
-- PostgreSQL syntax
CREATE INDEX properties_path_index ON oc_properties USING btree (userid, propertypath);
```

Please add this index to the Nextcloud database creation file.

But index creation only partially solves the performance issues with many files in one folder. It would be wise to add a setting such as "Ignore executable bit in WebDAV" to Nextcloud's config.php and skip querying the oc_properties table entirely when it is set to true.
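
To see why the composite index helps, here is a small sqlite3 sketch (sqlite standing in for PostgreSQL; it has no USING btree clause, and the exact EXPLAIN QUERY PLAN wording varies by version) showing the lookup switch from a full table scan to an index search:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oc_properties (userid TEXT, propertypath TEXT, propertyname TEXT)"
)

query = (
    "SELECT * FROM oc_properties "
    "WHERE userid = ? AND propertypath = ? AND propertyname IN (?)"
)

# Without the index the planner can only scan the whole table.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query, ("u", "p", "n")).fetchall()
print(plan_before[0][3])  # e.g. 'SCAN oc_properties'

# The same two-column index proposed in this issue (sqlite syntax).
conn.execute(
    "CREATE INDEX properties_path_index ON oc_properties (userid, propertypath)"
)

plan_after = conn.execute("EXPLAIN QUERY PLAN " + query, ("u", "p", "n")).fetchall()
print(plan_after[0][3])  # e.g. 'SEARCH ... USING INDEX properties_path_index'
```

With the index, each lookup becomes a B-tree search over (userid, propertypath) instead of a scan of the whole table, which is consistent with the roughly 10x per-query improvement reported above.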

Steps to reproduce

  1. Create a folder and upload 5000-9000 files as one user through WebDAV
  2. Mount that folder with davfs
  3. Set the executable bit on all files
  4. Run ls, cp, or rm on one or several files in the mounted folder (or use a file manager that supports WebDAV)

Expected behaviour

Running ls, cp, or rm on one or several files in the folder should take 1-2 seconds

Actual behaviour

Running ls, cp, or rm on one or several files in the folder takes 10-40 seconds

Server configuration

Operating system: Ubuntu 16.04.4 64-bit

Web server: apache (also tested with nginx - almost identical performance)

Database: PostgreSQL

PHP version: 7.0.28

Nextcloud version: 13.0.1 (also tested with all 12.x versions)

Updated from an older Nextcloud/ownCloud or fresh install: Updated from 12.x

Where did you install Nextcloud from: from nextcloud.org downloads

Nextcloud configuration:

Config report:

```
'ocy77dpvulpn',
'passwordsalt' => 'good_password',
'secret' => 'no_secrets',
'trusted_domains' => array (
  0 => '192.168.199.53',
),
'datadirectory' => '/srv/nextcloud-duomenys',
'overwrite.cli.url' => 'https://192.168.199.53/',
'htaccess.RewriteBase' => '/',
'dbtype' => 'pgsql',
'version' => '13.0.1.1',
'dbname' => 'nextcloud',
'dbhost' => 'localhost',
'dbport' => '',
'dbtableprefix' => 'oc_',
'dbuser' => 'nextuser',
'dbpassword' => 'db-password',
'installed' => true,
'mail_smtpmode' => 'smtp',
'mail_smtpauthtype' => 'LOGIN',
'mail_smtpsecure' => 'ssl',
'mail_from_address' => 'docs',
'mail_domain' => 'my.mail.lt',
'mail_smtpauth' => 1,
'mail_smtphost' => 'my.mail..lt',
'mail_smtpport' => '465',
'mail_smtpname' => 'docs@my.mail.lt',
'mail_smtppassword' => 'mail_password',
'mail_smtpdebug' => true,
'auth.bruteforce.protection.enabled' => false,
'maintenance' => false,
'theme' => '',
'loglevel' => 2,
'memcache.local' => '\\OC\\Memcache\\APCu',
'redis' => array (
  'host' => 'localhost',
  'port' => 6379,
),
);
```

Are you using external storage, if yes which one: local

Are you using encryption: no

Are you using an external user-backend, if yes which one: no

Client configuration

Operating system: Ubuntu 16.04 64-bit

MorrisJobke commented 6 years ago

cc @rullzer @icewind1991

MorrisJobke commented 6 years ago

> CREATE INDEX properties_path_index ON oc_properties USING btree (userid, propertypath); (this is PostgreSQL syntax)
>
> Please add this index in Nexcloud database creation file.

@nickvergessen Makes sense or not?

nickvergessen commented 6 years ago

Sounds familiar

MorrisJobke commented 6 years ago

@rullzer Another one where an index could help a lot

MorrisJobke commented 6 years ago

We had a look at this, and adding the index there is not that easy: it might work fine in Postgres, but MySQL has a limitation on index length in default setups. :/

So we will move this to 15 as there is not enough time left to look into this properly.
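
For context, the MySQL limitation referred to above is presumably InnoDB's index key length limit: 767 bytes per column with the older defaults, 3072 bytes with large index prefixes enabled. Since utf8mb4 uses up to 4 bytes per character, a long varchar path column cannot be fully indexed. A back-of-the-envelope sketch (the 4000-character column width is an assumption for illustration, not a quote of the Nextcloud schema):

```python
# Rough arithmetic behind InnoDB's index key length problem (illustrative).
BYTES_PER_CHAR_UTF8MB4 = 4
OLD_PREFIX_LIMIT = 767    # bytes, older InnoDB defaults
NEW_PREFIX_LIMIT = 3072   # bytes, with large index prefixes enabled

def max_indexable_chars(limit_bytes):
    """Widest utf8mb4 varchar (in characters) fully indexable under a byte limit."""
    return limit_bytes // BYTES_PER_CHAR_UTF8MB4

print(max_indexable_chars(OLD_PREFIX_LIMIT))  # 191
print(max_indexable_chars(NEW_PREFIX_LIMIT))  # 768

# An assumed 4000-character path column overflows both limits, so MySQL
# would need a prefix index, e.g. (userid, propertypath(250)), rather than
# the full-column index that works in PostgreSQL.
path_chars = 4000
print(path_chars * BYTES_PER_CHAR_UTF8MB4 > NEW_PREFIX_LIMIT)  # True
```

This is likely why the same one-line CREATE INDEX cannot simply be shipped for every supported database.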

artemanufrij commented 6 years ago

I have a similar setup on my Ubuntu 16.04 server (with MariaDB) and I also mounted WebDAV on my client. An ls command takes ~60-120 seconds; cp (10 files of ~300 KB each) takes more than 10 minutes.

srkunze commented 5 years ago

Same issue here.

I actually intend to use the personal nextcloud server as a backup storage. Unfortunately, it's very slow. :-/

Is there some update on this one?

grantbunyan commented 5 years ago

👍 for an update on this please. Transfers via WebDAV are incredibly slow: 5 minutes for 300 KB.

MarioPerini commented 5 years ago

Extremely slow WebDAV confirmed from my side as well

BrianLeishman commented 5 years ago

Something I've noticed is that it runs the getScanner method on the OC\Files\View class for every file, every time you do anything with the local storage. Even when I'm 3 or 4 folders deep, if I reload the page it scans every folder from the root down to where I am.

I've been trying to profile the folder loading (trying to figure out which function is the culprit), and it looks like if I'm at a location like this

/Local Storage Root/Brian Leishman/Pictures/Food

It then scans every file/folder in "Local Storage Root", then "Brian Leishman", "Pictures", and finally "Food". So when "Local Storage Root" has 11k folders in it, they all get scanned again at this point, which means that accessing that last folder, with one item in it, takes about a minute to load (even on a very, very fast machine with all the performance tuning enabled).

I'm trying to fix this myself for our use case, but I'm stuck figuring out whether the culprit is the third-party Sabre/DAV library or Nextcloud itself.

ghost commented 5 years ago

I have folders with 100k+ files, and even when I mount locally with davfs2 I get very poor performance. This is pretty much a deal-breaker for my installation. Any advice so far? I have Redis running and every possible tuning parameter I could find. I am running 16.0.1.

tessus commented 5 years ago

@ybaumy I'm afraid as long as enterprise customers do not complain, nothing will happen here. While this seems to be a core issue (one of the primary functions of Nextcloud), the priority seems disproportionally low. But every time I mention that priorities are off in this project, I get my head ripped off, so I'm just going to shut up and be quiet for another few years.

ghost commented 5 years ago

@tessus

are there any useful alternatives to nextcloud + webdav?

dimm0 commented 5 years ago

None with that many plugins and apps around.. AFAIK

mantri-govind commented 5 years ago

Same issue: a transfer of some 700 MB took around 30 minutes :( Does anyone have an alternative?

ghost commented 5 years ago

@mantri-govind I don't think there is much that can be done if you are looking for a WebDAV solution, besides waiting for the Nextcloud developers to optimize their PHP code, at least for listing directory contents. The poorly written PHP code plus WebDAV is a performance killer. Maybe someone who forked Nextcloud has addressed this issue; I wonder how enterprise customers can accept it. Leaving the community hanging is not great, but understandable. If you find a fork, please reply. I had to turn off encryption completely, and I am now accessing the files with other protocols.

mantri-govind commented 5 years ago

@ybaumy Sad. I actually found this issue also reported against the davfs2 package: https://bugs.launchpad.net/ubuntu/+source/davfs2/+bug/185986. As you mentioned, I'm not sure why they would leave an issue open for so long! On the encryption side, I don't have backend encryption set. Also, can you please let me know what other protocol can be used to mount a Nextcloud WebDAV share as a folder?

Right now I'm not planning to fork it; maybe I'll go with the slightly more expensive approach of an rsync plus occ files:scan --all kind of solution with cron :(

PS: If you copy archive files one by one, performance is quite manageable (a 700 MB zip took around 1 minute or so).

ExaconAT commented 5 years ago

Same problem: we need to mount the admin folder via davfs2 (fstab), and ls is extremely slow with 800,000 files/folders. Is there any solution? We have already tried so many tweaks.

I think it always searches recursively, which is pretty nonsensical if I run ls just to show the directories in the current working directory.

aquarat commented 5 years ago

I created indexes as per @Baltix's post and the speed improvement was just over 100x (1m48s to 1s), though this is over a network and involves caching. A locally mounted share now returns from ls after 400 ms. The database has ~700k files in oc_filecache.

Additionally, after the above, I edited apps/dav/lib/Connector/Sabre/CustomPropertiesBackend.php and nextcloud/apps/dav/lib/DAV/CustomPropertiesBackend.php, inserting return []; as the first line of the getProperties functions. This appears to be where the problematic query above originates.

I have also tried to tune the Postgres instance (docker-compose format):

```
command: -c 'shared_preload_libraries="pg_stat_statements"'
  -c 'track_activity_query_size=2048' -c 'pg_stat_statements.track=all'
  -c 'shared_buffers=8GB' -c 'effective_cache_size=24GB'
  -c 'maintenance_work_mem=2GB' -c 'checkpoint_completion_target=0.7'
  -c 'wal_buffers=16MB' -c 'default_statistics_target=100'
  -c 'random_page_cost=1.1' -c 'effective_io_concurrency=200'
  -c 'work_mem=10485kB' -c 'min_wal_size=1GB'
  -c 'max_wal_size=2GB' -c 'max_worker_processes=8'
  -c 'max_parallel_workers_per_gather=4' -c 'max_parallel_workers=8'
```

Hopefully this helps someone :)

tessus commented 5 years ago

@aquarat Unfortunately, the database changes only work on Postgres; people with a MySQL setup are not able to apply them. According to @MorrisJobke this was moved to 15, and now we are on 17 and the same problem still exists.

Changing the getProperties function will most likely break Nextcloud, so I doubt that this is a valid solution.

schniepp commented 5 years ago

Following up on @mantri-govind: Chances are that there is a problem in the linux davfs2 implementation. In the bug report that @mantri-govind mentions, there are two posts (#31 and #33) mentioning that the files in the same system can be accessed significantly faster using the webdav:// or dav:// protocols in KDE's konqueror and Nautilus, respectively. I can confirm this for Konqueror. It is unfortunate that nothing has been done on the davfs2 side to look into this. I am not qualified to do so.

mantri-govind commented 5 years ago

@schniepp I think the solution given by @aquarat might speed things up a bit. It should also work fine on a remote machine with Konqueror/Nautilus; the Nextcloud client itself uses WebDAV. The problem starts the moment we mount it as a share.

schniepp commented 5 years ago

@mantri-govind: Thanks for the feedback! My setup currently uses mysql. From what I read above, the solution will only work for postgresql, correct?

aquarat commented 5 years ago

@schniepp Modifying the code will work for any database but creating indexes is database-specific. Modifying the code isn't great because it'll get overwritten during an update. Creating indexes is the best option.

Also this speed improvement works using both davfs2 on Linux (Ubuntu) and Mac OS X Finder (the primary target of my efforts) - so it isn't a davfs2 issue.

@tessus So far everything is working well with the early empty return in getProperties. There is no indication of failure or malfunction. The system in question has over 20 users using it at any time during the day.

kesselb commented 5 years ago

Sorry for the off-topic question: how do you add entries to oc_properties? Is this only possible via davfs2? I tried to set the executable bit, but the Nextcloud client (for Linux) does not push anything. :see_no_evil:

schniepp commented 5 years ago

@aquarat: so what about @tessus's comment that inserting the return []; statement will break Nextcloud? Any idea how risky it is? I did a quick test and did not see huge improvements. I'm not sure whether a restart is needed, or whether the old version of the PHP file is cached somewhere. But mounting the same folder via sshfs instead of davfs was still at least an order of magnitude faster (same server, same network connection, same client). Just some very rough ballparking here.

bmorgenthaler commented 5 years ago

So I installed pghero on our Nextcloud system, and it had a number of suggestions for indexes to add (and remove).

For example, for the following query:

```sql
SELECT "fileid", "storage", "path", "path_hash", "parent", "name", "mimetype", "mimepart",
       "size", "mtime", "storage_mtime", "encrypted", "etag", "permissions", "checksum"
FROM "oc_filecache"
WHERE "storage" = $1 AND "path_hash" = $2
```

it suggested creating the following index:

```sql
CREATE INDEX CONCURRENTLY ON oc_filecache (path_hash)
```

It also listed a large number of duplicate indexes:

- On oc_appconfig, appconfig_appid_key (appid) is covered by oc_appconfig_pkey (appid, configkey)
- On oc_collres_accesscache, collres_user_coll (user_id, collection_id) is covered by collres_unique_user (user_id, collection_id, resource_type, resource_id)
- On oc_credentials, credentials_user (user) is covered by oc_credentials_pkey (user, identifier)

pghero also listed 50 unused indices that could be removed for faster writes to the database; fs_storage_mimetype and fs_storage_mimepart are two examples.

I haven't removed any of the unused indices because I didn't want to break anything, but I have removed the duplicate indexes and created the suggested ones. So far I've seen some improvement in the overall speed of Nextcloud when dealing with directories containing hundreds of small files (<1 MB). But this is all subjective; I haven't done a proper benchmark.

tessus commented 5 years ago

@bmorgenthaler Indexes can exist for a number of reasons, e.g. as a constraint to ensure uniqueness. Without knowledge of the app and why those indexes exist, it is hard to determine anything. As for duplicate indexes, it depends on the engine: some RDBMSs can use a subset of a compound index (when it forms a leading prefix), others cannot, and the same is true for column order. Maybe it's a side effect of having an abstraction layer that creates the same set of tables/indexes for every RDBMS supported by Nextcloud.
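
The compound-index subset behaviour mentioned above can be seen directly: an index on (a, b) typically serves a filter on the leading column a, but not a filter on only b. A quick sqlite3 illustration (table and index names are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT, c TEXT)")
conn.execute("CREATE INDEX t_a_b ON t (a, b)")

def plan(sql, params):
    # The last column of EXPLAIN QUERY PLAN output is the readable plan line.
    return conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()[0][3]

# Filtering on the leading column: the compound index is usable.
leading = plan("SELECT * FROM t WHERE a = ?", ("x",))
print(leading)   # e.g. 'SEARCH t USING INDEX t_a_b (a=?)'

# Filtering only on the trailing column: the index does not help here.
trailing = plan("SELECT * FROM t WHERE b = ?", ("x",))
print(trailing)  # e.g. 'SCAN t'
```

This is why an index like appconfig_appid_key (appid) can be redundant next to oc_appconfig_pkey (appid, configkey) on engines with this prefix behaviour, while the reverse column order would not be.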

sir-lebowski commented 4 years ago

Exactly this problem still in 17.0.1 with MySQL. Dav client, Cyberduck on MacOS. The queries are slow and they build up with each action in the client.

ghost commented 4 years ago

Guys. I just wanted to give you an update on my solution. I switched to minio for file storage. Since I never used much of the other nextcloud features, this switch did not really hurt. Video and voice is now jitsi at the moment at least. minio has no gui client, but I don't need that for synchronization. The speed is amazing and I cannot complain. If you are already familiar with AWS S3 storage then the switch will be easy.

ExaconAT commented 4 years ago

> Guys. I just wanted to give you an update on my solution. I switched to minio for file storage. Since I never used much of the other nextcloud features, this switch did not really hurt. Video and voice is now jitsi at the moment at least. minio has no gui client, but I don't need that for synchronization. The speed is amazing and I cannot complain. If you are already familiar with AWS S3 storage then the switch will be easy.

No solution for us.

mantri-govind commented 4 years ago

@ybaumy How did you manage to mount minio on your machine?

ghost commented 4 years ago

@mantri-govind Use s3fs. Create a file with your credentials:

/etc/s3creds with content of user:password

then mount with

```
s3fs <bucket> /s3 -o passwd_file=/etc/s3creds,use_path_request_style,url=https://miniohost.domain,allow_other,umask=0007,uid=1000
```

I have specified uid=1000 for the user that should have access to the files. More information can be found in the man page.

dbergeba commented 4 years ago

> Following up on @mantri-govind: Chances are that there is a problem in the linux davfs2 implementation. In the bug report that @mantri-govind mentions, there are two posts (#31 and #33) mentioning that the files in the same system can be accessed significantly faster using the webdav:// or dav:// protocols in KDE's konqueror and Nautilus, respectively. I can confirm this for Konqueror. It is unfortunate that nothing has been done on the davfs2 side to look into this. I am not qualified to do so.

I can 100% confirm that

dkocher commented 4 years ago

> Exactly this problem still in 17.0.1 with MySQL. Dav client, Cyberduck on MacOS. The queries are slow and they build up with each action in the client.

We have documented this in our wiki.

dimm0 commented 4 years ago

> Exactly this problem still in 17.0.1 with MySQL. Dav client, Cyberduck on MacOS. The queries are slow and they build up with each action in the client.
>
> We have documented this in our wiki.

Could the index be included in Nextcloud by default?

ezaton commented 4 years ago

A large-directory ls over WebDAV (davfs2) on Linux results in a separate query for each file, like this:

```sql
SELECT * FROM oc_properties
WHERE userid = 'yossi'
  AND propertypath = 'files/yossi/Desktop-PC/Documents/Samsung-Backup/Camera/IMG_0025.JPG'
  AND propertyname IN ('{DAV:}creationdate', '{http://apache.org/dav/props/}executable')
```

Now, I get about 5K of these queries just for a file listing with ls. There must be a better way of handling this data. I do not know what it is, but the current method is too slow.

Edit: The only (partial) workaround I have, which does not scale, is to increase the PHP max execution time. It was 60 seconds; I increased it to 300, and most directory listings worked fine, but then there was another one that did not, so I have just tried a 600-second timeout. This solution doesn't scale...

Edit 2: That did not solve the problem. BTW, the web interface is fast on the same directory: a 2-3 second wait, and that's it.

Edit 3: I updated davfs2 on Ubuntu (compiled 1.5.5), to no avail, and increased the davfs2 timeout from 300 to 600. It doesn't look bright at all. The DB query mentioned above seems to be performed for each file in this directory (~5300 of them) at a rate of about 15 queries per second, so I calculated that a result would be returned within ~5.5 minutes. However, the DB appears to run the same queries again for the same files!

Edit 4: It worked; it took 13.5 minutes to list 5332 files, and each DB query was run twice. I think the DB should have an entry for each directory (marked by a hash of its ${USER_ROOT}-relative path, or something), and this hash should be the key for all files inside that directory (and for other directories as well), each holding the WebDAV metadata. It should have been a very fast query, but it became a very slow one.
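
The per-directory idea in Edit 4 can be sketched in a few lines. This is a hypothetical design sketch, not Nextcloud code: properties are grouped under a hash of the parent directory's path, so listing a directory becomes one lookup instead of one query per file:

```python
import hashlib
import posixpath

# Hypothetical property store keyed by a hash of the parent directory path,
# as proposed in the comment above. Not Nextcloud's actual data model.
store = {}

def _dir_key(directory):
    return hashlib.sha256(directory.encode()).hexdigest()

def set_prop(path, name, value):
    key = _dir_key(posixpath.dirname(path))
    store.setdefault(key, {}).setdefault(posixpath.basename(path), {})[name] = value

def list_props(directory):
    """One lookup returns the properties of every file in the directory."""
    return store.get(_dir_key(directory), {})

for i in range(5332):
    set_prop(f"files/yossi/Camera/IMG_{i:04d}.JPG",
             "{http://apache.org/dav/props/}executable", "F")

# A single lookup instead of the 5332 per-file queries observed above.
props = list_props("files/yossi/Camera")
print(len(props))  # 5332
```

In SQL terms the equivalent would be one query filtered on the directory key rather than 5332 queries filtered on individual file paths.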

KekcuHa commented 4 years ago

I use both suggested workarounds:

  1. the additional MySQL index as documented in https://trac.cyberduck.io/wiki/help/en/howto/nextcloud#Slowlistingsforlargefolderstructures
  2. tweaks to the davfs2 config file:

```
use_locks    0
cache_size   8192
table_size   8192
dir_refresh  7200
file_refresh 3600
delay_upload 30
```

schniepp commented 4 years ago

@KekcuHa — what kind of improvements are you seeing? Someone above suggested that this only works for Postgres, with reservations about MySQL. This is way above my head, so I am hesitant to do this on my production system as I do not want to mess up my DB. If I add the index, could it be removed again without doing any damage?

KekcuHa commented 4 years ago

@schniepp I noticed a significant acceleration for directory browsing and file operations on remote resources mounted with davfs2. An index does not affect the data stored in the table; it just speeds up query time, so it can be safely removed.

ezaton commented 4 years ago

Wow! The indexes alone (MySQL) resulted in a huge change in performance and responsiveness! During rsync operations (the preferred backup method, since the Linux agent kills files with weird names) the server's CPU dropped from 1-1.5 cores (100-150% CPU) to about 3%, Apache's CPU went up (this is good: it means it is being used to transfer data), and the speed change is amazing. I really think this change should be incorporated into Nextcloud. An amazing change, at zero cost.

szaimen commented 4 years ago

> I really thing this change should be incorporated into NextCloud. Amazing change, at a zero cost.

One question remains: whether this helps on all databases, or at least whether it doesn't worsen the speed on the others...

schniepp commented 4 years ago

I just implemented the MySQL indices following @KekcuHa's recommendation, and I changed the Nextcloud code following @aquarat's suggestion. However, the KDE implementation of webdavs:// is still light-years faster when accessing the very same directories (same server, same client). I didn't test this quantitatively, but I'd say the difference is at least an order of magnitude.

I am not sure whether the changes made a difference. I restarted the MySQL and apache2 services; does anything else need to be restarted for these changes to take effect? Perhaps the recommended changes only make a big difference for directories containing many files and folders? In my case there are perhaps only 10-20 files and folders in a typical directory, and I still see significant delays.

hostingnuggets commented 4 years ago

If it is as simple as adding the following SQL index for MySQL:

```sql
CREATE INDEX properties_path_index ON oc_properties(userid, propertypath)
```

as explained here: https://trac.cyberduck.io/wiki/help/en/howto/nextcloud#Slowlistingsforlargefolderstructures

How many years do we need to wait for this to be implemented? This issue has been open for 2 years, and WebDAV in this state is unusable: super slow on the client side while using a lot of server resources for nothing.

mario commented 4 years ago

Fixed via https://github.com/nextcloud/server/pull/20716.

Sorry for the delay people!