@icewind1991 just another case for profiling webdav - please have a look - THX
Note that 1.8 has recursive PROPFIND disabled, this issue only affects 1.7.x
Sabre also creates the XML Reply in memory, maybe this could be streamed to the client instead..
@evert ?? THX
Congrats on finding the exact reason I didn't want this feature in the first place, and it's now disabled by default ;). It basically enabled a DDOS attack fairly easily. Do a few of these PROPFINDs in parallel, and the entire server grinds to a halt.
However, streaming xml is coming. This requires two things:
1 -> Completion and integration of the 'xml-rewrite2' branch. This is a project I've been working on, on and off, for about 2 years now, and it's actually getting pretty close to integration.
Pull request here: https://github.com/fruux/sabre-dav/pull/602
This spawned a separate project sabre/xml which I'm also working on getting to a full release:
Most of the time spent creating this response does not go into writing XML; the bigger issue is that the entire WebDAV tree needs to be traversed on the server. Most of the work is in fetching every node, checking access control, etc.
To properly enable streaming support, we need to be able to get the results as they come in and create the xml on the fly.
Doing this is scheduled after a next major sabre/dav release, as PHP 5.5 is required for full generator support, and I don't want to do a half-baked version of this with Iterators first.
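To make the idea concrete, here is a minimal sketch of what generator-based streaming could look like, assuming a hypothetical fetchChildren() helper and node objects; this is not the actual sabre/dav API, just an illustration of writing the multistatus response as results come in:

```php
<?php
// Hypothetical sketch: stream the multistatus response instead of building it in memory.
// fetchChildren(), isCollection() and getPath() are assumed placeholders, not real sabre/dav calls.

function walkTree($path) {
    foreach (fetchChildren($path) as $node) {
        yield $node;                                  // hand the node back as soon as we have it
        if ($node->isCollection()) {
            foreach (walkTree($node->getPath()) as $child) {
                yield $child;                         // recurse (PHP 5.5: no "yield from" yet)
            }
        }
    }
}

$xml = new XMLWriter();
$xml->openURI('php://output');                        // write straight to the client
$xml->startElementNs('d', 'multistatus', 'DAV:');
foreach (walkTree('/') as $node) {
    $xml->startElementNs('d', 'response', null);
    $xml->writeElementNs('d', 'href', null, $node->getPath());
    // ... propstat / property elements would go here ...
    $xml->endElement();
    $xml->flush();                                    // first bytes leave the server early
}
$xml->endElement();
$xml->flush();
```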
However, none of this will really affect the total time spent (7:20) as the exact same processing is still happening. The difference is that there's lower memory usage and the first byte returns a lot quicker.
To reduce the total time spent, you should still do some proper profiling as @DeepDiver1975 said.
Why is recursive PROPFIND needed in the first place? Is this mostly for the initial sync?
Let me think: Once everything is synced, a PROPFIND on the root should be enough to compare etags. If the etag is different, we then need to do a PROPFIND on that folder to get its contents and see which etag changed there.
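For reference, that per-folder discovery step is just a Depth: 1 PROPFIND asking for the etag of each entry; a minimal example against the same endpoint as the testcase below (the folder path is a placeholder):

```
curl -X PROPFIND https://user:pwd@server/owncloud/remote.php/webdav/somefolder \
  -H "Depth: 1" -H "Content-Type: text/xml" \
  --data '<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:">
  <d:prop><d:getetag/></d:prop>
</d:propfind>'
```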
Only if you talk to broken servers. An ETag shouldn't say anything about the contents of a collection, so you shouldn't rely on this behavior if you are trying to build a standard client.
Instead, you can use the ctag (or better yet) use the sync-collection.
Can we use our existing etags as ctags? :wink: (Whatever that is, I need to look it up.)
Here was a result from blackfire.io I did when testing with Sabre 2.1, it was a PROPFIND with depth 1 on a folder with 69 entries: https://blackfire.io/profiles/6f6c0033-da1e-43d0-b886-b5eeeb62286c/graph
Ctag is also non-standard, but it's well defined:
The problem with using the ETag is that it does have a specific definition, and that definition is incompatible with how you are using the etag.
So while this ctag was meant for calendars, it makes sense for any collection.
However, sync-tokens are even better. Instead of doing a full PROPFIND every time, you literally ask for all changes since the last sync-token you received which can make this incredibly fast.
Description: The CS:getctag property allows clients to quickly
determine if the contents of a calendar or scheduling Inbox or
Outbox collection have changed since the last time a
"synchronization" operation was done. The CS:getctag property
value MUST change each time the contents of the calendar or
scheduling Inbox or Outbox collection change, and each change MUST
result in a value that is different from any other used with that
collection URI.
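For illustration, a client could poll the ctag of a collection with a Depth: 0 PROPFIND like this (the folder path is a placeholder; the cs: prefix maps to the http://calendarserver.org/ns/ namespace the property is defined in, which is also why it is not the same property as DAV:getetag):

```
curl -X PROPFIND https://user:pwd@server/owncloud/remote.php/webdav/somefolder \
  -H "Depth: 0" -H "Content-Type: text/xml" \
  --data '<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:" xmlns:cs="http://calendarserver.org/ns/">
  <d:prop><cs:getctag/></d:prop>
</d:propfind>'
```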
That sounds like our etags. When the collection content changes (or any subcollection), the etag will change.
So would it just be a matter of renaming "getetag" to "getctag"?
"All changes since the last sync" means the server needs to remember the state it returned to every sync client so it can diff against it, which cannot be done with the current architecture/data model. It was suggested by @icewind1991 here: https://github.com/owncloud/core/issues/4936 (speed up by keeping an etag history)
Implementing this:
http://tools.ietf.org/html/rfc6578
Will basically solve any performance problems you have today of this kind. It will greatly reduce memory usage and cpu usage on both server and client. You literally just get a list of deletions and additions back.
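To make that concrete, a sketch of what such a sync-collection REPORT request could look like (the folder path and the sync-token value are placeholders; the server replies with only the added/changed/removed members plus a fresh token):

```
curl -X REPORT https://user:pwd@server/owncloud/remote.php/webdav/somefolder \
  -H "Content-Type: text/xml" \
  --data '<?xml version="1.0"?>
<d:sync-collection xmlns:d="DAV:">
  <d:sync-token>http://example.org/ns/sync/1234</d:sync-token>
  <d:sync-level>1</d:sync-level>
  <d:prop><d:getetag/></d:prop>
</d:sync-collection>'
```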
So would it just be a matter of renaming "getetag" to "getctag"?
It's also in a different namespace.
"All changes since the last sync" means the server needs to remember the state it returned to every sync client so it can diff against it, which cannot be done with the current architecture/data model. It was suggested by @icewind1991 here: #4936 (speed up by keeping an etag history)
That sounds very similar indeed. What you need from a database perspective:
It's pretty lightweight
So does it mean that for every file change (rename, delete, upload, download, even outside of WebDAV) we need to create a new entry there? Or should these be grouped?
"the path of the file that was changed": the path can change if the file was moved. So possibly need more info about moved stuff.
It's flexible, you can either do it per file, or per group of changes.
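To illustrate the "lightweight" part, a hypothetical change-log table (names are made up, not an actual ownCloud schema) could look like the sketch below; the auto-incrementing id doubles as the sync-token handed to clients:

```sql
-- Hypothetical change log: one row per change (or per group of changes).
CREATE TABLE changes_log (
    sync_id   BIGINT AUTO_INCREMENT PRIMARY KEY, -- monotonically increasing; used as sync-token
    path      VARCHAR(4000) NOT NULL,            -- path of the changed node
    old_path  VARCHAR(4000) NULL,                -- set for moves/renames
    operation TINYINT NOT NULL                   -- e.g. 0 = added/modified, 1 = deleted, 2 = moved
);

-- "Everything since token X" then becomes a single range query:
-- SELECT path, old_path, operation FROM changes_log WHERE sync_id > X ORDER BY sync_id;
```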
I saw a note about remote storage in one of the tickets as well. I can see that that's a little harder. But even there I assume that you must have some way to figure out whether a tree or subtree has changed within remote storage, even today, right?
So the change for that would be that you now have to figure out what has changed in a remote storage tree, and log those entries as well. You would only have to do this when the sync-token is requested, and it gives you room for future optimization. A big difference for remote storage is that the logic to determine changes now happens on the server, not the client.
Plus down the road you get the option to handle changes in remote storage asynchronously.
Unfortunately the external storage stuff relies mostly on recursive scanning (no subtree magic). It checks the mtime of remote folders and compares with the local cache. If it has changed, it goes deeper. It can be slow.
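Roughly, the scan described above works like this (hypothetical helper names, not the actual scanner code):

```php
<?php
// Rough sketch of the mtime-based external storage scan described above.
// $storage and $cache are placeholders, not the actual ownCloud scanner classes.
function scanFolder($storage, $cache, $path) {
    $remoteMtime = $storage->filemtime($path);
    $cached      = $cache->get($path);
    if ($cached !== null && $cached['mtime'] === $remoteMtime) {
        return; // folder looks unchanged: skip the whole subtree
    }
    foreach ($storage->listFolder($path) as $entry) {
        // ... update the cache entry for $entry here ...
        if ($entry['type'] === 'dir') {
            scanFolder($storage, $cache, $path . '/' . $entry['name']); // go deeper
        }
    }
    $cache->update($path, array('mtime' => $remoteMtime)); // remember the new folder mtime
}
```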
I suppose it should be possible to write this as a separate experimental app/Sabre plugin (if time permits) without being too intrusive.
I figured that was a possibility.. but then at least you have the option to do this processing server-side, and you don't have to let clients do this. I assume that you already need to do something similar to this today?
Sounds like a big rewrite :smile:
I uploaded 1600 files to my local server (on master, with the new Sabre 2.1) and ran a PROPFIND with Depth: infinity through Blackfire. Here is the result: https://blackfire.io/profiles/488be46f-7890-4be2-8e1b-7098961a572e/graph
There aren't many folders, just this:
data/root/files
data/root/files/lots
data/root/files/lots/of
data/root/files/lots/of/files
data/root/files/lots/of/files/500
data/root/files/lots/of/files/1000
data/root/files/lots/of/files/100
The numbers are the number of files inside the respective folders.
Result doesn't look too bad.
Note: this was an "allprops" call. I'll maybe redo another one next week that also requests the OC-specific props.
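For reference, such a follow-up test would request the ownCloud-specific properties on top of the plain DAV ones; something along these lines (property names in the http://owncloud.org/ns namespace as used by the sync client, to the best of my knowledge — treat the exact list as an assumption):

```
curl -X PROPFIND https://user:pwd@server/owncloud/remote.php/webdav \
  -H "Depth: 1" -H "Content-Type: text/xml" \
  --data '<?xml version="1.0"?>
<d:propfind xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
  <d:prop>
    <d:getetag/>
    <d:getlastmodified/>
    <oc:id/>
    <oc:permissions/>
    <oc:size/>
  </d:prop>
</d:propfind>'
```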
@danimo did your items contain shared files/folders?
I hope it got better with 8.1 :smile:
I remember @jturcotte did some tests, but not sure if they were about recursive PROPFIND or whether they involved shared folders.
I've set Depth to 1 in all requests: https://github.com/owncloud/administration/blob/master/performance-tests-c%2B%2B/main.cpp#L69
And the test data was only 1 directory deep, so that wouldn't stress it so much either.
Any new data from 8.1.3? :smile:
let's move this on -> 9.0
This would be SO nice! Here I have about 12 GB and several thousand files and directories, and sync (through thousands of PROPFINDs) is taking hours... :(
Since the sync client is not using recursive PROPFIND anymore and the code was removed from the client (since it was still libneon based instead of Qt QNetworkAccessManager) I'm closing this.
I just had a look at the MySQL log for a single, non-recursive PROPFIND operation: it issues a series of SELECTs plus INSERTs into `oc_file_locks`. Can this be correct? In particular, are the write operations on `oc_file_locks` needed for a purely reading PROPFIND?
When doing a PROPFIND, you are reading a list of entries in a folder. A concurrent process could make a change to the folder listing while you are reading it (e.g. add/update/remove entries). So the folder currently being listed is locked with a shared lock to avoid modifications to it during the read operation (including modifications to etags/sizes, etc.). Failure to do so could cause unpredictable results, like missed entries.
@icewind1991 correct me if I'm wrong
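A simplified sketch of that pattern (assuming a $lockingProvider instance and omitting error handling); presumably the INSERTs into oc_file_locks seen above are the database-backed locking provider persisting exactly these short-lived shared locks:

```php
<?php
// Simplified sketch of the shared-lock pattern described above.
use OCP\Lock\ILockingProvider;

$lockingProvider->acquireLock('files/path/to/folder', ILockingProvider::LOCK_SHARED);
try {
    // Read the folder listing, etags, sizes, ... Writers need an exclusive
    // lock, which cannot be obtained while shared locks are held, so the
    // listing cannot change under us.
    $children = $folder->getChildren();
} finally {
    $lockingProvider->releaseLock('files/path/to/folder', ILockingProvider::LOCK_SHARED);
}
```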
The only thing that comes to my mind that could go wrong is that a directory size might not equal the sum of its contained items (in case the directory size is somehow obtained first, then a file is changed in parallel, and the contained files are "propfound" afterwards; but that would be an odd implementation). In this case the result is inconsistent in itself; however, a PROPFIND should never "see" and report an inconsistent DB state, as this must be ensured by the write operations using appropriate locking.
My apologies, I used version 8.2.0. Now using 8.2.2, performance is much better (~factor 100). Probably due to the fact that there are no UPDATEs or INSERTs and only about half the number of SELECTs.
Re-opening as per @hodyroff @michaelstingl @ogoffart @jturcotte @SergioBertolinSG
… adding @mrow4a to the discussion about performance…
Some data points:
An infinity PROPFIND XML reply (around 2900 directories, around 20,000 files) is 12 MB uncompressed.
It currently downloads in 25 sec using curl.
I didn't check the server memory usage (no memory / XML streaming operations yet).
Found this PR about PROPFIND streaming: https://github.com/fruux/sabre-dav/pull/898
When implementing this, please also try to gzip the response stream.
Right now in our recommended web server configs we disable gzip because some web servers mess with ETags then. But for the text-based PROPFIND it still makes sense.
Right now in our recommended web server configs we disable gzip because some web servers mess with ETags then
We have been using `OC-ETag` since mirall 1.7.0, so I think we can ignore this issue now. (Are we still supporting such clients?)
But I wonder about the resuming when downloading files.
Support as in "support contract and active fixing" = latest version only.
Support as in "should work, and if not we consider it a bug" = 2.0 with 10.0; 1.8 with 9.1, IMHO.
bug with Infinity: https://github.com/owncloud/core/issues/28341
This issue has been automatically closed.
Testcase
curl -XPROPFIND https://user:pwd@server/owncloud/remote.php/webdav -H "Depth:infinity"
Actual results
On a well-equipped x86_64 machine it takes 7:20 minutes under heavy server load to list 5279 items (dirs/files).
Expected results
This should actually be fairly cheap, since all required information is available from the DB. The suspicion is that we could attain some serious speedup by investigating code paths and DB queries and optimizing them.
Motivation
Cheap recursive PROPFINDs would allow the client to move away from directory-by-directory discovery, which in turn would significantly speed up syncs, especially all sync runs until the initial sync has completed.