owncloud / client

🖥️ Desktop Syncing Client for ownCloud
GNU General Public License v2.0
1.38k stars 667 forks source link

Batch WebDAV calls for DELETE #917

Open guruz opened 10 years ago

guruz commented 10 years ago

(This is more of a RFC right now..)

Whenever we need to do a DELETE (or MOVE or...) on multiple files in a directory, we need to send a request to the server for each file. The roundtrip for that kills performance, especially with slow file backends server side.

Microsoft specifies a BDELETE (batch delete) command http://msdn.microsoft.com/en-us/library/exchange/aa142716(v=exchg.65).aspx that allows to delete multiple files (in a single folder?) with one command.

We could implement this and use it to improve performance.

Things making this hard are probably the handling of error cases when some files were deleted, but some files were not. We can find out if the server supports it using the OPTIONS verb.

Any comments @ogoffart @evert @dragotin

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/904789-batch-webdav-calls-for-delete?utm_campaign=plugin&utm_content=tracker%2F216457&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F216457&utm_medium=issues&utm_source=github).
stephane-martin commented 10 years ago

If you want to move a directory and all files in it, WEBDAV supports operations on collections. If you want to move several files in a directory, AFAIK webdav standard doesnt have to custom tool to do that, some many requests are necessary. (Caldav has a multiget, but that doesnt help you!)

Thats why webdav servers in PHP suck so much, compared to Apache mod_dav for exemple : each request generates a PHP request, and that can easily make php management process crash or waiiiiiiit for quite a long time until children are available.

Regards, Stephane

On Tue, Aug 27, 2013 at 2:58 PM, Markus Goetz notifications@github.comwrote:

(This is more of a RFC right now..)

Whenever we need to do a DELETE (or MOVE or...) on multiple files in a directory, we need to send a request to the server for each file. The roundtrip for that kills performance, especially with slow file backends server side.

Microsoft specifies a BDELETE (batch delete) command http://msdn.microsoft.com/en-us/library/exchange/aa142716(v=exchg.65).aspxthat allows to delete multiple files (in a single folder?) with one command.

We could implement this and use it to improve performance.

Things making this hard are probably the handling of error cases when some files were deleted, but some files were not. We can find out if the server supports it using the OPTIONS verb.

Any comments @ogoffart https://github.com/ogoffart @everthttps://github.com/evert @dragotin https://github.com/dragotin

— Reply to this email directly or view it on GitHubhttps://github.com/owncloud/mirall/issues/917 .

guruz commented 10 years ago

That is exactly my point: If it is not about the whole collection but multiple files inside one collection, what Microsoft does it specificed the BDELETE. I think we could use that one too. Right now many requests are needed.

stephane-martin commented 10 years ago

I don't think SabreDAV implements that (see below). It can be done by subclassing Sabre_DAV_Tree. I can propose patch, but should we submit it to upstream SabreDAV project, or just keep it in owncloud source ?

Regards, Stephane

Sabre/DAV/Server.php :

public function invokeMethod($method, $uri) {

    $method = strtoupper($method);

    if (!$this->broadcastEvent('beforeMethod',array($method, $uri)))

return;

    // Make sure this is a HTTP method we support
    $internalMethods = array(
        'OPTIONS',
        'GET',
        'HEAD',
        'DELETE',
        'PROPFIND',
        'MKCOL',
        'PUT',
        'PROPPATCH',
        'COPY',
        'MOVE',
        'REPORT'
    );

On Wed, Aug 28, 2013 at 1:01 PM, Markus Goetz notifications@github.comwrote:

That is exactly my point: If it is not about the whole collection but multiple files inside one collection, what Microsoft does it specificed the BDELETE. I think we could use that one too. Right now many requests are needed.

— Reply to this email directly or view it on GitHubhttps://github.com/owncloud/mirall/issues/917#issuecomment-23406149 .

evert commented 10 years ago

I'm mostly curious why the number of HTTP requests would be a noticeable bottleneck for you. Even if it's hundreds, with HTTP pipelining this should go quite fast.. Are you using a crappy http client?

For bulk requests, the spec I'm mostly interested in, is this one:

https://github.com/evert/calendarserver-extensions/blob/master/calendarserver-bulk-change.txt

I think it's better than the microsoft implementation. But a plugin for the microsoft batch methods would still be kinda cool =)

stephane-martin commented 10 years ago

Id rather follow apple spec as it has more details.

Apple spec is extending caldav. But the behaviour about bulk requests is valid for plain webdav too. What should we do then?

Bottleneck is not http or http server. It's php. Each http request generates some work for a php child in sabredav. With bulk request, only one http request, so only one php child involved.

Im willing to work on that. It will need a bit of time though. We can expect fair improvement in sync client.

Regards, Stephane Le 28 août 2013 17:23, "evert" notifications@github.com a écrit :

I'm mostly curious why the number of HTTP requests would be a noticeable bottleneck for you. Even if it's hundreds, with HTTP pipelining this should go quite fast.. Are you using a crappy http client?

For bulk requests, the spec I'm mostly interested in, is this one:

https://github.com/evert/calendarserver-extensions/blob/master/calendarserver-bulk-change.txt

I think it's better than the microsoft implementation. But a plugin for the microsoft batch methods would still be kinda cool =)

— Reply to this email directly or view it on GitHubhttps://github.com/owncloud/mirall/issues/917#issuecomment-23423258 .

tanghus commented 10 years ago

I'm interested in hearing what you come up with here. I have sort of the same issue in owncloud/contacts#110 though not limited to *DAV requests.

MTRichards commented 10 years ago

@guruz Isn't this also something we have a problem with on another project? Does any of that work apply here, or do we also need to fix this here, and thus solve that one too?

ogoffart commented 10 years ago

Mirall 1.5 does only one call to DELETE when deleting a directory and all it's content. But it does not batch unrelated deletion in a single BDELETE. I don't even know if the server supports that.

guruz commented 9 years ago

Change of plans: Let's implement DELETE with QNAM (instead of current neon code), then we can parallelize the DELETE calls and don't need server side changes.

guruz commented 9 years ago

With 1.8 we have parallel MKCOL/DELETE(/MOVE)