owncloud / core

:cloud: ownCloud web server core (Files, DAV, etc.)
https://owncloud.com
GNU Affero General Public License v3.0
8.38k stars 2.06k forks source link

Trashbin auto-expire causes PHP timeout on delete #13974

Closed pokoli closed 9 years ago

pokoli commented 9 years ago

I'm using the owncloud app to sync my local files with remote storage.

Steps to reproduce

  1. Sync some of your folders with a remote files
  2. Delete a buch of files in the local folder of owncloud directory. In my case I deleted ~900 files.

    Expected behaviour

The files should be deleted on the owncloud server without errors.

Actual behaviour

The owncloud server raises timeout errors when deleting the files. I had a timeout of 5 minutes, and modified php-fpm configuration to timeout on 10 and the problem still persits. I belive that 10 minutes to delete a single file is too much and something is wrong.

I can see some deleted files in the deleted files view in the web interface. For example is see that file Imatges/Whatsapp/IMG-20121002-WA0001.jpg is deleted, but as you can see in the log it raises a timeout on the client.

Server configuration

Operating system: Debian 7.8 Web server: Nginx + PHP-FPM Database: MariaDB 5.5.35 PHP version: PHP 5.4.36

ownCloud version: ownCloud 7.0.4 (stable) Updated from an older ownCloud or fresh install: Upgraded from an older version

List of activated apps: Activity, Calendar, Contacts, Deleted Files, Documents, First Run Wizard, PDF Viewer, Pictures, Share Files, Text Editor, Updated, Versions, Video Viewer.

The content of config/config.php:

$CONFIG = array (
  'instanceid' => 'ocb7caf11ec2',
  'passwordsalt' => 'xxxxxxxxxxxx',
  'trusted_domains' =>
  array (
    0 => 'xxxxxxxxxxxx.mydomain.com',
  ),  
  'datadirectory' => '/owncloud',
  'dbtype' => 'mysql',
  'version' => '7.0.4.2',
  'dbname' => 'owncloud',
  'dbhost' => 'localhost',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'owncloud',
  'dbpassword' => 'password',
  'installed' => true,
  'forcessl' => true,
  'theme' => '',
  'maintenance' => false,
  'share_folder' => '/Shared',
  'mail_from_address' => 'files',
  'mail_smtpmode' => 'php',
  'mail_domain' => 'mydomain.com',
  'secret' => 'xxxxxxxxxxxx',
);

Are you using external storage, if yes which one: No external storage

Are you using encryption: no

Client configuration

Browser: Firefox 35.0.1 Operating system: Linux Mint Rebeca (Ubuntu 14.04) Mirall version: 1.7.1

Logs

Web server error log

2015/02/09 11:33:32 [error] 20380#0: *1524466 upstream timed out (110: Connection timed out) while reading response header from upstream, client: ZZZ.ZZZ.ZZZ.ZZZ, server: xxx.mydomain.com, request: "DELETE /remote.php/webdav/Imatges/Whatsapp/IMG-20121002-WA0001.jpg HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "xxx.mydomain.com"

There a lot of entries like this.

And in mirall i see the following output:

11.44     Imatges/Whatsapp/IMG-20120929-WA0000.jpg                        ownCloud       504 Gateway Time-out          

ownCloud log (data/owncloud.log)

Nothing relevant on owncloud.log only errors related to mysql restart i have done manually.

Let me know if anything else is required to properly fix the bug.

pokoli commented 9 years ago

Some of the queries running on the mysql database:

MariaDB [owncloud]> show processlist;
+------+----------+-----------+----------+---------+------+-----------+----------------------------------------------------------------+----------+
| Id   | User     | Host      | db       | Command | Time | State     | Info                                                           | Progress |
+------+----------+-----------+----------+---------+------+-----------+----------------------------------------------------------------+----------+
|  475 | root     | localhost | owncloud | Query   |    0 | NULL      | show processlist                                               |    0.000 |
| 1469 | owncloud | localhost | owncloud | Sleep   |    0 |           | NULL                                                           |    0.000 |
| 1504 | owncloud | localhost | owncloud | Query   |    0 | query end | UPDATE `oc_filecache` SET `size`='0' WHERE `fileid` = '401205' |    0.000 |
| 1510 | owncloud | localhost | owncloud | Sleep   |    0 |           | NULL                                                           |    0.000 |
| 1589 | owncloud | localhost | owncloud | Sleep   |    0 |           | NULL                                                           |    0.000 |
+------+----------+-----------+----------+---------+------+-----------+----------------------------------------------------------------+----------+
5 rows in set (0.00 sec)

MariaDB [owncloud]> show processlist;
+------+----------+-----------+----------+---------+------+-----------+------------------------------------------------------------------------------------------------------+----------+
| Id   | User     | Host      | db       | Command | Time | State     | Info                                                                                                 | Progress |
+------+----------+-----------+----------+---------+------+-----------+------------------------------------------------------------------------------------------------------+----------+
|  475 | root     | localhost | owncloud | Query   |    0 | NULL      | show processlist                                                                                     |    0.000 |
| 1469 | owncloud | localhost | owncloud | Query   |    0 | NULL      | commit                                                                                               |    0.000 |
| 1504 | owncloud | localhost | owncloud | Query   |    0 | query end | UPDATE `oc_filecache` SET `size`='0' WHERE `fileid` = '401243'                                       |    0.000 |
| 1510 | owncloud | localhost | owncloud | Query   |    0 | NULL      | commit                                                                                               |    0.000 |
| 1589 | owncloud | localhost | owncloud | Query   |    0 | NULL      | SELECT `fileid` FROM `oc_filecache` WHERE `storage` = '1' AND `path_hash` = '398a62fde4e7ae45e399ccd |    0.000 |
+------+----------+-----------+----------+---------+------+-----------+------------------------------------------------------------------------------------------------------+----------+
5 rows in set (0.00 sec)

MariaDB [owncloud]> select count(1) from oc_filecache;
+----------+
| count(1) |
+----------+
|   278365 |
+----------+
1 row in set (0.27 sec)

Which it seems to have all the heavy load on the server side.

DeepDiver1975 commented 9 years ago

We improved delete operation with oc8. Please retest these says after release.

PVince81 commented 9 years ago

OC 8 is out, please retest :smile:

The problem is that files were copied to trashbin instead of moving. Now in OC 8 they are moved, so should be much faster.

pokoli commented 9 years ago

Exactly the same result with oc8

Here is the full server log of sync:

176.84.20.38 - - [09/Feb/2015:20:03:18 +0100] "PROPFIND /remote.php/webdav HTTP/1.1" 401 253 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - sergi [09/Feb/2015:20:03:18 +0100] "PROPFIND /remote.php/webdav HTTP/1.1" 207 39213 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - sergi [09/Feb/2015:20:03:19 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:03:23 +0100] "PROPFIND /remote.php/webdav/Imatges HTTP/1.1" 207 147956 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - sergi [09/Feb/2015:20:03:24 +0100] "PROPFIND /remote.php/webdav/Imatges/pokoli HTTP/1.1" 207 32616 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - sergi [09/Feb/2015:20:03:29 +0100] "PROPFIND /remote.php/webdav/Imatges/amics HTTP/1.1" 207 77967 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - sergi [09/Feb/2015:20:03:33 +0100] "PROPFIND /remote.php/webdav/Imatges/Whatsapp HTTP/1.1" 207 841628 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - sergi [09/Feb/2015:20:03:50 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:04:20 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:04:50 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:05:21 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - - [09/Feb/2015:20:05:46 +0100] "POST /index.php/heartbeat HTTP/1.1" 200 31 "https://files.koolpi.com/index.php/apps/files/?dir=%2F&view=trashbin" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0"
176.84.20.38 - sergi [09/Feb/2015:20:05:51 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:06:22 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:06:52 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:07:23 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:07:53 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:08:23 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:08:41 +0100] "DELETE /remote.php/webdav/Imatges/Whatsapp/IMG-20121010-WA0003.jpg HTTP/1.1" 504 182 "-" "Mozilla/5.0 (Linux) csyncoC/0.91.5 neon/0.30.0"
176.84.20.38 - - [09/Feb/2015:20:08:49 +0100] "POST /index.php/heartbeat HTTP/1.1" 200 31 "https://files.koolpi.com/index.php/settings/apps" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0"
176.84.20.38 - sergi [09/Feb/2015:20:08:54 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"
176.84.20.38 - sergi [09/Feb/2015:20:09:25 +0100] "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 525 "-" "Mozilla/5.0 (Linux) mirall/1.7.1"

Nevertheless the file file which caused the timeout appears as deleted on the deleted files directory.

PVince81 commented 9 years ago

I only see a single "DELETE" operation. Are you saying that this one failed too ? The issue I mentioned above was when there are lots of files which risk making PHP run into a timeout situation.

If in your case it happens also for single files, then it's another bug. :frowning: It looks like it's returning 504 which is "Gateway timeout". Very strange.

pokoli commented 9 years ago

All the DELETE operations fail with a timeout on the php backend (php-fpm), so nginx returns the 504 error. php-fpm is configured to kill the process after 10 minutes of running. What I'm saying is that there is something wrong because it lasts 10 minutes to delete a file.

In the other hand, the file which causes the timeout appears correctly on deleted files, but the requests didn't end correctly as it produced the timeout. So it seems that all the work is done correctly, but the requests never ends.

I did delete a lot of files on a previous version (i think it was 6.0) without any problems and the process was quite fast.

Let me know if you need any more info. If you give point to the source code on the delete part i may debug it further to provide more information.

PVince81 commented 9 years ago

I see encryption is disabled (which would have slowed down deletion due to share keys handling), so that is ruled out.

Was that file shared with many users ?

PVince81 commented 9 years ago

Random shot in the dark: did you at some point have symlinks in your "data" directory ? (because those could cause infinite loops)

pokoli commented 9 years ago

The file is not shared. And i checked and no i don't have symlinks.

root@myserver:/owncloud# find . -type l
root@myserver:/owncloud# 

/owncloud is my data path

PVince81 commented 9 years ago

Hmm... still clueless. Can you maybe try setting "loglevel" to 0 in the config and then run the deletion again ?

And then next try: disable the app "Deleted files" then delete a file. To rule out whether it's related to the trashbin app or something else.

pokoli commented 9 years ago

Log with trashbing:

{"reqId":"db6a575e3a2c0500171ca94e6fdf44eb","remoteAddr":"","app":"PHP","message":"Undefined index: REQUEST_URI at \/var\/www\/xxx.mydomain.com\/apps\/calendar\/appinfo\/app.php#41","level":3,"time":"2015-02-10T10:15:02+00:00"}
{"reqId":"525cd0d69bed8b7fee9da671ec4a2f51","remoteAddr":"176.84.20.38","app":"music","message":"delete - 21206","level":0,"time":"2015-02-10T10:29:53+00:00","method":"DELETE","url":"\/remote.php\/webdav\/Imatges\/Whatsapp\/IMG-20130423-WA0002.jpg"}
{"reqId":"525cd0d69bed8b7fee9da671ec4a2f51","remoteAddr":"176.84.20.38","app":"music","message":"removed entities - albums: [], artists: []","level":0,"time":"2015-02-10T10:29:53+00:00","method":"DELETE","url":"\/remote.php\/webdav\/Imatges\/Whatsapp\/IMG-20130423-WA0002.jpg"}
{"reqId":"be18bdb976f27614f360184d96d75cb9","remoteAddr":"","app":"PHP","message":"Undefined index: REQUEST_URI at \/var\/www\/xxx.mydomain.com\/apps\/calendar\/appinfo\/app.php#41","level":3,"time":"2015-02-10T10:30:03+00:00","method":"--","url":"--"}
{"reqId":"73ac1de7ecfb95ee5a06fc80f4da5edc","remoteAddr":"91.126.253.252","app":"webdav","message":"Exception: {\"Message\":\"No basic authentication headers were found\",\"Code\":0,\"Trace\":\"#0 \\\/var\\\/www\\\/xxx.mydomain.com\\\/lib\\\/private\\\/connector\\\/sabre\\\/auth.php(123): Sabre\\\\DAV\\\\Auth\\\\Backend\\\\AbstractBasic->authenticate(Object(OC_Connector_Sabre_Server), 'ownCloud')\\n#1 \\\/var\\\/www\\\/xxx.mydomain.com\\\/lib\\\/private\\\/connector\\\/sabre\\\/auth.php(103): OC_Connector_Sabre_Auth->auth(Object(OC_Connector_Sabre_Server), 'ownCloud')\\n#2 \\\/var\\\/www\\\/xxx.mydomain.com\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/Sabre\\\/DAV\\\/Auth\\\/Plugin.php(108): OC_Connector_Sabre_Auth->authenticate(Object(OC_Connector_Sabre_Server), 'ownCloud')\\n#3 [internal function]: Sabre\\\\DAV\\\\Auth\\\\Plugin->beforeMethod('PROPFIND', '')\\n#4 \\\/var\\\/www\\\/xxx.mydomain.com\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/Sabre\\\/DAV\\\/Server.php(433): call_user_func_array(Array, Array)\\n#5 \\\/var\\\/www\\\/xxx.mydomain.com\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/Sabre\\\/DAV\\\/Server.php(455): Sabre\\\\DAV\\\\Server->broadcastEvent('beforeMethod', Array)\\n#6 \\\/var\\\/www\\\/xxx.mydomain.com\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/Sabre\\\/DAV\\\/Server.php(214): Sabre\\\\DAV\\\\Server->invokeMethod('PROPFIND', '')\\n#7 \\\/var\\\/www\\\/xxx.mydomain.com\\\/apps\\\/files\\\/appinfo\\\/remote.php(61): Sabre\\\\DAV\\\\Server->exec()\\n#8 \\\/var\\\/www\\\/xxx.mydomain.com\\\/remote.php(54): require_once('\\\/var\\\/www\\\/files....')\\n#9 {main}\",\"File\":\"\\\/var\\\/www\\\/xxx.mydomain.com\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/Sabre\\\/DAV\\\/Auth\\\/Backend\\\/AbstractBasic.php\",\"Line\":73}","level":0,"time":"2015-02-10T10:30:27+00:00","method":"PROPFIND","url":"\/remote.php\/webdav"}

Worked well with trashbin disabled, so it's something related to trashbin.

PVince81 commented 9 years ago

Is your trashbin very full ?

Another theory for the slowness is that the trashbin tries to expire old files automatically. If there are many such files, it might be slow. If you have lots of files, you could try and manually empty the trashbin, then try deleting a file again.

CC @schiesbn

pokoli commented 9 years ago

Do you think this is very full?

MariaDB [owncloud]> select count(1) from oc_files_trash;
+----------+
| count(1) |
+----------+
|     5186 |
+----------+
1 row in set (0.00 sec)

I just selected all files in trashbin and deleted them, i will keep you update when the process ends, but i think it will last some time.

There is some method to empty the trashbin instead of selecting all and delete them?

I was thinking on creating some app to expire files on trashbin after xxx days. Do you think this is interesting for core?

PVince81 commented 9 years ago

There is already a request for a CLI tool to auto-clear the trashbin but I can't seem to find it

pokoli commented 9 years ago

So the timeout disapears when the trashbin is empty, so its related to the size of the trashbin. Maybe the issue it's #8109

PVince81 commented 9 years ago

Ok thanks for the info. We should look into improving trashbin expiration performance.

PVince81 commented 9 years ago

@schiesbn how about auto-expiring as a background job ?

DeepDiver1975 commented 9 years ago

@schiesbn how about auto-expiring as a background job ?

the way to go from my understanding - moving the trashbin's expiry hook into a background job should not be that complicated

DeepDiver1975 commented 9 years ago

the way to go from my understanding - moving the trashbin's expiry hook into a background job should not be that complicated

@icewind1991 we did talk about this today - let us know about your progress - THX

icewind1991 commented 9 years ago

Looking into added laravel's background queue system to handle this atm

shoeper commented 9 years ago

Try to increase fastcgi_read_timeout in your nginx conf. 504 is afaik no php but a gateway timeout. And while moving files to the trash bin ownCloud doesn't send any data that's why you get a 504. See: http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_read_timeout

DeepDiver1975 commented 9 years ago

trashbin expiry will be done in a background process starting 8.1 - see https://github.com/owncloud/core/pull/14644

PVince81 commented 9 years ago

There are three tasks to address trashbin slowness:

DeepDiver1975 commented 9 years ago

should be fixed for 8.1

PVince81 commented 9 years ago

"Make trashbin expiration faster for OC < 8.1" (backport) still needs to be reviewed: https://github.com/owncloud/core/pull/14926