owncloud / core

:cloud: ownCloud web server core (Files, DAV, etc.)
https://owncloud.com
GNU Affero General Public License v3.0

Enhancement: file hash creation and file scanning #2560

Closed mmattel closed 11 years ago

mmattel commented 11 years ago

Ubuntu 12.04 LTS, oC from GitHub master (core, apps, 3rdparty) from 23.03, ~16:00.

oC installed from scratch with MySQL; oC and MySQL are on an NFS mount point. I deleted the current oC install, cleared the db and db oC_user, and down/uploaded the files.

Client: W7/Opera

Two folders are mounted from Ubuntu via smb:
one music folder containing ~12,500 files
one picture folder containing ~17,000 files

Those two folders are added via the External Local Storage App as local drives.

MySQL, NFS and CIFS have been tuned. To tune PHP, I installed and configured fcgid and APC.

I have multiple terminal sessions open to monitor owncloud.log, apache-error.log, mysql.log, htop, etc.

Problem 1

When you add the drive via the app, SQL statements are issued to prepare oC, but nothing more happens. You have to leave the administration screen and click on, e.g., Files to initiate the next action in oC, which is creating hashes for the files found in the mapped drive. With many files this takes a long time, yet there is no disk access reading the files, only db queries. You can watch the SQL statements coming in and can often read them.

During this time you can click into the directory and see files, but clicking on, e.g., Gallery shows no folder containing pictures. Only once hashing is done for the mapped drive does Gallery show some content for the first time, and only after a browser refresh. Before that it stays empty. This suggests that Gallery is broken while hashing runs.

Memory and CPU usage are also low, with no swapping: CPU peaks at 25% for php (on a 4-core machine) and total memory used is less than 310 MB (see screenshot).

In addition, when you log off while hashing is running, hashing stops. In the current implementation this is inherent, as the user is part of the db entry. The consequence is that hashing does not finish and has to restart or continue on the next user logon.
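To illustrate what "continue on next user logon" could look like, here is a minimal sketch of a scan that records each processed path in a database so that an interrupted run resumes where it stopped. It is not ownCloud's implementation: the table name `scanned`, the functions `open_index` and `scan`, and the choice to hash the path string with SHA-256 are all assumptions made for the example.

```python
import hashlib
import os
import sqlite3


def open_index(db_path):
    """Open (or create) the hypothetical progress table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scanned ("
        "path TEXT PRIMARY KEY, path_hash TEXT NOT NULL)"
    )
    return conn


def scan(root, conn, max_files=None):
    """Hash every not-yet-recorded file path under root.

    Returns the number of paths newly recorded in this run. max_files
    simulates an interruption (for example a logoff) after N files.
    """
    done = 0
    for dirpath, dirs, names in os.walk(root):
        dirs.sort()  # deterministic traversal order
        for name in sorted(names):
            path = os.path.join(dirpath, name)
            seen = conn.execute(
                "SELECT 1 FROM scanned WHERE path = ?", (path,)
            ).fetchone()
            if seen:
                continue  # recorded by an earlier, interrupted run
            # Assumption: the hash is over the path string, not the content.
            digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
            conn.execute(
                "INSERT INTO scanned (path, path_hash) VALUES (?, ?)",
                (path, digest),
            )
            conn.commit()  # commit per file so progress survives a stop
            done += 1
            if max_files is not None and done >= max_files:
                return done
    return done
```

Calling `scan(root, conn, max_files=2)` and then `scan(root, conn)` processes the remaining files only; committing per file trades speed for the ability to stop at any point without losing progress.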

Problem 2

Similar to problem 1, but now, once hashing is done and you click on Music, oC starts to create db entries for the music files found. This is not related to the Music app but to the oC backend.

First, when you log off, scanning stops; not immediately, but after a while. Pressing “Pause” also has no effect, or sometimes takes effect only after a long delay. This means you have to stay logged on and do nothing until everything is scanned. I made several attempts with fresh blank installs and scanning was not finished even overnight, although I have a well-performing environment (see below)…

Second, during scanning, files are downloaded to be analyzed and db data is written. Here I see a huge performance problem. My backend can read and write at high rates, but only a fraction is used. Music is on an smb share mounted on Ubuntu and can be read at 60+ MB/s. Monitoring the backend, I see at most 1/10 of the possible throughput.

Third, scanning finished and stopped without any error or complaint. No music data was shown, and even if I rescan, I end up with nothing. I have also seen rollbacks in the mysql log, especially when I logged off. Maybe this is all due to the large number of music files to be processed, and with only a few files it does not come up.
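To put a number on the "1/10 of the possible throughput" observation, it helps to measure how fast the mounted folder can be read sequentially without oC involved. The following is a rough sketch; the function names `read_throughput` and `mb_per_s` are made up for this example, and a repeated run may be served from the page cache and therefore look faster than the network share really is.

```python
import os
import time


def read_throughput(root, chunk_size=1024 * 1024):
    """Read every file under root once; return (files, bytes, seconds)."""
    total_bytes = 0
    files = 0
    start = time.perf_counter()
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total_bytes += len(chunk)
                files += 1
            except OSError:
                continue  # skip unreadable files instead of aborting
    elapsed = time.perf_counter() - start
    return files, total_bytes, elapsed


def mb_per_s(total_bytes, seconds):
    """Decimal megabytes per second, the unit dd reports."""
    return total_bytes / seconds / 1_000_000 if seconds > 0 else 0.0
```

Running `read_throughput` on the smb-mounted music folder and comparing `mb_per_s(...)` with the rate seen while oC scans would show how much of the gap comes from the scanner rather than the storage.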

IO test to local disk / storage:

sudo time sh -c "dd if=/dev/zero of=testfile bs=19k count=5k && sync"

smb: write: bs=19k: 15 MB/s, bs=109k: 35 MB/s; read: bs=19k: 62 MB/s, bs=109k: 70 MB/s
nfs: write: 59 MB/s, very little difference between read and write
local: write: 109 MB/s

(Note: with this data, I can do close to full smb reads and full nfs writes concurrently to saturate my GbE wire… which I have tested!)
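For reference, the size of the test file written by the dd command with bs=19k and count=5k can be worked out directly, since dd's `k` suffix means 1024. The sketch below does that arithmetic and derives how long the write takes at a given rate; the helper names `dd_bytes` and `seconds_at` are made up for this example.

```python
KIB = 1024  # dd's "k" suffix is a multiplier of 1024


def dd_bytes(bs_k, count_k):
    """Bytes written by dd with bs=<bs_k>k and count=<count_k>k."""
    return (bs_k * KIB) * (count_k * KIB)


def seconds_at(total_bytes, rate_mb_per_s):
    """Transfer time at a rate in decimal MB/s (the unit dd reports)."""
    return total_bytes / (rate_mb_per_s * 1_000_000)


total = dd_bytes(19, 5)
print(total)                            # 99614720 bytes, roughly 100 MB
print(round(seconds_at(total, 15), 1))  # 6.6 s at the 15 MB/s smb write rate
```

A file of about 100 MB is small enough that caching can influence the result, which is presumably why the command ends with `sync`.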

mysql test (sysbench, oltp r/w):

transactions: 100001 (809.56 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 1900019 (15381.62 per sec.)
other operations: 200002 (1619.12 per sec.)

Suggestions:

Question: how is rescanning in case of a source file/folder change (deletion, added, update) made/initiated?
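This does not answer how oC itself does it, but one common way to detect such source changes is to compare two snapshots of modification time and size per file. The sketch below illustrates that general technique only; `snapshot` and `diff` are names chosen for the example and are not oC functions.

```python
import os


def snapshot(root):
    """Map each relative file path under root to (mtime_ns, size)."""
    state = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(path, root)] = (st.st_mtime_ns, st.st_size)
    return state


def diff(old, new):
    """Return sorted lists of (added, deleted, updated) relative paths."""
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    updated = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, deleted, updated
```

Only the paths returned by `diff` would need to be rescanned. The limitation of this approach is that a change which preserves both size and modification time goes unnoticed.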

[Screenshot: htop 2]

BernhardPosselt commented 11 years ago

@icewind1991

mmattel commented 11 years ago

Correction to Problem 1: (...but there is no disk access reading the files, only db queries.)

Files are accessed, and there are db selects/inserts as well. The two can be clearly distinguished because the DB is on NFS and the files are on CIFS.

This means the files are accessed twice: once by hashing and once by music.

jancborchardt commented 11 years ago

I’m closing this issue because it has been inactive for a few months. This probably means that the issue is not reproducible or it has been fixed in a newer version.

Please reopen if the error still persists with the latest stable version (currently ownCloud 5.0.9) and then please use the issue template. You can also contribute directly by providing a patch – see the developer manual. :)

Thank you!