Open piotr-sikora-v opened 2 years ago
you may want to open a pull request for this
I'am not good in Symfony so I don't think is good code ;) Maybe better is this:
private function getUsersToScan(): array {
$sql = 'SELECT MAX(`user_id`) FROM `oc_filecache` `f` FORCE INDEX(`fs_storage_size`)
INNER JOIN `oc_mounts` `m` ON `storage_id` = `storage` WHERE `size` < 0 GROUP BY `storage_id`
LIMIT'.self::USERS_PER_SESSION;
$em = $this->connection()->prepare($sql);
$stmt->execute();
return $stmt->fetchAll();
}
Did you file a pull request for this?
Hello, I can confirm the force index helps a lot, and I am also unable to translate that into Symfony's query builder.
This is on nextcloud 25.0.6 with MariaDB 10.3.38.
Here are the raw explain with timings :
MariaDB [cloud]> ANALYZE SELECT `user_id` FROM `oc_filecache` `f` INNER JOIN `oc_mounts` `m` ON `storage_id` = `storage` WHERE (`size` < 0) AND (`parent` > -1) LIMIT 1;
+------+-------------+-------+-------+-------------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+--------------------+------+-----------+----------+------------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+-------+-------+-------------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+--------------------+------+-----------+----------+------------+-------------+
| 1 | SIMPLE | m | index | mounts_storage_index,mount_user_storage | mount_user_storage | 266 | NULL | 2248 | 2300.00 | 100.00 | 100.00 | Using index |
| 1 | SIMPLE | f | ref | fs_storage_path_hash,fs_parent_name_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size,fs_size,fs_storage_path_prefix,fs_parent | fs_storage_path_prefix | 8 | cloud.m.storage_id | 65 | 162403.30 | 100.00 | 0.00 | Using where |
+------+-------------+-------+-------+-------------------------------------------------------------------------------------------------------------------------------------------+------------------------+---------+--------------------+------+-----------+----------+------------+-------------+
2 rows in set (8 min 0,984 sec)
MariaDB [cloud]> ANALYZE SELECT `user_id` FROM `oc_filecache` `f` FORCE INDEX(`fs_storage_size`) INNER JOIN `oc_mounts` `m` ON `storage_id` = `storage` WHERE (`size` < 0) AND (`parent` > -1) LIMIT 1;
+------+-------------+-------+------+-----------------------------------------+--------------------+---------+-----------------+--------+-----------+----------+------------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | r_rows | filtered | r_filtered | Extra |
+------+-------------+-------+------+-----------------------------------------+--------------------+---------+-----------------+--------+-----------+----------+------------+-------------+
| 1 | SIMPLE | f | ALL | fs_storage_size | NULL | NULL | NULL | 538819 | 604780.00 | 100.00 | 17.15 | Using where |
| 1 | SIMPLE | m | ref | mounts_storage_index,mount_user_storage | mount_user_storage | 8 | cloud.f.storage | 2 | 0.00 | 100.00 | 100.00 | Using index |
+------+-------------+-------+------+-----------------------------------------+--------------------+---------+-----------------+--------+-----------+----------+------------+-------------+
2 rows in set (0,299 sec)
Those on this thread might find the discussion in #24401 of interest as well.
I found something that really speed up files:scan-app-data cron
function getUsersToScan in ./lib/BackgroundJob/ScanFiles.php generarate this SQL:
but it use fs_size index on oc_filecache, and I think is wrong
Better will work to use fs_storege_size index
For my instance it took about 10 minutes on fs_size index, but on fs_storage_size it takes 2-3 seconds!
So my suggestion is change function to: