Closed XueSheng-GIT closed 1 month ago
We have 600k lines on oc_filecache table and file search is almost unusable, taking more than 30 seconds to show results. On top of that, Nextcloud unified search is looking for exact words, in the same order as user input which is not very efficient.
So, we have decided to write our own search form, outside of Nextcloud (added to the app menu using external sites app, as we are not able to make a real Nextcloud App), but using the same database (mariadb here). We've added some extra search functionalities :
Search is based on the user profile, so only the files shared to, or owned by the connected user appear on the results.
Surprisingly, our search engine is significantly faster than the official nextcloud one. It takes about 2 seconds to display results. There is certainly a lot of possible optimizations to do.
Feel free to comment, use, adapt this bunch of code as we are not going to make an app or whatever with it. If it may helps someone...
<?php
/*
Nextcloud alternative search engine offering :
- search by file id
- search by user name
- search between 2 dates
- search all the input terms in various order
- search is accent insensitive
- export results in CSV
search results are based on the user in $_SESSION["username"] variable
you have to write your own user authentication to make it works...
*/
session_start();
// function returning the icon corresponding to a mime type
function getImgExtension($fileName, $mime)
{
$file = explode('.', $fileName);
switch ($file[sizeof($file) - 1]) {
case 'doc':
return 'icons/doc.png';
case 'docx':
return 'icons/doc.png';
case 'jpg':
return 'icons/jpg.png';
case 'jpeg':
return 'icons/jpg.png';
case 'mp3':
return 'icons/mp3.png';
case 'pdf':
return 'icons/pdf.png';
case 'png':
return 'icons/png.png';
case 'ppt':
return 'icons/ppt.png';
case 'pptx':
return 'icons/ppt.png';
case 'csv':
return 'icons/csv.png';
case 'csv':
return 'icons/csv.png';
case 'csvx':
return 'icons/csv.png';
case 'zip':
return 'icons/zip.png';
case '7z':
return 'icons/zip.png';
case 'eml':
return 'icons/mail.png';
case 'rar':
return 'icons/rar.png';
case 'km':
return 'icons/mind.png';
default:
if ($mime == 2) {
return 'icons/folder.png';
} else {
return 'icons/unknown.png';
}
}
}
// Get rid of the accented letters
function stripAccents($str) {
return strtr(utf8_decode($str), utf8_decode("àáâãäçèéêëìíîïñòóôõöùúûüýÿÀÁÂÃÄÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝ'"), "aaaaaceeeeiiiinooooouuuuyyAAAAACEEEEIIIINOOOOOUUUUY ");
}
$bdd = mysqli_connect("host", "user", "password", "database");
if (!$bdd) {
die('Connection error (' . mysqli_connect_errno() . ') '
. mysqli_connect_error());
}
if (isset($_POST['documentName']) || isset($_POST['documentOwnerUsername'])) {
// Constructing the query used to retrieve all visible mount points to the user, excluding trash_bin
$sql = "SELECT `storage_id`, `root_id`, `user_id`, `mount_point`, `mount_id`, `f`.`path`, `mount_provider_class`,id FROM `oc_mounts` `m` INNER JOIN `oc_filecache` `f` ON `m`.`root_id` = `f`.`fileid` WHERE `user_id` = '" . $_SESSION["username"] . "' and path not like '%trashbin%' order by path asc";
$mounts = mysqli_query($bdd, $sql);
$paths = "(1=0";
$storage="";
while ($mount = mysqli_fetch_array($mounts)) {
if ($storage != $mount["storage_id"] || strpos($mount["path"], $path) === false) {
$paths .= " OR (path like '" . addslashes($mount["path"]) . "%' and storage=" . $mount["storage_id"] . " and oc_mounts.id=" . $mount["id"] . ")\n";
$storage = $mount["storage_id"];
$path = $mount["path"];
}
}
$paths .= ")";
// Constructing the query used to retrieve all files matching the search query (all terms must be present, in any order)
$criteres = "((1=1";
$debug = false;
if ($_POST['documentName'] != "") {
$termes = explode(" ", stripAccents(strtolower($_POST['documentName'])));
foreach ($termes as $terme) {
// lots of chained sql "replace" here because of normalization of filename
$criteres .= " and (concat(' ',replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(lower(name),'é','e'),'è','e'),'ë','e'),'ê','e'),'ô','o'),'à','a'),'ï','i'),'î','i'),'â','a'),'ù','u'),'\'',' '),'.',' '),' ') like '% " . addslashes($terme) . " %'\n";
// eventually removing trailing 's' to search term without plural
if (substr($terme, -1) == "s") {
$criteres .= " or concat(' ',replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(lower(name),'é','e'),'è','e'),'ë','e'),'ê','e'),'ô','o'),'à','a'),'ï','i'),'î','i'),'â','a'),'ù','u'),'\'',' '),'.',' '),' ') like '% " . addslashes(substr($terme, 0, -1)) . " %')\n";
} else {
// replacing trailing "aux" by "al", as plural of most word finishing by "al" is "aux" in french (cheval => chevaux)
if (substr($terme, -3) == "aux") {
$criteres .= " or concat(' ',replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(lower(name),'é','e'),'è','e'),'ë','e'),'ê','e'),'ô','o'),'à','a'),'ï','i'),'î','i'),'â','a'),'ù','u'),'\'',' '),'.',' '),' ') like '% " . addslashes(substr($terme, 0, -3)) . "al %')\n";
}
// adding a trailing 's' to search plural too
else {
$criteres .= " or concat(' ',replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(lower(name),'é','e'),'è','e'),'ë','e'),'ê','e'),'ô','o'),'à','a'),'ï','i'),'î','i'),'â','a'),'ù','u'),'\'',' '),'.',' '),' ') like '% " . addslashes($terme) . "_ %')\n";
}
}
}
}
// searching by fileid
$criteres .= ") or fileid='" . addslashes($_POST["documentName"]) . "')";
// base search query, excluding file versions and trashbin
$sqlid = "select fileid from oc_filecache where " . $criteres . " and path not like 'files_versions%' and path not like 'files_trashbin%'";;
// if dates have been specified, search only files modified between these dates
if ($_POST["date"] != "") {
$splDate = explode("&", $_POST['date']);
$dStart = DateTime::createFromFormat('d-m-Y', $splDate[0]);
$dEnd = DateTime::createFromFormat('d-m-Y', $splDate[1]);
$sqlid .= " AND mtime BETWEEN " . $dStart->getTimestamp() . " AND " . $dEnd->getTimestamp();
}
$resid = mysqli_query($bdd, $sqlid);
$listeid = "-1";
// Constructing the list of matching id, regardless of the user right access for the moment
while ($id = mysqli_fetch_array($resid)) {
$listeid .= "," . $id["fileid"];
}
// Filtering results with visible mount points
$sql = "select fileid,name,mtime,ldap_dn,mimetype,mount_point,path,oc_storages.id as proprio,parent from oc_filecache,oc_storages,oc_ldap_user_mapping,oc_mounts
where storage=numeric_id and (owncloud_name=substring_index(oc_storages.id,'::',-1) or (storage = 192 and owncloud_name='adminsi')) and oc_mounts.user_id='" . $_SESSION["username"] . "' and
" . $paths . " AND fileid in (" . $listeid . ")";
// If a username has been specified, filtering by username
if ($_POST["documentOwner"] != "") {
$sql .= " AND owncloud_name='" . $_POST["documentOwner"] . "' ";
}
// ordering results by modification time
$sql .= " order by mtime +0 desc";
$res = mysqli_query($bdd, $sql);
$nbdoc = mysqli_num_rows($res);
}
?>
<!doctype html>
<html lang="fr">
<head>
<!-- Required meta tags -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- Bootstrap CSS -->
<link href="bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="transition.css" rel="stylesheet">
<link rel="stylesheet" type="text/css" href="https://cdn.jsdelivr.net/npm/daterangepicker/daterangepicker.css" />
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@tarekraafat/autocomplete.js@10.2.7/dist/css/autoComplete.min.css">
<title>Moteur de recherche cloud PMA</title>
</head>
<body id="body">
<?php
if (isset($_SESSION['username'])) { ?>
<div class="container">
<div class="row">
<a href="logout.php" style="text-align: right; color: grey;">Disconnect</a>
</div>
<div class="row">
<h1>Advanced search</h1>
</div>
<form method="POST" autocomplete="off">
<div class="row mt-2">
<div class="row">
<div class="col mt-3">
<label for="documentName" class="form-label" id="documentNameLabel">Filename or file id</label>
<input type="text" class="form-control" id="documentName" placeholder="space separated terms" name="documentName" value="<?php echo (isset($_POST['documentName']) ? $_POST['documentName'] : '') ?>">
</div>
<div class="col mt-3">
<label for="documentOwner" class="form-label">Owner</label>
<input type="text" name="documentOwner" id="documentOwner" class="form-control">
<span id="search_result"></span>
</div>
<div class="col m-3">
<div class="row">
<div class="col">
<label for="date" class="form-label">Modification time</label>
<div id="reportrange" style="background: #fff; cursor: pointer; padding: 5px 10px; border: 1px solid #ccc; width: 100%">
<i class="fa fa-calendar"></i>
<span></span> <i class="fa fa-caret-down"></i>
</div>
<input type="hidden" value="" name="date" id="date">
</div>
</div>
</div>
</div>
<div class="row">
<div class="col">
<input class="btn btn-primary" type="submit" value="Rechercher">
</div>
<div class="col-10"></div>
</div>
</form>
<div class="row mt-4">
<hr style="color:#932D30">
</div>
<?php if (isset($nbdoc)) { ?>
<div class="row mt-3">
<?= $nbdoc ?> result(s)
</div>
<div class="row mt-5 align-center">
<div class="col">
<?php
$_SESSION["csv"] = "CID;Filename;Owner;Modification time;Path\n";
// We use a LDAP backend, if you don't, you should modify this
foreach ($res as $r) {
$owner = explode(",", $r['ldap_dn'], 2);
$owner = str_replace("cn=", "", $owner[0]);
?>
<div class="card mt-2 mb-2">
<a href="<?= 'https://yourcloud.com/index.php/f/' . $r['fileid'] ?>" target="_blank">
<div class="card-body">
<h5 class="card-title"><img src="<?= getImgExtension($r['name'], $r['mimetype']) ?>" alt="" width="30" height="30"> <?= $r['name'] ?></h5>
<p class="card-text"><span style="color: 808080;">
<?php
// Computing the file path for this user when possible
$tchemin = explode("/", $r["path"]);
$chemin = "";
if (strpos($r["path"], "groupfolders") !== false) {
for ($i = 2; $i < sizeof($tchemin) - 1; $i++) {
$chemin .= $tchemin[$i] . "/";
}
} else {
$correspondance = false;
for ($i = 1; $i < sizeof($tchemin) - 1; $i++) {
$chemin .= $tchemin[$i] . "/";
if ($p[1] != $_SESSION["username"] && $tchemin[$i] == substr($r['mount_point'], strlen($_SESSION["username"]) + 8, -1)) {
$chemin = "";
$correspondance = true;
}
}
}
// If the shared file have been renamed, we cannot find the real path
if ($chemin == "") $chemin = "/";
if ($r["name"] == substr($r['mount_point'], strlen($_SESSION["username"]) + 8, -1)) {
$dans = " - <a href=\"https://yourcloud.com/index.php/apps/files/\">Root folder</a>";
$danscsv = "Root folder";
} else {
$dans = " - <a href=\"https://yourcloud.com/index.php/f/" . $r['parent'] . "\" target=\"_blank\">In the folder " . substr($r['mount_point'], strlen($_SESSION["username"]) + 8) . $chemin . "</a>";
$danscsv = "In the folder " . substr($r['mount_point'], strlen($_SESSION["username"]) + 8) . $chemin;
}
if ($p[1] != $_SESSION["username"] && !$correspondance && strpos($r["path"], "groupfolders") === false) {
$dans = " - <a href=\"https://yourcloud.com/index.php/f/" . $r['parent'] . "\" target=\"_blank\">In the folder " . substr($r['mount_point'], strlen($_SESSION["username"]) + 8) . ".../" . $tchemin[sizeof($tchemin) - 2] . "/</a>";
$danscsv = "In the folder " . substr($r['mount_point'], strlen($_SESSION["username"]) + 8) . ".../" . $tchemin[sizeof($tchemin) - 2] . "/";
}
echo ($owner == "admin") ? "Group folder" : mb_strtoupper($owner) ?> - Last modification : <?= date('d/m/Y H:i', $r['mtime']); ?> <?= $dans ?></span> </p>
</div>
</a>
</div>
<?php
$_SESSION["csv"] .= $r['fileid'] . ";" . $r['name'] . ";" . (($owner == "admin") ? "Group folder" : mb_strtoupper($owner)) . ";" . date('d/m/Y H:i', $r['mtime']) . ";" . $danscsv . "\n";
}
// you may need to write this script
echo ("<a href=\"csv.php\">Click here to get this list in CSV format</a>");
?>
</div>
</div>
<?php } ?>
</div>
</div>
<?php } else {
// You should write here your own login form
$_SESSION["username"]="sample_user";
} ?>
<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/moment.min.js"></script>
<script type="text/javascript" src="js/daterangepicker.min.js"></script>
<script type="text/javascript" src="js/autoComplete.min.js"></script>
<script type="text/javascript">
$(function() {
// double date picker
var start = moment().subtract(365, 'days');
var end = moment();
function cb(start, end) {
$('#reportrange span').html(start.format('DD/MM/YYYY') + ' - ' + end.format('DD/MM/YYYY'));
document.getElementById('date').value = start.format('DD-MM-YYYY') + '&' + end.format('DD-MM-YYYY');
}
$('#reportrange').daterangepicker({
startDate: start,
endDate: end,
ranges: {
"Today": [moment(), moment()],
'Last 30 days': [moment().subtract(29, 'days'), moment()],
'This month': [moment().startOf('month'), moment().endOf('month')],
'Last month': [moment().subtract(1, 'month').startOf('month'), moment().subtract(1, 'month').endOf('month')],
'3 last months': [moment().subtract(3, 'month'), moment()],
'This year': [moment().startOf('year'), moment().endOf('year')],
'Last year': [moment().subtract(365, 'days'), moment()],
'Whenever': [moment().subtract(100, 'years'), moment()]
}
}, cb);
cb(start, end);
});
</script>
</div>
</body>
</html>
Here is a search example with terms "numérique collectivité" performed in 2 seconds. You can see in the results list some filenames where terms are not in the same order than in the user input. You can see also that a filename containing "collectivités" with a trailing 's' is retrieved as well :
With nextcloud official unified search, these files are not retrieved, and it takes about a minute to finish...
Thanks a lot @Mer0me for sharing your investigations and your approach to bypass the issue. Search is really unusable if the users share a lot of files. Is there any specific reason why you are not interested in "fixing" the issue of unified search itself?
Thank you for your comment.
We were able to write a quick and dirty search engine to address our particular needs, but I'm not sure (read : I'm sure of the opposite) this approach is scalable and good enough to be used as a Nextcloud unified search replacement.
I'm glad to help the Nextcloud community to find solutions but :
But if this contribution can help to find why Nextcloud search engine is so slow, and if I can personally help to improve it, I will certainly do.
@Mer0me I’m very happy that you have commented about current issues of search and have done something about it. Of another thread with users asking for similar things, see #29614.
Agree with you on various points such as specifying search terms in any order, searching multiple fields, near instant results, etc. I think one should also be able to sort such by relevance, file name, date, etc. and have search re-sort near instantly.
Perhaps you wish to look into MySQL FTS. Such offers exactly what you wish for. Plus diacritic insensitive search so there is no need to remove accents, diacritics, or manage such. I think the main reason for such not being added yet is perhaps the files portion of Nextcloud has been somewhat neglected as more features are added to increase market share, profit, and merely keeping the business open, plus likely responding to competitors features by adding similar. Perhaps too that since Nextcloud supports multiple databases, such would have to be added for all DBs: SQLite, MySQL/MariaDB, and PostgreSQL. All offer such full-text search, not to be confused with FTS plugins and searching text contents, yet the code for each is different and perhaps no one on the team has done such before, is interested, or considers such important enough. My impression is that perhaps Nextcloud has no one on their team with decent enough database experience.
See ownCloud OCIS for an example of decent search implementation. It shows up in a full window, etc. It’s as if no one in the Nextcloud team uses file search on their own desktop, being able to search an entire file system with instant results, and wants the same on Nextcloud. I store 100+ GB of files and searching now really is what do I even say, for how long is it going to remain so poor? :)
https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html https://dev.mysql.com/doc/refman/8.0/en/fulltext-natural-language.html
Unified search of the last versions of Nextcloud (since 25.0 maybe) is now very quick. We always need to search terms in the same order, but the results are near instantly displayed, so it makes sense again.
Unified search of the last versions of Nextcloud (since 25.0 maybe) is now very quick. We always need to search terms in the same order, but the results are near instantly displayed, so it makes sense again.
Perhaps their way to make it seem quick is merely to show the first 5 results, and then require the user to select 'Load more results' each time to see another five.
Doesn't anyone have tens of thousands of files or more, and with a particular search string may potentially have dozens, hundreds, or more search results, and wants to see them all at once? Imagine Google, Amazon, your own file system, your contacts list on computer or phone, if all of them acted in the same way, showing five results, and then requiring you to select more each and everytime. Really doesn't anyone see the madness?
I'm currently testing https://github.com/nextcloud/server/pull/37061. This does really speed up things. I've posted my webdav-search results over there. Also for unified search the results are amazing. On my test instance, search timed out (>60s, gateway time-out). After applying the patch, search takes less than 2 seconds! I also did some tests on my productive system with an speed up from 42s to less than 2s.
I would appreciate it if people can also test https://github.com/nextcloud/server/pull/40555, it should achieve similar effects to https://github.com/nextcloud/server/pull/37061
@icewind1991 thanks for your efforts!
I did a quick test on one of my instances (updated to NC 27.1.1) and created 1,000 empty text files and shared them via talk. https://github.com/nextcloud/server/pull/40555 was not applied 100% because SearchBuilderTest.php doesn't seem to exist on 27.1.1.
Patch | Search time |
---|---|
Default (no patch): | 37,594 ms |
PR https://github.com/nextcloud/server/pull/37061 | 447 ms |
PR https://github.com/nextcloud/server/pull/40555 | 887 ms |
https://github.com/nextcloud/server/pull/37061 is finally faster in this scenario, but https://github.com/nextcloud/server/pull/40555 is still a major improvement compared to the default installation (42 times faster).
UPDATE: Forgot to add the logs to the search times above... query pattern_no patch (NC2711).log query pattern_PR37061.log query pattern_PR40555.log
@XueSheng-GIT thanks for the testing.
The resulting query isn't quite what I expected (I would expect to see path IN (...)
instead of path_hash IN (...)
) which could explain the difference in performance between the two patches.
Can you try applying the latest commit I pushed to the PR and see if that makes a difference
@icewind1991 I'm just only on my mobile. Thus, a bit limited with testing. But here you go with the updated PR (I did a new run for all variants):
Patch | Search time |
---|---|
Default (no patch): | 38,295 ms |
PR https://github.com/nextcloud/server/pull/37061 | 543 ms |
PR https://github.com/nextcloud/server/pull/40555 (updated) | 537 ms |
Seems to be a quite good speedup for the updated PR https://github.com/nextcloud/server/pull/40555
Logs to the search times above... query pattern_no-patch-2.log query pattern_PR37061-2.log query pattern_PR40555-2.log
Does the query now look like intended?
Yes, this looks as expected. Thanks again for the testing
@icewind1991 I did some further testing on one of my productive systems (NC27.1.1) which was always slow on search before using https://github.com/nextcloud/server/pull/37061 and wasn't able to notice any search speedup using https://github.com/nextcloud/server/pull/40555 (same patch version as used for previous test).
Patch | Search time |
---|---|
Default (no patch): | 33,260 ms |
PR https://github.com/nextcloud/server/pull/37061 | 467 ms |
PR https://github.com/nextcloud/server/pull/40555 (updated) | 30,047 ms |
Logs to the search times above... query pattern_no-patch.log query pattern_PR37061.log query pattern_PR40555.log
Any idea why there's no speedup in this case?
@icewind1991 Any idea why search is slow in my latest test https://github.com/nextcloud/server/issues/35776#issuecomment-1732223031?
Unfortunately on NC28 the patches provided in https://github.com/nextcloud/server/pull/37061 or https://github.com/nextcloud/server/pull/40555 are not compatible anymore. Default search of NC28 is still slow if a lot of shares are present.
Whereas I used https://github.com/nextcloud/server/pull/37061 in production because https://github.com/nextcloud/server/pull/40555 was still slow in some cases (see comment above https://github.com/nextcloud/server/issues/35776#issuecomment-1732223031)
@icewind1991 @starypatyk any plans to update the query optimization for NC28?
@XueSheng-GIT - I put my PR on hold in favor of #40555. Now, I do not know what to do next, as #40555 did not progress since September. :disappointed:
Now that https://github.com/nextcloud/server/pull/40555 was merged into master, I did some further testing on NC28.0.3rc2. Long story short, as already observed above (https://github.com/nextcloud/server/issues/35776#issuecomment-1732223031), https://github.com/nextcloud/server/pull/40555 does not improve things on my production instance. Whereas https://github.com/nextcloud/server/pull/37061 does.
Patch | Search time |
---|---|
Default (no patch): | 33,689 ms |
PR https://github.com/nextcloud/server/pull/37061 (updated to NC28) | 520 ms |
PR https://github.com/nextcloud/server/pull/40555 | 38,547 ms |
Logs to the search times above... query pattern_nc28_no-patch.log query pattern_nc28_PR37061.log query pattern_nc28_PR40555.log
@icewind1991 Thanks a lot for taking care of this matter and merging https://github.com/nextcloud/server/pull/40555 into master! I'm not really into detail about the approach of https://github.com/nextcloud/server/pull/40555, but on the first view https://github.com/nextcloud/server/pull/37061 does include the following pattern which may be the reason for the difference:
(("storage" = $25) AND ("path_hash" IN ($26, $27, ...)))
CC @starypatyk because it was his approach. Maybe he has some additional idea why https://github.com/nextcloud/server/pull/37061 is so much faster than https://github.com/nextcloud/server/pull/40555.
@XueSheng-GIT Thanks for your tests. :+1:
Indeed the query should be optimized as you describe, but for some reason this does not happen.
@icewind1991 I created a few simple tests that show the issue - please see my branch https://github.com/nextcloud/server/commits/query-optimizer-search-issue/. I am not sure, if I should create a PR from this branch, as it contains failing tests only. Feel free to use these tests, if you think they are valuable.
One of the problems is shown by a pair of tests: testComplexSearchPattern1
and testComplexSearchPattern2
. The query condition in the first one gets optimized correctly. The condition in the second one is logically equivalent, but the first clause storage eq 1
is simplified - it is not wrapped in a (redundant) AND
operator. This one is not optimized as expected.
Apparently the code in https://github.com/nextcloud/server/blob/34c9bfc767e1bafa7c708318e4050a5f4c9a772b/lib/private/Files/Search/QueryOptimizer/MergeDistributiveOperations.php#L23 is not prepared for such cases.
Two additional tests mimic the query condition created in the QuerySearchHelper::applySearchConstraints
method.
The testApplySearchConstraints1
method tries to behave exactly like the QuerySearchHelper
code.
In the second one testApplySearchConstraints2
I tried to wrap the storage eq 2
clause in an AND
operator, to make all elements of the higher-level OR
homogenous. Unfortunately this did not help - the query is still not optimized as expected.
Thanks for the testing, I'll try to look into things further.
https://github.com/nextcloud/server/pull/43975 fixes those tests
@icewind1991 Thanks for following up! Just did some further testing and your additional pull https://github.com/nextcloud/server/pull/43975 seems to do the trick on my instances!
For the sake of completeness, here are the new results on NC28.0.3.
Patch | Search time |
---|---|
Default (no patch): | 35,697 ms |
PR https://github.com/nextcloud/server/pull/37061 (updated to NC28) | 345 ms |
PR https://github.com/nextcloud/server/pull/40555 + https://github.com/nextcloud/server/pull/43975 | 359 ms |
Logs to the search times above... query pattern_nc28_no-patch.log query pattern_nc28_PR37061.log query pattern_nc28_PR40555+43975.log
I would say nothing to complain anymore. Seems we can close this issue once https://github.com/nextcloud/server/pull/43975 is merged. A backport to NC28 would be welcome (although I'm already used to the manual patches 😉).
Thanks again @starypatyk and @icewind1991 for your great work!
Closing since it's been merged for awhile now:
Seems we can close this issue once #43975 is merged.
⚠️ This issue respects the following points: ⚠️
Bug description
Global/Unified search for files and tags is slow if a lot of shares are present. Search result for fulltextsearch, collectives, talk, deck and mails popup nearly instantly, but result for files and tags takes ages until they appear (approx. 30 seconds). Tested accounts have approx. 600 shares (according to
oc_shares
share_with
column. Talk is used a lot for sharing photos, which is probably the main reason for the amount of shares.Postgres log shows slow query... see below.
Doing the same on a cloned server instance without these shares (shares removed), the search results for files and tags appear within 2 seconds.
Just want to reference https://github.com/nextcloud/server/issues/23835 which really improved things in regards to search speed. Unfortunately it seems this issue is not fixed if a lot of shares are present.
Steps to reproduce
Expected behavior
Search result for files and tags should appear as fast as possible, even if a lot of shares are received.
Installation method
Community Manual installation with Archive
Operating system
Debian/Ubuntu
PHP engine version
PHP 8.1
Web server
Apache (supported)
Database engine version
PostgreSQL
Is this bug present after an update or on a fresh install?
Updated to a major version (ex. 22.2.3 to 23.0.1)
Are you using the Nextcloud Server Encryption module?
None
What user-backends are you using?
Configuration report
List of activated Apps
Nextcloud Signing status
Nextcloud Logs
No response
Additional info
Postgres log shows slow search query for files and tags: slow_search.log