nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0
26.91k stars 4.02k forks source link

Add expire mechanism for preview images #45962

Open alx-tuilmenau opened 3 months ago

alx-tuilmenau commented 3 months ago

How to use GitHub

Is your feature request related to a problem? Please describe. There are a lot of issues here targeting the grow of the preview folder in appdata.. I would suggest a mechanism to add a expire date for it.

Describe the solution you'd like

Describe alternatives you've considered At the moment, I use an ugly external script to delete images, oc_filecache entries and empty folders of preview images older then 12 months.

rdlab-upc commented 3 months ago

@alx-tuilmenau I totally agree with you. IMHO, there is a possible optimization for your proposal, instead of deleting all previews "pass due" at once, delete groups of 50/100/300 images every cron execution in order to decrease filesystem/DB stress.

Eventually, DB+filesystem will be "clean" without collateral issues. Most NC instances will have hundreds of users and thousands of previews...

alx-tuilmenau commented 3 months ago

@rdlab-upc of course, and this should be done during the maintenance time set in config. I discovered it first because I had to wait a long time for re-organizing the underlying filesystem (for other reasons) and noticed it needed days for the preview folder. My ugly cleanup script needed about half a day to delete ca. 300k images and 400k folders older than 1 year (but the storage is on Nearline HDDs, so it's not fast).

amessina commented 3 months ago

In the meantime, would you be willing to share your "ugly" script?

alx-tuilmenau commented 3 months ago

I was already thinking about this when I made the first post. The script (for CLI) probably only fits my environment and needs to be modified if you want to use it yourself. It assumes that mysql is used, the mimetype of a folder is 2, the table prefix is "oc_". The storage numeric_id of the folder with the previews must be entered at the beginning. And, it only works with a file storage. And apart from config.php, no APIs from Nextcloud are used. Maybe it will break something, be careful, have a backup.

The ugly script ```php getMessage()); die(); } function chk_dir($dirid) { global $db, $basedir, $pfxdir, $storage; $qcnt = $db->query('SELECT count(*) anz FROM oc_filecache WHERE parent='.$dirid.' AND storage='.$storage); $cnt = $qcnt->fetch(PDO::FETCH_ASSOC)['anz']; $qcnt->closeCursor(); if ($cnt!=0) return 0; if (($qfn = $db->query('SELECT fileid, path, parent FROM oc_filecache WHERE storage='.$storage.' AND mimetype=2 AND fileid=' . $dirid))==false) { l(1, 'rmdir_dbfetch',$dirid); return 0; }; $d = $qfn->fetch(PDO::FETCH_ASSOC); $qfn->closeCursor(); if (strpos($d['path'], $pfxdir)!==0) { l(1, 'rmdir_pfx',$dirid . " " .$d['path']); return 0; } if (!rmdir($basedir . '/' . $d['path'])) { l(1, 'rmdir_rmdir',$dirid . " " . $basedir . '/' . $d['path']); return 0; } if ($db->exec('DELETE FROM oc_filecache WHERE fileid=' . $d['fileid'])!=1) { l(1, 'rmdir_dbdir',$dirid . " " .$d['path']); return 0; } l(3, 'rmdir_ok',$dirid . " " .$d['path']); return chk_dir($d['parent']) + 1; } set_exception_handler('e'); if (($log = fopen($logname, "a+")) == NULL ) die('cannot create/open logfile'); require $configfile; $basedir = $CONFIG['datadirectory']; $pfxdir = 'appdata_' . $CONFIG['instanceid'] . '/preview'; l(2, 'Base dir', $basedir); l(2, 'Präfix', $pfxdir); l(2, 'DB', $CONFIG['dbname']); try { $db = new PDO($CONFIG['dbtype'] . ':host=' . $CONFIG['dbhost'] . ';dbname=' . $CONFIG['dbname'], $CONFIG['dbuser'], $CONFIG['dbpassword']); } catch (Exception $e) { l(0, 'DB ' , $e->getMessage()); die(); } $qfn = $db->query('SELECT fileid, path, size, mtime, parent FROM oc_filecache WHERE storage='.$storage.' AND mimetype>2 AND path LIKE "' . $pfxdir . '/%" AND mtime<' . time()-$month*31*24*60*60); if (!$qfn) { l(0, 'DB ', $db->errorInfo()[1]); die(); } $total = $qfn->rowCount(); l(2, '# of files',$total); $filecount = 0; $sizecount = 0; $dircount = 0; while($d = $qfn->fetch(PDO::FETCH_ASSOC)) { $fn = $basedir . '/' . $d['path']; $dir = $d['parent']; $sizecount += $d['size']; $filecount++; $err = []; if (strpos($d['path'], $pfxdir)!==0) { $err[] = 'prefix'; } elseif ($db->exec("DELETE FROM oc_filecache WHERE fileid=" . $d['fileid'])!=1) { $err[] = "delete_db"; } elseif (!file_exists($fn)) { $err[] = "not_found"; } elseif (!unlink($fn)) { $err[] = "unlink_fail"; } else { $dircount += chk_dir($dir); } $err = implode(' ', $err); l(($err=='')?3:1, ($err=='')?'Ok':$err, $fn . " (" . $d['fileid'] . ")"); if ($displaystatus & ($filecount % 50==0)) echo "--- $filecount/$total " . sprintf('%.2f', $filecount/$total*100) . "%, " . sprintf('%.1f', $sizecount/1024/1024) . ' MB, ' . "$dircount dirs, " . $fn . " ---\n"; } $db->exec('delete from oc_filecache_extended WHERE fileid IN (select b.fileid from oc_filecache a right join oc_filecache_extended b using(fileid) where a.fileid is null'); l(2, 'done',''); ```
amessina commented 3 months ago

@alx-tuilmenau thank you for sharing your script to demonstrate the concept.

amessina commented 3 months ago

I appreciate this feature request and am wondering if it would be worthwhile to consider using a preview file's atime rather than the mtime stored in the database. While it would certainly be more intensive, it would handle the case of a frequently accessed preview file, even if it may have been created a long time ago.

alx-tuilmenau commented 3 months ago

@amessina I have no atime, because the filesystem is mounted with noatime option for performance reasons. See my proposal in the first post for this. So maybe a other solution is to use find in the appdata folder to create a list of files (and maybe deleting the files and use a second find to find, list and delete the empty folders) and using these lists to find and delete the database entries (or use occ files:scan-app-data , but I cant use this, this will take weeks).