nextcloud / fulltextsearch

🔍 Core of the full-text search framework for Nextcloud
https://apps.nextcloud.com/apps/fulltextsearch
GNU Affero General Public License v3.0

Duplicate indexing by FullTextSearch (resource and security issue) #878

Open ga-it opened 1 month ago

ga-it commented 1 month ago

Steps to reproduce

We run a Nextcloud server with a large number of group folders, holding over 1.8 million documents and serving more than 100 users.

These documents are indexed separately for every user, which results in massive duplication, long indexing times, and related overhead.
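To give a sense of scale, here is a rough back-of-the-envelope sketch. It assumes the worst case, where every document sits in a group folder visible to every user; the real overlap on our server is lower, and `index_operations` and `shared_fraction` are hypothetical names used only for this illustration.

```python
# Rough upper bound on index work when shared files are processed once per user.
# The document and user counts come from this report; shared_fraction is an
# illustrative assumption, not a measured value.

def index_operations(documents: int, users: int, shared_fraction: float) -> dict:
    """Compare per-user indexing with index-once for a partly shared document set."""
    shared = int(documents * shared_fraction)   # documents visible to all users
    private = documents - shared                # documents visible to one user only
    per_user = private + shared * users         # each shared document handled once per user
    once = documents                            # each document handled exactly once
    return {"per_user": per_user, "once": once, "ratio": per_user / once}


estimate = index_operations(documents=1_800_000, users=100, shared_fraction=1.0)
print(estimate)
```

Under that worst-case assumption the per-user approach handles 180 million document passes instead of 1.8 million.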

Furthermore, when permissions change, the per-user index entries can become a security issue, because they may still reflect access that a user had in the past.

This mirrors the issue raised in the following forum post, for which I have been unable to find a bug report or feature request: https://help.nextcloud.com/t/handling-of-shared-folders-in-fulltext-search/172909/4

It also mirrors an issue with the embedding process in the Context Chat Backend: https://github.com/nextcloud/context_chat_backend/issues/49

Expected behaviour

A possible solution would be to create an Elasticsearch role for each group folder and to assign users to those roles according to their group folder membership, using Elasticsearch's document and field level security:

https://www.elastic.co/guide/en/elasticsearch/reference/current/field-and-document-access-control.html
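As a minimal sketch of what such roles might look like, the snippet below builds one role body per group folder in the format accepted by Elasticsearch's create-role API (`PUT /_security/role/<name>`), where the `query` entry restricts which documents the role can read. The index name `nextcloud_index`, the field `groupfolder_id`, and the role naming scheme are assumptions made for illustration; they are not taken from the FullTextSearch mapping.

```python
import json


def group_folder_role(index: str, folder_id: int) -> dict:
    """Build a role body that grants read access to one group folder's documents."""
    return {
        "indices": [
            {
                "names": [index],
                "privileges": ["read"],
                # Document level security: only documents matching this query
                # are visible. "groupfolder_id" is a hypothetical field name.
                "query": {"term": {"groupfolder_id": folder_id}},
            }
        ]
    }


def roles_for_folders(index: str, folder_ids: list) -> dict:
    """Return one role body per group folder, keyed by a hypothetical role name."""
    return {f"groupfolder_{fid}": group_folder_role(index, fid) for fid in folder_ids}


roles = roles_for_folders("nextcloud_index", [12, 34])
print(json.dumps(roles["groupfolder_12"], indent=2))
```

With roles like these, each document would be indexed once, and a change in group folder membership would only require changing which roles a user holds rather than re-indexing the documents.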


Server configuration

Version: Nextcloud Hub 9 (30.0.0), Dockerised
Operating System: Linux 6.10.6-amd64 x86_64
CPU: Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz (38 threads)
Memory: 152.71 GB

PHP
Version: 8.2.23
Memory limit: 10 GB
Max execution time: 3600
Upload max size: 10 GB
OPcache Revalidate Frequency: 60
Extensions: Core, date, libxml, openssl, pcre, sqlite3, zlib, ctype, curl, dom, fileinfo, filter, hash, iconv, json, mbstring, SPL, session, PDO, pdo_sqlite, standard, posix, random, Reflection, Phar, SimpleXML, tokenizer, xml, xmlreader, xmlwriter, mysqlnd, apache2handler, apcu, bcmath, exif, ftp, gd, gmp, imagick, intl, ldap, memcached, pcntl, pdo_mysql, pdo_pgsql, redis, sodium, sysvsem, zip, Zend OPcache

Database
Type: pgsql
Version: PostgreSQL 15.8 (Debian 15.8-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
Size: 1.6 GB

Nextcloud configuration

```
{
  "system": {
    "htaccess.RewriteBase": "\/",
    "memcache.local": "\\OC\\Memcache\\APCu",
    "allow_local_remote_servers": true,
    "apps_paths": [
      { "path": "\/var\/www\/html\/apps", "url": "\/apps", "writable": false },
      { "path": "\/var\/www\/html\/custom_apps", "url": "\/custom_apps", "writable": true }
    ],
    "memcache.distributed": "\\OC\\Memcache\\Redis",
    "memcache.locking": "\\OC\\Memcache\\Redis",
    "redis": {
      "host": "***REMOVED SENSITIVE VALUE***",
      "password": "***REMOVED SENSITIVE VALUE***",
      "port": 6379
    },
    "overwritehost": "nextcloud.globaladvisors.biz",
    "overwriteprotocol": "https",
    "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
    "upgrade.disable-web": true,
    "passwordsalt": "***REMOVED SENSITIVE VALUE***",
    "secret": "***REMOVED SENSITIVE VALUE***",
    "trusted_domains": [ "localhost", "nextcloud.globaladvisors.biz" ],
    "datadirectory": "***REMOVED SENSITIVE VALUE***",
    "dbtype": "pgsql",
    "version": "30.0.0.14",
    "overwrite.cli.url": "https:\/\/nextcloud.globaladvisors.biz",
    "dbname": "***REMOVED SENSITIVE VALUE***",
    "dbhost": "***REMOVED SENSITIVE VALUE***",
    "dbport": "",
    "dbtableprefix": "oc_",
    "dbuser": "***REMOVED SENSITIVE VALUE***",
    "dbpassword": "***REMOVED SENSITIVE VALUE***",
    "installed": true,
    "instanceid": "***REMOVED SENSITIVE VALUE***",
    "maintenance_window_start": 1,
    "default_phone_region": "ZA",
    "enabledPreviewProviders": [
      "OC\\Preview\\BMP",
      "OC\\Preview\\GIF",
      "OC\\Preview\\JPEG",
      "OC\\Preview\\Krita",
      "OC\\Preview\\MarkDown",
      "OC\\Preview\\MP3",
      "OC\\Preview\\OpenDocument",
      "OC\\Preview\\PNG",
      "OC\\Preview\\TXT",
      "OC\\Preview\\XBitmap"
    ],
    "preview_imaginary_url": "***REMOVED SENSITIVE VALUE***",
    "preview_concurrency_all": "12",
    "preview_concurrency_new": "8",
    "loglevel": 2,
    "maintenance": false,
    "mail_from_address": "***REMOVED SENSITIVE VALUE***",
    "mail_smtpmode": "smtp",
    "mail_sendmailmode": "smtp",
    "mail_domain": "***REMOVED SENSITIVE VALUE***",
    "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
    "mail_smtpport": "25",
    "preview_max_memory": 1024,
    "preview_max_filesize_image": 200,
    "preview_max_x": 2048,
    "preview_max_y": 2048,
    "twofactor_enforced": "true",
    "data-fingerprint": "xxxxxxxxxxxxxxxxxxx",
    "skeletondirectory": "",
    "secure_view": {
      "enabled": true,
      "hide_download": true,
      "hide_print": true
    },
    "twofactor_enforced_groups": [],
    "twofactor_enforced_excluded_groups": [ "Clients" ],
    "app_install_overwrite": [
      "twofactor_email",
      "files_antivirus",
      "appointments",
      "carnet",
      "forms",
      "gptfreeprompt",
      "thesearchpage",
      "timemanager",
      "workflow_ocr",
      "workspace",
      "stt_whisper",
      "epubviewer",
      "fulltextsearch_elasticsearch",
      "files_scripts",
      "side_menu",
      "workflow_kitinerary",
      "workflow_media_converter",
      "files_trackdownloads"
    ],
    "updater.release.channel": "stable",
    "memories.db.triggers.fcu": true,
    "memories.exiftool": "\/var\/www\/html\/custom_apps\/memories\/bin-ext\/exiftool-amd64-glibc",
    "memories.vod.path": "\/var\/www\/html\/custom_apps\/memories\/bin-ext\/go-vod-amd64",
    "has_rebuilt_cache": true
  }
}
```

yyuueexxiinngg commented 1 month ago

Seems related: https://github.com/nextcloud/files_fulltextsearch/issues/281