pulsejet / memories

Fast, modern and advanced photo management suite. Runs as a Nextcloud app.
https://memories.gallery
GNU Affero General Public License v3.0

Indexing large files on external_storage uses 2x the file size in /tmp and can crash Nextcloud by consuming all available disk space #1305

Open k1n6b0b opened 1 month ago

k1n6b0b commented 1 month ago

Is your feature request related to a problem? Please describe.

Related to: https://github.com/nextcloud/all-in-one/discussions/5356

Describe the solution you'd like

Better handling of /tmp space that doesn't result in out-of-disk-space errors when indexing large files from external_storage.

Describe alternatives you've considered

Requesting that the nextcloud-aio team implement controls on /tmp (see discussion 5356 above).

Additional context

Steps to reproduce

  1. Set up Nextcloud Memories
  2. Index any file on external_storage that is larger than 50% of your free disk space: occ memories:index
     Note: It appears this process copies the file from external_storage to /tmp and then copies it again within /tmp, using 2x the file size while processing.
  3. sudo docker exec -it nextcloud-aio-nextcloud /bin/bash
  4. watch 'ls -lha /tmp;'

This file grows until it eventually fills my entire partition (~50GB)

54d900c19b86:/var/www/html# ls -lha /tmp/phpgABoHh 
-rw------- 1 www-data www-data 26G Oct  1 00:15 /tmp/phpgABoHh
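
To make the growth of that temporary file easier to relate to the remaining disk space, something like the following could be run inside the container. This is only a minimal sketch: `tmp_report` is a hypothetical helper name, not part of Memories or nextcloud-aio, and it relies only on the standard `df`, `du` and `awk` utilities.

```shell
#!/bin/sh
# Sketch only: tmp_report is a hypothetical helper, not part of Memories or nextcloud-aio.

# Print how much space a directory uses and how much is still free on its filesystem.
tmp_report() {
    dir="${1:-/tmp}"
    # df -P prints one line per filesystem; column 4 is the available space in KiB.
    avail_kb=$(df -P -k "$dir" | awk 'NR==2 {print $4}')
    # du -s sums everything under the directory; errors for unreadable entries are hidden.
    used_kb=$(du -s -k "$dir" 2>/dev/null | awk '{print $1}')
    echo "dir=$dir used_kb=$used_kb avail_kb=$avail_kb"
}

tmp_report /tmp
```

Running it in a loop, for example `while true; do tmp_report /tmp; sleep 5; done`, shows used space rising and available space falling side by side while the index runs.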

Expected behavior

  1. Better handling of that temporary file by the Memories app!
  2. Better protections/limits on /tmp by nextcloud-aio

Actual behavior

  1. Filesystem fills up
  2. Database crashes
  3. Database container is left hanging in the nextcloud network space (requiring a system Docker restart)
  4. Upon restart, /tmp in the nextcloud-aio-nextcloud container is cleared and everything works until the rescan fills the filesystem again :)

Indexing folder /XXXXX/files/nas_photoAlbums
PHP Notice:  fwrite(): Write of 331 bytes failed with errno=28 No space left on device in /var/www/html/lib/private/Log/File.php on line 87
Failed to index folder /XXXXX/files/nas_photoAlbums: An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server

An exception occurred while executing a query: SQLSTATE[HY000]: General error: 7 no connection to the server

PHP Notice:  fwrite(): Write of 331 bytes failed with errno=28 No space left on device in /var/www/html/lib/private/Log/File.php on line 87

Error response from daemon: endpoint with name nextcloud-aio-database "already exists in network nextcloud-aio"

Doctrine\DBAL\Exception: Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] connection to server at "nextcloud-aio-database" (172.18.0.5), port 5432 failed: Host is unreachable
    Is the server running on that host and accepting TCP/IP connections? in /var/www/html/lib/private/DB/Connection.php:167

My workaround: I've increased the /var/lib/docker partition to be more than 2x the size of the largest file I have on external_storage. However, this took a lot of time to debug; error handling/logging could be improved to better show the user the issue. I'm hoping that between this team and the nextcloud-aio team, the next user with this issue can debug it much faster :)
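
The sizing rule behind this workaround can be checked before starting an index run. The sketch below assumes the 2x temporary-space factor observed in this report (it is an observation, not a documented Memories behavior), and the function names `free_kb_of`, `needs_more_tmp` and `risky_files` are hypothetical. It also assumes a `find` that accepts the `k` suffix for `-size`, as GNU find does.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers.

# Available space, in KiB, on the filesystem that holds the given directory.
free_kb_of() {
    df -P -k "$1" | awk 'NR==2 {print $4}'
}

# Succeed (exit 0) when a file of size_kb would need more temporary space than
# avail_kb, assuming the 2x factor observed in this report.
needs_more_tmp() {
    size_kb="$1"
    avail_kb="$2"
    [ $(( size_kb * 2 )) -gt "$avail_kb" ]
}

# List files under src that are larger than half of the free space in tmp_dir
# (default /tmp). These are the files that would trip the 2x rule.
risky_files() {
    src="$1"
    tmp_dir="${2:-/tmp}"
    limit_kb=$(( $(free_kb_of "$tmp_dir") / 2 ))
    find "$src" -type f -size +"${limit_kb}k"
}
```

With the numbers from this report, `needs_more_tmp 26 50` succeeds, matching a 26G temporary file exhausting a ~50GB partition. `risky_files <path> /tmp` would need to be pointed at wherever the external storage is reachable as a local path, which depends on the setup.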

Other information

Host OS

Ubuntu 24.04

Nextcloud Hub 8 (29.0.7)

Output of sudo docker info

Client: Docker Engine - Community
 Version:    27.3.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.17.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 13
  Running: 13
  Paused: 0
  Stopped: 0
 Images: 17
 Server Version: 27.3.1
 Storage Driver: overlay2
  Backing Filesystem: zfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: syslog
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-45-generic
 Operating System: Ubuntu 24.04 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.34GiB
 Name: nextcloud-aio
 ID: 106c73e1-6e35-44d3-bb0f-6b4b8d772ecf
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Docker run command or docker-compose file that you used

sudo docker run \
--init \
--sig-proxy=false \
--name nextcloud-aio-mastercontainer \
--restart always \
--publish 8080:8080 \
--env APACHE_PORT=11000 \
--env APACHE_IP_BINDING=0.0.0.0 \
--volume nextcloud_aio_mastercontainer:/mnt/docker-aio-config \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
--env NEXTCLOUD_DATADIR="/mnt/ncdata" \
--env NEXTCLOUD_MEMORY_LIMIT=8192M \
--add-host nextcloud.XXX.XXX:10.XXX.XXX.XXX \
nextcloud/all-in-one:latest

Other valuable info