cgwire / zou

Zou is the Kitsu API. It allows you to store and manage your production data
https://zou.cg-wire.com
GNU Affero General Public License v3.0

Deleting a project does not delete files on disk. #868

Open flinfo opened 1 month ago

flinfo commented 1 month ago

Context

Studio name: Flux
Zou version: 0.19.12
Zou installation type: self-hosted

Describe the bug

When a project is closed and then deleted, the associated files in /opt/zou/previews/ are not removed.

Expected behavior

I would expect all files on disk that are associated with a project to be deleted when the project is deleted. Though perhaps this is not the intended behaviour?

This also raises the idea that exporting a project (database and associated files) would be useful. Most Kitsu rustlers/managers will probably need or want this facility eventually.

Note: Environment="REMOVE_FILES=True" is present in zou.service, as per issue 648 (https://github.com/cgwire/zou/issues/648).

In the meantime, Claude produced a Python script that finds orphaned files, i.e. files that are no longer associated with any project (it compares the "id" column of the "preview_file" table against the files on disk). The database name, username and password need changing. It currently only prints the orphaned filenames; uncomment the "os.remove(file)" line to actually delete them.

zou_deleteOrphanedFiles.zip

frankrousseau commented 1 month ago

I'm adding the Claude code directly here:

import os
import psycopg2

def get_db_ids():
    conn = psycopg2.connect(
        dbname="zoudb",
        user="your_username",
        password="your_password",
        host="localhost"
    )
    cur = conn.cursor()

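    cur.execute("SELECT id FROM preview_file")
    # Normalize ids to strings so they compare equal to file names,
    # even if a UUID typecaster has been registered with psycopg2.
    db_ids = {str(row[0]) for row in cur.fetchall()}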

    cur.close()
    conn.close()

    return db_ids

def find_orphaned_files(root_dir, db_ids):
    orphaned_files = []
    # Walk the whole preview tree; each file is matched on its name without extension.
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            file_id, _ = os.path.splitext(filename)
            if file_id not in db_ids:
                orphaned_files.append(full_path)
    return orphaned_files

def main():
    root_dir = "/opt/zou/previews/"
    db_ids = get_db_ids()
    orphaned_files = find_orphaned_files(root_dir, db_ids)

    print(f"Found {len(orphaned_files)} orphaned files:")
    for file in orphaned_files:
        print(file)
        # Uncomment the following line to delete the files
        # os.remove(file)

if __name__ == "__main__":
    main()

It should work, but it could be optimized by storing the ids in a map and testing against it.
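As a possible follow-up to the script above, here is a minimal sketch of a slightly more defensive variant. Since get_db_ids already returns a set, membership tests are hash-based, so this sketch focuses on safety instead: it skips any file whose name is not shaped like a UUID and only deletes when asked to. The names canonical_uuid, iter_orphaned_files, remove_orphans and the dry_run flag are hypothetical and not part of Zou. It also assumes, like the original script, that every file worth checking is named after a "preview_file" id.

```python
import os
import uuid


def canonical_uuid(name):
    """Return name as a canonical lowercase UUID string, or None if it is not a UUID."""
    try:
        return str(uuid.UUID(name))
    except ValueError:
        return None


def iter_orphaned_files(root_dir, db_ids):
    """Yield paths under root_dir whose UUID-shaped name is missing from db_ids."""
    # Hash-based set: each membership test below is constant time on average.
    known = {str(i).lower() for i in db_ids}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            stem, _ext = os.path.splitext(filename)
            file_id = canonical_uuid(stem)
            if file_id is None:
                # Not named after an id (assumption: such files are not previews).
                continue
            if file_id not in known:
                yield os.path.join(dirpath, filename)


def remove_orphans(root_dir, db_ids, dry_run=True):
    """Return the orphaned paths; delete them only when dry_run is False."""
    # Collect first so that deletion never interferes with the directory walk.
    orphans = list(iter_orphaned_files(root_dir, db_ids))
    if not dry_run:
        for path in orphans:
            os.remove(path)
    return orphans
```

Calling remove_orphans("/opt/zou/previews/", get_db_ids()) would only list candidates; passing dry_run=False performs the deletion. Reviewing the dry-run output before deleting anything seems prudent, since the assumption about file naming has not been checked against Zou's storage layout.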