ciur / papermerge

Open Source Document Management System for Digital Archives (Scanned Documents)
https://papermerge.com
Apache License 2.0
2.41k stars 257 forks

Deleting nodes does not get rid of underlying filesystem folders #607

Open bluekitedreamer opened 3 months ago

bluekitedreamer commented 3 months ago

First and foremost, this app is sick, awesome job man. I've been playing with it for the last few hours and hooking a few other apps up to its API for secondary storage, search, and organization.

Description

Instead of using docker volumes I map the core_app folder directory to a mount on the filesystem using docker mounts, for various reasons. When deleting nodes/files in papermerge I noticed that the folders created for the nodes are still present.

Info:

The folder structure still persists after the nodes are deleted. For example, below is a tree of the core_app folder. Is this expected functionality? (see code below)

Side question: I'm also curious about the reasoning behind the folder structure. I can understand the need for the GUID folder for uniqueness of uploaded files, file versioning, merging, etc., but the extra folder levels on top of the GUID folder seem weird to me. My line of questioning is not insinuating or implying the implementation is wrong; if it works, it ain't stupid. I'm simply curious here.

It seems to me that using only the GUID folder should be sufficient, and would allow for easier manual recovery in the event something catastrophic happened and you needed to piece an instance back together.

├── docvers
│   ├── 12
│   │   └── e9
│   ├── 2b
│   │   └── b2
│   ├── 3e
│   │   └── 76
│   ├── 4e
│   │   └── e8
│   ├── 5a
│   │   └── ae
│   ├── 5f
│   │   └── 94
│   ├── 65
│   │   └── ce
│   ├── 79
│   │   └── 6a
│   ├── 7c
│   │   └── c9
│   ├── 85
│   │   └── e7
│   ├── c2
│   │   └── 7f
│   ├── cc
│   │   └── 64
│   ├── d2
│   │   └── 67
│   ├── da
│   │   └── b4
│   ├── e6
│   │   └── 7d
│   ├── f8
│   │   └── 76
│   ├── fb
│   │   └── 7c
│   └── ff
│       └── ec
├── ocr
│   └── pages
│       ├── 11
│       │   └── fc
│       ├── 1f
│       │   ├── 10
│       │   └── 7b
│       ├── 2c
│       │   └── c8
│       ├── 3f
│       │   └── 82
│       ├── 45
│       │   └── 1e
│       ├── 49
│       │   └── 5c
│       ├── 51
│       │   └── cc
│       ├── 5d
│       │   └── 17
│       ├── 5e
│       │   └── 91
│       ├── 73
│       │   └── fd
│       ├── 76
│       │   ├── 11
│       │   └── 7f
│       ├── 81
│       │   └── 5e
│       ├── 8f
│       │   └── df
│       ├── 9b
│       │   └── 96
│       ├── 9f
│       │   └── 6b
│       ├── a5
│       │   ├── 5e
│       │   └── 6f
│       ├── bf
│       │   └── bc
│       ├── c0
│       │   └── 19
│       ├── f3
│       │   └── 4f
│       ├── f4
│       │   └── 58
│       └── fa
│           └── a8
└── thumbnails
    └── jpg
        ├── 06
        │   └── 46
        ├── 1d
        │   └── 7a
        ├── 3f
        │   └── 82
        ├── 55
        │   └── 04
        ├── 5d
        │   └── 17
        ├── 5e
        │   └── 91
        ├── 76
        │   └── 7f
        ├── 84
        │   └── 1f
        ├── 8f
        │   └── df
        ├── 93
        │   └── e7
        ├── 95
        │   └── da
        ├── a5
        │   ├── 5e
        │   └── dd
        ├── bb
        │   └── 1d
        ├── bf
        │   └── bc
        ├── f4
        │   └── 58
        ├── fa
        │   └── a8
        └── fd
            └── 4d
ciur commented 3 months ago

Thank you for opening the ticket.

Regarding your question about the extra folders on top of the GUID folder:

The reason is to reduce the number of file system nodes (files or folders) in any single folder. Example: let's say you have 120,000 pages; with just the GUID folder, the pages folder would contain 120,000 entries! The problem is that there is usually a limit on the number of subfolders on a given file system. By adding one extra folder level, named after two hex digits of the UUID, the number of entries per folder is reduced by a factor of 256. Thus if you have 120,000 pages, each folder on the file system will hold roughly 120,000 / 256 ≈ 469 entries.
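
To make the arithmetic concrete, here is a minimal Python sketch of that kind of sharding. It is not Papermerge's actual code: the function names and the example UUID are made up, and the layout (first two hex digits, then the next two, then the full UUID) is only an assumption based on the two-level tree posted above.

```python
from pathlib import Path
from uuid import UUID


def sharded_path(base: Path, item_id: UUID) -> Path:
    """Build <base>/<hex 0-1>/<hex 2-3>/<uuid> (assumed layout, for illustration)."""
    hex_id = str(item_id)  # lowercase, hyphenated form
    return base / hex_id[0:2] / hex_id[2:4] / hex_id


def avg_entries_per_bucket(total: int, hex_digits: int = 2) -> float:
    """Average entries per folder if IDs spread evenly over 16**hex_digits buckets."""
    return total / (16 ** hex_digits)


# Hypothetical UUID, chosen to start with "12e9" like the first docvers entry.
example = UUID("12e9a3c4-0000-4000-8000-000000000000")
print(sharded_path(Path("docvers"), example))
print(avg_entries_per_bucket(120_000))
```

Since UUIDs are spread more or less evenly, the 256 buckets at a given level end up with a similar number of entries each, which is where the figure of roughly 469 comes from.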

bluekitedreamer commented 3 months ago

Gotcha, you're shooting for maximum compatibility all around here. Makes sense, I like it.

I'd normally be the advocate of "let's just change the OS limitation", but as an app developer trying to support as many situations as possible, I understand the mindset and decision.

bluekitedreamer commented 3 months ago

Also, I had the version wrong: I'm using 3.1, not 3.0.1. I fixed it above.