eikek / docspell

Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources with minimal effort.
https://docspell.org
GNU Affero General Public License v3.0

Avoid deleting the collective-folder with dsc #1659

Open qdrop17 opened 2 years ago

qdrop17 commented 2 years ago

First of all: Awesome software! I highly appreciate it.

I have a very small issue that I don't manage to resolve:

I mounted a CIFS directory into my dsc container so that it can fetch newly scanned documents from my NAS. This works flawlessly. Unfortunately, the "--delete" flag deletes not only the imported PDF files but also the collective directories (inside /opt/docs/[collective]). This is quite problematic because my scanner sends its scans there, and that fails if the directory does not exist.

How can this behaviour be avoided?

image: docspell/dsc:latest
container_name: docspell-consumedir
command:
  - dsc
  - "-d"
  - "http://docspell-restserver:7880"
  - "upload"
  - "--matches"
  - "**/*.pdf"
  - "--traverse"
  - "--poll"
  - "120"
  - "--delete"
  - "-i"
  - "--not-matches"
  - "**/.*"
  - "--header"
  - "Docspell-Integration:xxx"
  - "/opt/docs"

eikek commented 2 years ago

Hi @qdrop17

Hm, that doesn't sound good :-) Do you mean that dsc deletes even the collective directories, like /opt/docs/[collective]? Or only the ones inside?

The former would be a bug, I think. The latter is currently intentional, so that no empty directories are left around. But we can add a flag to keep empty directories when deleting files.
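To make the distinction concrete, here is a minimal Python sketch of the behaviour described above: delete a file, then remove parent directories that became empty, but never touch the collective directories directly below the root. This is not dsc's actual implementation; the function name `delete_and_prune` and the `protected_depth` parameter are made up for illustration.

```python
import tempfile
from pathlib import Path


def delete_and_prune(file: Path, root: Path, protected_depth: int = 1) -> None:
    """Delete `file`, then remove parent directories that became empty.

    `root` itself and directories at most `protected_depth` levels below it
    (the collective directories, by assumption) are never removed.
    """
    root = root.resolve()
    file = file.resolve()
    file.unlink()

    parent = file.parent
    while parent != root and root in parent.parents:
        depth = len(parent.relative_to(root).parts)
        if depth <= protected_depth:
            break  # collective directory reached: keep it even if empty
        if any(parent.iterdir()):
            break  # directory still has content: stop pruning
        parent.rmdir()
        parent = parent.parent


if __name__ == "__main__":
    # Small demonstration in a throwaway directory.
    base = Path(tempfile.mkdtemp())
    scan = base / "collective" / "2024" / "scan.pdf"
    scan.parent.mkdir(parents=True)
    scan.write_bytes(b"%PDF-")
    delete_and_prune(scan, base)
    print(sorted(p.relative_to(base.resolve()) for p in base.resolve().rglob("*")))
```

With `protected_depth=1`, the empty subdirectory is pruned while the collective directory stays in place, which is what the scanner needs in order to keep delivering files.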

I think, to avoid this for now, you could play with directory permissions so that dsc cannot delete them (and live with the errors in the logs). Or you could remove the --delete flag and run a second script periodically that deletes all files that have been uploaded (see dsc cleanup --help). However, it is likely that the cleanup command also deletes directories. In that case, there is dsc file-exists, which you can combine with a standard rm.
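A periodic cleanup script along those lines could look roughly like the following Python sketch. It only ever deletes files, never directories. The check whether a file is already in Docspell is passed in as a callable, `is_uploaded`, which is a hypothetical placeholder: it could wrap a call to dsc file-exists, but that command's output format is not shown in this thread, so the wiring is left open here.

```python
from pathlib import Path
from typing import Callable, List


def remove_uploaded_files(
    root: Path,
    is_uploaded: Callable[[Path], bool],
    pattern: str = "*.pdf",
) -> List[Path]:
    """Delete files below `root` that `is_uploaded` confirms; keep all directories.

    Hidden files and files inside hidden directories are skipped, similar in
    spirit to the `--not-matches "**/.*"` option in the compose file above.
    Returns the list of deleted paths.
    """
    removed: List[Path] = []
    for path in sorted(root.rglob(pattern)):
        relative_parts = path.relative_to(root).parts
        if any(part.startswith(".") for part in relative_parts):
            continue  # hidden file or hidden parent directory
        if path.is_file() and is_uploaded(path):
            path.unlink()  # only the file is removed, never its directory
            removed.append(path)
    return removed
```

Run from cron, this replaces --delete: the upload job no longer removes anything, and the collective directories stay in place because the script never calls rmdir.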

qdrop17 commented 2 years ago

Hi @eikek

yeah, I mean that the collective folder /opt/docs/[collective] gets deleted (along with all imported files in it). Only deleting all folders within /opt/docs/[collective] would be no issue.

I tried limiting the permissions, but then dsc does not complete the import and therefore does not continue to poll.

For now I have disabled the "--delete" feature, and a separate cronjob does the cleanup.

eikek commented 2 years ago

That is really a strange bug, I cannot see this behavior on my installation (but I don't use --poll). I'll have a deeper look at this.

qdrop17 commented 2 years ago

Let me know what I can do to help you reproduce this issue.

One important detail: the docs volume is a CIFS mount inside the container. That's why I can't use "--watch", as CIFS mounts don't offer inotify.
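For readers unfamiliar with the difference: without file system events, the only option is to rescan the directory at a fixed interval. The following Python sketch shows that generic polling idea by comparing directory snapshots. It is not how dsc implements --poll; the names `snapshot`, `new_files` and `poll` are invented for this example.

```python
import time
from pathlib import Path
from typing import Callable, List, Set


def snapshot(root: Path, pattern: str = "*.pdf") -> Set[Path]:
    """Return the set of matching files currently below `root`."""
    return {p for p in root.rglob(pattern) if p.is_file()}


def new_files(previous: Set[Path], current: Set[Path]) -> List[Path]:
    """Files present in `current` that were not in `previous`, sorted."""
    return sorted(current - previous)


def poll(
    root: Path,
    interval_seconds: float,
    handle: Callable[[Path], None],
    rounds: int,
) -> None:
    """Rescan `root` `rounds` times and call `handle` for each newly seen file."""
    seen: Set[Path] = set()
    for _ in range(rounds):
        current = snapshot(root)
        for path in new_files(seen, current):
            handle(path)
        seen = current
        time.sleep(interval_seconds)
```

An interval of 120 seconds would correspond to the "--poll" value in the compose file above; the trade-off against "--watch" is a delay of up to one interval before a new scan is picked up.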

eikek commented 2 years ago

Thank you! I'll come back to it if necessary. Just on a general note, if you haven't considered it yet: you could run the dsc process on the other machine (that exports the cifs), too. Then you could use --watch and a network mount wouldn't be necessary.