eikek / docspell

Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources with minimal effort.
https://docspell.org
GNU Affero General Public License v3.0

[BETA] Import files and metadata from Paperless #358

Open totti4ever opened 4 years ago

totti4ever commented 4 years ago

Import script in https://github.com/eikek/docspell/blob/master/tools/import-paperless/

Data transferred:

Please be aware that the metadata of existing documents will be overwritten. If a document is already in Docspell and is also in Paperless, it won't be added again (duplicates are detected by checksum), but its attributes (title, tags, ...) will be overwritten!
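To get a feeling for which documents would be affected by this, a rough pre-check could compare file checksums before running the import. The following is only a sketch: it assumes the checksum in question is SHA-256 and that GNU `sha256sum` is available. The function names and the idea of a plain-text list of already known checksums (one hex digest per line) are hypothetical and not part of the import script.

```shell
#!/usr/bin/env bash
# Hypothetical pre-check: these helpers are illustrative only and are
# NOT part of the import script.

# Print the SHA-256 hex digest of a file.
# Assumption: GNU coreutils' sha256sum is installed.
file_checksum() {
  sha256sum < "$1" | cut -d ' ' -f 1
}

# Succeed if the file's digest appears in a list of known digests
# (one hex digest per line). A match means the document would not be
# added again, but its attributes would be overwritten.
already_known() {
  local file="$1" known_list="$2"
  grep -qxF "$(file_checksum "$file")" "$known_list"
}
```

For example, `already_known scan.pdf known-checksums.txt && echo "metadata would be overwritten"` flags a single file. How the list of known checksums is obtained is left open here.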


After using Paperless for quite a while, I found that there is some room for improvement, but only little work is still being done on the project, which is totally fine, as it is a private, open-source project!

Then I came across Docspell and found it to have quite some potential, especially given its growing AI and AI-like features. Still, I wanted to transfer the tagging and structure from Paperless to Docspell rather than only import the files and start the organizing process all over again.

That is why I put my dirty bash scripting skills to use and wrote a script that reads the files from Paperless' internal documents folder, extracts tags and correspondents from Paperless, and imports everything into Docspell using the official API, so no dirty DB writes or anything like that!


Please, everybody who also comes from Paperless, try out this script! If you need help, just ask here or on Gitter. And if you have suggestions for improvement, tell me or, even better, make a pull request :-)

I will leave this ticket open for discussion.

totti4ever commented 4 years ago

List of improvements:

zombiehoffa commented 2 years ago

So, for those of us using Paperless with Postgres, is there a way to also do this?

eikek commented 2 years ago

> So, for those of us using Paperless with Postgres, is there a way to also do this?

I can't help with that, I'm afraid. The script was written by totti4ever a long time ago, and he left this project a while ago. The DB schema is the same, so you can probably just replace the sqlite calls in the script with appropriate psql calls.

Also, looking at #1241, there seem to be some problems with this script right now.
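As a rough illustration of swapping the sqlite calls for psql calls, one option might be to route every query through a single helper that picks the client based on a setting. This is only a sketch under assumptions: the variable names (`DB_BACKEND`, `SQLITE_FILE`, `PG_URL`), their default values, and the function `run_query` are hypothetical and do not come from the import script. Only standard `sqlite3` and `psql` command-line options are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: names and defaults are illustrative and are NOT
# taken from the import script.

DB_BACKEND="${DB_BACKEND:-sqlite}"        # "sqlite" or "postgres"
SQLITE_FILE="${SQLITE_FILE:-db.sqlite3}"  # assumed path to the Paperless SQLite file
PG_URL="${PG_URL:-postgresql://paperless@localhost/paperless}"  # assumed connection URI

# Run one SQL query and print the rows as pipe-separated text, so the
# rest of a script can parse the output the same way for both backends.
run_query() {
  local sql="$1"
  case "$DB_BACKEND" in
    sqlite)
      sqlite3 -separator '|' "$SQLITE_FILE" "$sql"
      ;;
    postgres)
      # --no-align and --tuples-only drop table borders, headers and
      # the row-count footer, leaving plain delimited rows.
      psql --no-align --tuples-only --field-separator='|' \
           --dbname="$PG_URL" --command="$sql"
      ;;
    *)
      echo "unknown DB_BACKEND: $DB_BACKEND" >&2
      return 1
      ;;
  esac
}
```

A call such as `DB_BACKEND=postgres run_query "SELECT id, name FROM documents_tag"` would then replace a direct `sqlite3` invocation; the table name here is an assumption about the Paperless schema and should be checked against the actual database.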