Closed cstrand89 closed 1 year ago
Here are the compose file and env file I have been using for the last week while testing.
Where do I place these files? @jchus-id, do you have a quick guide?
Working on it. :) Almost ready!
Merged in saltyorg/Sandbox#117
URLs for upstream and docker information should be included in the documentation for each role.
Been playing with this to see what the best setup would be. I have some suggestions.
1. Remove config. It's not used for the docker image; envs need to be used instead: https://paperless-ngx.readthedocs.io/en/latest/configuration.html
2. Maybe make `PAPERLESS_MEDIA_ROOT` a setting. I personally don't want this in the opt folder, and I think other people would probably feel the same. Instead, it would probably go somewhere under the /mnt directory.
3. Same for `PAPERLESS_CONSUMPTION_DIR`. This could probably be anywhere, but I am experimenting with having a consume directory on my google drive. Maybe this can just be left as-is though.
What it comes down to: keep the database in opt, but point to the media folder of choice.
Has this been implemented? I am hesitant to utilise it until the option to store files in /mnt works.
Will take a look asap
You can do this today with the way saltbox is configured; it's just not done by default for this role. I can follow up with my findings in a bit and show my config.
Of course, you can still override the envs to specify a custom path; it works out of the box with saltbox ;)
This is my current config:
/srv/git/saltbox/inventories/host_vars/localhost.yml
paperless_ngx_docker_envs_custom:
  PAPERLESS_CONSUMPTION_DIR: /mnt/unionfs/Documents/consume
  PAPERLESS_MEDIA_ROOT: /mnt/protected/unionfs/Documents/paperless
  PAPERLESS_CONSUMER_POLLING: "5"
  PAPERLESS_TASK_WORKERS: "4"
  PAPERLESS_THREADS_PER_WORKER: "4"
  PAPERLESS_FILENAME_FORMAT: "{created_year}/{correspondent}/{created_year}-{created_month}-{created_day}_{title} ({document_type}) [{tag_list}]"
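To get a feel for the folder layout that `PAPERLESS_FILENAME_FORMAT` value produces, here is a minimal Python sketch that expands the template with made-up metadata. It assumes plain placeholder substitution; Paperless-ngx does its own rendering and filename handling, so treat this as an illustration of the layout only. The `render_filename` helper and the sample values are hypothetical.

```python
# Illustration only: expand the filename template with made-up metadata.
FILENAME_FORMAT = (
    "{created_year}/{correspondent}/"
    "{created_year}-{created_month}-{created_day}_{title} "
    "({document_type}) [{tag_list}]"
)

# Hypothetical document metadata, not taken from a real Paperless-ngx instance.
sample = {
    "created_year": "2023",
    "created_month": "04",
    "created_day": "07",
    "correspondent": "Acme",
    "title": "Invoice",
    "document_type": "bill",
    "tag_list": "tax,2023",
}


def render_filename(template: str, metadata: dict) -> str:
    """Substitute each {placeholder} with the matching metadata value."""
    return template.format(**metadata)


print(render_filename(FILENAME_FORMAT, sample))
# prints: 2023/Acme/2023-04-07_Invoice (bill) [tax,2023]
```

So with this format, documents end up grouped by year and then by correspondent under the media root.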
A couple of things to note here. I wanted to have an encrypted folder using rclone, so my mount point is under /mnt/protected. You could just as well put it under /mnt/unionfs/Media/Documents, for example, and not have to do much. The other thing I wanted to do is sync with smaller files, so I added another folder in cloudplow.
/opt/cloudplow/config.json
Under remotes, I added:
"remotes": {
  "protected_documents": {
    "hidden_remote": "protected:",
    "rclone_command": "move",
    "rclone_excludes": [
      "**partial~",
      "**_HIDDEN~",
      "*.db",
      "media.lock"
    ],
    "rclone_extras": {
      "--checkers": 16,
      "--drive-chunk-size": "1M",
      "--drive-stop-on-upload-limit": null,
      "--low-level-retries": 2,
      "--retries": 1,
      "--skip-links": null,
      "--stats": "60s",
      "--transfers": 8,
      "--user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36",
      "--verbose": 1
    },
    "rclone_sleeps": {
      " 0/s,": {
        "count": 16,
        "sleep": 25,
        "timeout": 62
      },
      "Failed to copy: googleapi: Error 403: User rate limit exceeded": {
        "count": 10,
        "sleep": 25,
        "timeout": 7200
      }
    },
    "remove_empty_dir_depth": 2,
    "sync_remote": "protected:/Documents",
    "upload_folder": "/mnt/protected/local/Documents",
    "upload_remote": "protected:/Documents"
  }
}
Under the uploader section, I added:
"uploader": {
  "protected_documents": {
    "check_interval": 1,
    "exclude_open_files": false,
    "max_size_gb": 0,
    "opened_excludes": [],
    "service_account_path": "",
    "size_excludes": []
  }
}
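Both snippets use the same key, `protected_documents`, under "remotes" and under "uploader". Assuming cloudplow pairs the two sections by that name (which is how the config above is laid out), a quick Python check like the one below can catch a typo before restarting anything. The `remotes_without_uploader` helper is hypothetical and the embedded JSON is a trimmed-down stand-in for /opt/cloudplow/config.json, not the full file.

```python
import json


def remotes_without_uploader(config: dict) -> list:
    """Return names under "remotes" that have no matching key under "uploader"."""
    remotes = config.get("remotes", {})
    uploader = config.get("uploader", {})
    return sorted(name for name in remotes if name not in uploader)


# Trimmed-down stand-in for the real config file (illustration only).
example = json.loads("""
{
  "remotes": {
    "protected_documents": {
      "upload_folder": "/mnt/protected/local/Documents",
      "upload_remote": "protected:/Documents"
    }
  },
  "uploader": {
    "protected_documents": {"check_interval": 1, "max_size_gb": 0}
  }
}
""")

print(remotes_without_uploader(example))  # prints: []
```

To run it against a real file, replace the embedded string with `json.load(open(path))` for whatever path your config lives at.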
Again, if you put it under Media, you can use google:/Media instead of protected:/Documents. So far it works pretty well. I can add files to the consume folder, which can be local on the box or, if you use google drive, a folder in google drive. Just drop a file in there and it gets consumed. I've noticed it sometimes gets stuck, so I'll restart the service and it works OK after that. Loading documents is a bit slow when loading thumbnails, but not too bad. Tagging is also a bit laggy, but that's no big deal for me.
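The one thing that doesn't seem to work is filesystem stats for consuming, i.e. picking up new files in the consume folder automatically. I just added `PAPERLESS_CONSUMER_POLLING` so the folder is polled instead, and it works fine.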
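For anyone wondering what polling means here, the rough idea is to list the folder on a timer and diff it against the previous listing, rather than waiting for the filesystem to announce changes (which presumably never arrive on a mounted remote). This is a minimal Python sketch of that idea, not Paperless-ngx's actual consumer; `poll_once` is a hypothetical helper and the temporary directory stands in for the consume folder.

```python
import os
import tempfile


def poll_once(directory: str, seen: set) -> list:
    """Return the files that appeared in `directory` since the previous call."""
    current = {entry.name for entry in os.scandir(directory) if entry.is_file()}
    new_files = sorted(current - seen)
    # Remember the current listing for the next poll.
    seen.clear()
    seen.update(current)
    return new_files


# Demo against a temporary directory standing in for the consume folder.
with tempfile.TemporaryDirectory() as consume_dir:
    seen = set()
    print(poll_once(consume_dir, seen))  # prints: []
    with open(os.path.join(consume_dir, "scan.pdf"), "wb") as handle:
        handle.write(b"%PDF-1.4")
    print(poll_once(consume_dir, seen))  # prints: ['scan.pdf']
```

A real consumer would call something like this in a loop and sleep between calls; the "5" in the config above is that interval in seconds.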
If you want to know how I have the encrypted drive set up, I can go through that as well. The short version is that I set it up manually with rclone: I copied the rclone systemd file to mount it and copied the mergerfs systemd file to create the unionfs directory.
@maximuskowalski Would you like the google bits included in docs?
Pull #117
In general, I am aiming to add the minimum amount of information needed to get the role installed and to supply links to the official documentation if it exists. If there is other information or a short story that I want to include, I might add a section after the more or less standard template (I think you picked up on one of my find-and-replace APPNAME mistakes :) ). Extra info and tips are great but not important. Usually when I am doing docs I have a bunch to do, so I'm just trying to get something in place.
A community-supported supercharged version of paperless: scan, index and archive all your physical documents.
Example docker compose: https://github.com/paperless-ngx/paperless-ngx/blob/main/docker/compose/docker-compose.postgres-tika.yml