atgreen / paperless

Emacs-assisted PDF document filing
132 stars, 11 forks

pdfgrep integration / profile selection #17

Open hokreb opened 5 months ago

hokreb commented 5 months ago

I have two ideas for new features:

1) Integration of pdfgrep: integrate search with pdfgrep (if installed) to search directly in the scan and root directories. I use the iPhone app QuickScan to digitize my documents. It runs OCR automatically, so most of my PDFs are text-searchable, which is quite nice when I search for an invoice from a particular company.

2) It would be cool if one could choose some kind of profile. For example, I use paperless for both company documents and private documents, so I could easily set up different default folders.

But nevertheless, really super useful tool!

atgreen commented 5 months ago

These are both interesting ideas.

  1. For pdfgrep, maybe you hit a prefix key, then the command key (like 'd'), then enter a regexp, and it will run that command (move directory) on all matching files? Or did you have something else in mind?

  2. How do you imagine this working, exactly?

Thanks for the feedback. Even though I wrote it myself, I had forgotten how useful it was until I tried it again recently!


hokreb commented 5 months ago

Oh, I haven't used prefix commands until now; I will check how they work. The main idea was just to type a shortcut, then enter the search string, and then pdfgrep would be run in both directories (scan and root). But maybe your solution of invoking this with prefix commands is more than enough. I will try this first.
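
A rough sketch of the shortcut-plus-search-string idea, for illustration only. Nothing here is part of paperless: the names `my/paperless-pdfgrep-command` and `my/paperless-pdfgrep` are hypothetical. The sketch assumes that `pdfgrep` is on the PATH, that `paperless-capture-directory` and `paperless-root-directory` are already set to directory names, and that showing the hits through Emacs's built-in `grep` command is acceptable.

```elisp
(require 'grep)

;; Declared here only so the sketch loads without paperless;
;; the package itself defines these two variables.
(defvar paperless-capture-directory)
(defvar paperless-root-directory)

(defun my/paperless-pdfgrep-command (regexp directories)
  "Return a pdfgrep shell command that searches DIRECTORIES for REGEXP.
Hypothetical helper, not part of paperless."
  ;; -H: print file name, -n: print page number,
  ;; -r: recurse into directories, -i: ignore case.
  (concat "pdfgrep -H -n -r -i "
          (shell-quote-argument regexp)
          " "
          (mapconcat (lambda (dir)
                       (shell-quote-argument (expand-file-name dir)))
                     directories
                     " ")))

(defun my/paperless-pdfgrep (regexp)
  "Search the paperless capture and root directories for REGEXP.
Hypothetical command, not part of paperless."
  (interactive "sSearch PDFs for (regexp): ")
  (unless (executable-find "pdfgrep")
    (user-error "pdfgrep is not installed"))
  ;; `grep' runs the command and lists the hits in a *grep* buffer.
  (grep (my/paperless-pdfgrep-command
         regexp
         (list paperless-capture-directory paperless-root-directory))))
```

After evaluating this, `M-x my/paperless-pdfgrep` prompts for the search string. Note that the number shown next to each hit is the PDF page number reported by pdfgrep, not a line number.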

Regarding the profile, I can imagine defining a list in which each entry has a string as the profile name (e.g. company or private) and a list with the scan directory and the root directory. With a shortcut one can select the currently active profile, and the directories are updated instantly. Here is a primitive idea of how such a variable could look: (setq profiles '(("private" ("scan1" "root1")) ("company" ("scan2" "root2"))))

Your package is super useful. I was trying to find a workflow for storing documents I receive by regular mail and found paperless-ngx as an open source package. I think that is also a very good solution, but too big for me to maintain; I am sure that in half a year I would no longer know how to set it up. And I only need to search for documents by the text that should be in the PDF. With your package this is super easy: Emacs is my day-to-day workhorse, and all I need is to search for keywords in the scanned PDF bills and documents. That's really perfect!

hokreb commented 3 months ago

Hi, regarding the profile I have a code suggestion. I added a key binding, p, to switch profiles, and defined a nested list in which each entry contains a profile name and a list of directories. Pressing p lets one choose among the existing profiles, changes the variables paperless-capture-directory and paperless-root-directory, and rescans the directory structure (like the g key). Setting paperless-profile-active sets the default selected profile. I added this via emacs-startup-hook, but I don't think this is the right way to do it. If you find it useful, feel free to add it to your package. It doesn't interfere with existing setups, as it is only used if paperless-initialize-profile is called.

I implemented it by adding the following code:

```elisp
;; Evaluate this after paperless has been loaded, so that
;; `paperless-mode-map' and `paperless-scan-directories' are defined.

;; Each entry has the form (NAME (ROOT-DIRECTORY CAPTURE-DIRECTORY)).
(setq paperless-profile-list
      '(("private" ("/home/user/private/doc-store" "/home/user/private/scans"))
        ("office" ("/home/user/office/doc-store" "/home/user/office/scans"))))

(defvar paperless-profile-active nil
  "The active paperless profile.
Set this variable to the name of the profile you want to start with.")

(defun paperless-set-profile (profile)
  "Set the paperless profile based on the given PROFILE."
  (let* ((profile-data (assoc profile paperless-profile-list))
         ;; The second element of an entry is the list of directories.
         (directories (cadr profile-data)))
    (unless profile-data
      (error "Profile %s not found" profile))
    (setq paperless-root-directory (nth 0 directories)
          paperless-capture-directory (nth 1 directories))
    (setq paperless-profile-active profile) ; Update the active profile variable.
    (paperless-scan-directories)
    (message "Switched to profile: %s" profile)))

(defun paperless-choose-profile ()
  "Choose a paperless profile from a list and set it."
  (interactive)
  (let* ((profile-names (mapcar #'car paperless-profile-list))
         (profile (completing-read "Choose paperless profile: " profile-names nil t)))
    (paperless-set-profile profile)))

(define-key paperless-mode-map (kbd "p") #'paperless-choose-profile)

(defun paperless-initialize-profile ()
  "Initialize the paperless profile based on `paperless-profile-active'."
  (when paperless-profile-active
    (paperless-set-profile paperless-profile-active)))

;; Assuming your Emacs is already set up for paperless, you can add this line:
(add-hook 'emacs-startup-hook #'paperless-initialize-profile)
```