pwmt / zathura

Document viewer
https://pwmt.org/projects/zathura
zlib License

[FR] Expose index of the current document through dbus-interface.c #399

Closed by sebastinas 5 months ago

sebastinas commented 1 year ago

On GitLab by @rmorales on Jun 22, 2023, 16:26


Expose table of contents through dbus

By calling the functions defined in dbus-interface.c, I have been able to write an Elisp function (see below) that reuses Zathura instances when opening Org Mode links of the form [[file:<<filename>>::<<page number>>]], i.e. if a running Zathura instance already has the desired document open, that instance is used instead of launching a new one.

(require 'cl-lib)
(require 'dbus)
(require 'proced)

(defun my/dbus-zathura-use-instance-or-open-new (file link)
  "Launch zathura to open FILE or reuse a running instance.

LINK is provided by Org Mode when this function has been
specified in `org-file-apps'. This function is intended to be
used in `org-file-apps' and shouldn't be called on its own."
  (let* ((process-alist (proced-process-attributes))
         (pids (mapcar 'car (proced-filter process-alist '((comm . "zathura")))))
         (pid
          (catch 'found
            (cl-loop for pid in pids
                     when (equal
                           (dbus-get-property
                            :session
                            (concat "org.pwmt.zathura.PID-" (number-to-string pid))
                            "/org/pwmt/zathura"
                            "org.pwmt.zathura"
                            "filename")
                           file)
                     do (throw 'found pid))))
         (page-number
          (or
           ;; FIXME: According to the docstring of org-file-apps, we
           ;; can access capture groups using (match-string n
           ;; link). However, when I tried it, the capture groups
           ;; didn't have the entire content. For this reason, I'm
           ;; running string-match with the same regex that I used
           ;; in org-file-apps, but (match-string n link) should
           ;; work as specified in the Org Mode documentation.
           ;; Only read the capture group when the match succeeded;
           ;; otherwise fall through to the default page "1".
           (when (string-match "\\.\\(pdf\\|djvu\\)::\\([0-9]+\\)\\'" link)
             (match-string 2 link))
           "1")))
    (if pid
        ;; Jump to that page in the existing instance
        (dbus-call-method
         :session
         (concat "org.pwmt.zathura.PID-" (number-to-string pid))
         "/org/pwmt/zathura"
         "org.pwmt.zathura"
         "GotoPage"
         (1- (string-to-number page-number)))
      ;; Open a new instance
      (make-process
       :name "zathura"
       :buffer nil
       :command `("zathura" "--page" ,page-number ,file)))))

(setq org-file-apps
      '((auto-mode . emacs)
        (directory . emacs)
        ("\\.png\\'" . "mpv %s")
        ("\\.\\(pdf\\|djvu\\)\\'" . my/dbus-zathura-use-instance-or-open-new)
        ("\\.\\(pdf\\|djvu\\)::\\([0-9]+\\)\\'" . my/dbus-zathura-use-instance-or-open-new)))

Now I would like to write an Emacs function that prompts for an item in the document's table of contents and jumps to the corresponding page. Currently, dbus-interface.c exposes the page number and the filename, and provides a method for jumping to a specific page (the one I used in the Elisp function above). However, the table of contents of the current document is not exposed. I therefore propose exposing the document's index in some way.

This feature would benefit not only Emacs users but also users of any completion framework (e.g. rofi, dmenu, fzf, vim), since these specialized tools already offer features (e.g. regex search, movement with operators, fuzzy finding) that, I believe, would take significant effort to add to the zathura index viewer.
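For tools outside Emacs, the jump that the Elisp function performs can be issued from a script. The following is a minimal sketch, not part of zathura: it assumes the `gdbus` command-line tool from GLib is installed, reuses the bus name, object path, interface and `GotoPage` method shown in the Elisp above, and assumes (as the `(1- ...)` in the Elisp suggests) that `GotoPage` expects a zero-based page. The helper names `goto_page_command` and `goto_page` are hypothetical.

```python
import subprocess

# Values taken from the Elisp function above.
BUS_PREFIX = "org.pwmt.zathura.PID-"
OBJECT_PATH = "/org/pwmt/zathura"
INTERFACE = "org.pwmt.zathura"


def goto_page_command(pid: int, page: int) -> list[str]:
    """Build the gdbus argv that jumps the zathura instance PID to 1-based PAGE."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return [
        "gdbus", "call", "--session",
        "--dest", f"{BUS_PREFIX}{pid}",
        "--object-path", OBJECT_PATH,
        "--method", f"{INTERFACE}.GotoPage",
        # Assumption: GotoPage is zero-based, mirroring (1- ...) in the Elisp.
        str(page - 1),
    ]


def goto_page(pid: int, page: int) -> None:
    """Run the command; raises CalledProcessError if the D-Bus call fails."""
    subprocess.run(goto_page_command(pid, page), check=True)
```

A chooser script could call `goto_page(pid, page)` once the user has picked an entry; building the argv separately keeps the command easy to inspect without a running zathura instance.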

sebastinas commented 1 year ago

On GitLab by @rmorales on Jun 22, 2023, 16:28


I believe one way to do this would be to expose the table of contents as JSON. Please see the examples below.

[Screenshot 1: 2023_06_22_09_28_21_646511683_-05]

[
  {
    "name": "foo a",
    "page_number": 1
  },
  {
    "name": "foo b",
    "page_number": 10
  },
  {
    "name": "foo c",
    "page_number": 20
  }
]

[Screenshot 2: 2023_06_22_09_23_59_403462891_-05]

[
  {
    "name": "foo a",
    "page_number": 1,
    "children": [
      {
        "name": "foo b",
        "page_number": 5
      },
      {
        "name": "foo c",
        "page_number": 10
      }
    ]
  },
  {
    "name": "foo d",
    "page_number": 11,
    "children": [
      {
        "name": "foo e",
        "page_number": 15
      },
      {
        "name": "foo f",
        "page_number": 20
      }
    ]
  }
]
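To illustrate how a completion front end might consume such data, here is a minimal sketch. It assumes the table of contents arrives as a JSON array of entries with `name`, `page_number` and an optional `children` array; this is only the shape proposed in this comment, not an existing zathura interface. The helper names `flatten_toc` and `menu_lines` are hypothetical.

```python
import json


def flatten_toc(entries, depth=0):
    """Yield (depth, name, page_number) for every entry, parents before children."""
    for entry in entries:
        yield depth, entry["name"], entry["page_number"]
        # "children" is optional; leaf entries simply omit it.
        yield from flatten_toc(entry.get("children", []), depth + 1)


def menu_lines(toc_json):
    """Return one 'page<TAB>indented title' line per entry, ready for a chooser."""
    toc = json.loads(toc_json)
    return [
        f"{page}\t{'  ' * depth}{name}"
        for depth, name, page in flatten_toc(toc)
    ]


if __name__ == "__main__":
    example = """
    [{"name": "foo a", "page_number": 1,
      "children": [{"name": "foo b", "page_number": 5}]}]
    """
    print("\n".join(menu_lines(example)))
```

Putting the page number first means the selected line can be split on the first tab to recover the page to jump to, while the indentation preserves the nesting for the user.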

In case this is useful for developers, here are the commands I used to generate the PDFs shown in the two screenshots above.

PDF shown in the first screenshot:

labels=""
for i in $(seq 1 20); do labels="$labels label:a"; done
eval convert \
  -density 1000 \
  -font "$HOME/.fonts/DejaVuSerif.ttf" \
  "$labels" \
  input.pdf
cat << EOF > metadata.txt
[/Page 1 /Title (foo a) /OUT pdfmark
[/Page 10 /Title (foo b) /OUT pdfmark
[/Page 20 /Title (foo c) /OUT pdfmark
EOF
gs -o output.pdf -sDEVICE=pdfwrite input.pdf metadata.txt

PDF shown in the second screenshot:

labels=""
for i in $(seq 1 20); do labels="$labels label:a"; done
eval convert \
  -density 1000 \
  -font "$HOME/.fonts/DejaVuSerif.ttf" \
  "$labels" \
  input.pdf
cat << EOF > metadata.txt
[/Page 1 /Title (foo a) /Count 2 /OUT pdfmark
[/Page 5 /Title (foo b) /OUT pdfmark
[/Page 10 /Title (foo c) /OUT pdfmark
[/Page 11 /Title (foo d) /Count 2 /OUT pdfmark
[/Page 15 /Title (foo e) /OUT pdfmark
[/Page 20 /Title (foo f) /OUT pdfmark
EOF
gs -o output.pdf -sDEVICE=pdfwrite input.pdf metadata.txt

sebastinas commented 9 months ago

mentioned in commit 3a3e03999ad313cf86a6d4c653eafdc1e5ae358b