protesilaos / denote

Simple notes for Emacs with an efficient file-naming scheme
https://protesilaos.com/emacs/denote
GNU General Public License v3.0

Using groups for matching regex and other hacks #336

Open haji-ali opened 7 months ago

haji-ali commented 7 months ago

Apologies if this has been raised before. My limited search did not turn anything up.

I am trying to introduce a custom file type for denote (for tex files), and I found the hackability of denote somewhat limiting. In particular, denote searches for titles and tags by looking for a line starting with title-key-regexp and keywords-key-regexp, respectively, and then using the rest of the line as the value. In a tex file, I want my front matter to have a custom form that does not follow a key/value pairing (this is easy to achieve in TeX). I think a more general way to do this would be to use match groups in the title and keywords regex and use those to extract such values.

On the opposite end, front-matter imposes an order on the fields, which is not always desirable. Is there a reason why denote doesn't implement a more flexible text-replacement strategy? For example, replacing %(title) with the title wherever it appears (AUCTeX has similar functionality).

jeanphilippegg commented 7 months ago

Denote is an evolving project. With regard to the reordering, you can check this section in the manual. You can use the "%1$s", "%2$s", etc. syntax to reorder the elements of the front matter.
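As a minimal sketch of the reordering mentioned above: Emacs's built-in `format` accepts field numbers such as `%1$s`, so a front-matter template can pick arguments in any order and skip the ones it does not need. The function name `my/demo-front-matter` is hypothetical, and the argument order (title, date, keywords, id) is an assumption that matches the `%3$s`/`%1$s` usage later in this thread.

```elisp
;; Hypothetical helper, not part of Denote.  The argument order
;; (title, date, keywords, id) is assumed for illustration.
(defun my/demo-front-matter (title date keywords id)
  "Return Org front matter with keywords first and no date line."
  ;; %3$s picks the third argument (KEYWORDS), %1$s the first (TITLE),
  ;; %4$s the fourth (ID).  DATE is passed but never referenced.
  (format "#+filetags:   %3$s\n#+title:      %1$s\n#+identifier: %4$s\n\n"
          title date keywords id))

(my/demo-front-matter "My first note"
                      "[2024-05-01 Wed 12:09]"
                      ":emacs:denote:"
                      "20240501T120958")
```

Arguments that the template never references are simply ignored, which is why a field can be dropped without changing the call.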

I agree that the front-matter specification could benefit from more generality. In some of my experimental code, I have this:

(defvar denote-file-types
  `((org
     :extension ".org"
     :front-matter-start ""
     :front-matter-end "\n"
     :front-matter-components
     ((title
       :line-format "#+title:      <value>\n"
       :value-set-function identity
       :value-get-function denote-trim-whitespace)
      (date
       :line-format "#+date:       <value>\n"
       :value-set-function denote-date-org-timestamp
       :value-get-function denote-trim-whitespace)
      (keywords
       :line-format "#+filetags:   <value>\n"
       :value-set-function denote-format-keywords-for-org-front-matter
       :value-get-function denote-extract-keywords-from-front-matter))
     :link-format ,denote-org-link-format
     :link-in-context-regexp ,denote-org-link-in-context-regexp)))

This is not perfect and it is just an experiment, but it gives an idea of the kind of generality that is achievable in theory. You can see the use of "<value>" instead of a regexp or "%s", and the order would simply be derived from the order of specification.
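To make the idea concrete, here is a rough sketch of how a component list with "<value>" placeholders could be turned into front matter, and how a matching regexp could be derived from the same line format. Both functions (`my/render-front-matter` and `my/line-format-to-regexp`) are hypothetical and exist neither in Denote nor in the experimental code above; they only illustrate the approach.

```elisp
(require 'subr-x) ; `string-trim-right' on older Emacs versions

;; Hypothetical: render COMPONENTS in their listed order.
(defun my/render-front-matter (components values)
  "Render COMPONENTS using VALUES, an alist of (NAME . STRING)."
  (mapconcat
   (lambda (component)
     (let* ((name (car component))
            (plist (cdr component))
            (setter (or (plist-get plist :value-set-function) #'identity))
            (value (funcall setter (alist-get name values))))
       ;; Replace the literal placeholder; the last two arguments keep
       ;; case and treat VALUE as literal text.
       (replace-regexp-in-string "<value>" value
                                 (plist-get plist :line-format) t t)))
   components ""))

;; Hypothetical: derive a regexp whose group 1 captures the value.
(defun my/line-format-to-regexp (line-format)
  "Return a regexp matching a line written with LINE-FORMAT."
  (let ((prefix (car (split-string line-format "<value>"))))
    (concat "^" (regexp-quote (string-trim-right prefix))
            "[ \t]*\\(.*\\)$")))
```

With this shape, the order of the front matter follows the order of the component list, and reading a value back does not need a separately maintained key regexp.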

However, I don't believe that there are plans to work on this part of the code for the next version of Denote (3.0.0). The next version should be more about the generalization of the file naming scheme with its creation and renaming commands with a few other specific features.

Do you have an example of the kind of front-matter that you would like to have in .tex files? This would give a better idea of what needs to be changed.

haji-ali commented 7 months ago

Sure, the case I have is that in TeX I want to define a command for a new denote entry as such

\newentry[TAGS]{TITLE}

for example

\newentry[emacs, denote]{My first
note with a newline}

Note that adding any \n in the title should be filtered out in title-value-reverse-function and replaced with a single space (this is how TeX would render such text).

I didn't know about the syntax for ordering fields in the front matter, so now I can achieve this with :front-matter as follows

\newentry[%3$s]{%1$s}

However, the parsing would still fail. I was hoping I could specify a regex along the lines of (modulo escaping some characters)

\newentry[(?2:[^\\]*)]{(?1:[^}])}

to extract the title and the tags.
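For reference, a fully escaped Emacs Lisp version of this idea might look like the sketch below. It uses explicitly numbered groups (`\(?1:...\)`, `\(?2:...\)`), excludes `]` from the tags and `}` from the title, adds the `*` quantifier to the title group, and collapses newlines in the title to a single space as described earlier. The names `my/newentry-regexp` and `my/parse-newentry` are hypothetical; Denote does not currently read match groups this way.

```elisp
;; Hypothetical regexp for \newentry[TAGS]{TITLE}.
;; Group 1 is the title, group 2 the comma-separated tags.
(defconst my/newentry-regexp
  "\\\\newentry\\[\\(?2:[^]]*\\)\\]{\\(?1:[^}]*\\)}")

(defun my/parse-newentry (string)
  "Return a plist with :title and :keywords parsed from STRING, or nil."
  (when (string-match my/newentry-regexp string)
    (let ((title (match-string 1 string))
          (tags (match-string 2 string)))
      (list :title
            ;; A newline in the title becomes one space, as TeX renders it.
            (replace-regexp-in-string "[ \t]*\n[ \t]*" " " title)
            :keywords
            ;; Split on commas, dropping empty items and surrounding space.
            (split-string tags "," t "[ \t\n]+")))))

(my/parse-newentry "\\newentry[emacs, denote]{My first\nnote with a newline}")
;; => (:title "My first note with a newline" :keywords ("emacs" "denote"))
```

Because a negated character class also matches newlines in Emacs regexps, the title group spans line breaks without extra work.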

Note that I am currently able to hack my way around this limitation by setting both title-key-regexp and keywords-key-regexp to ^\\newentry and doing the regex matching in the *-reverse-* functions, though this would still not allow newlines in the title.
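A sketch of what such reverse functions could look like follows. It assumes, as described at the top of this issue, that each function receives the rest of the line after the key regexp, here something like `[emacs, denote]{My title}`. The function names are hypothetical, and how they are registered in a file-type entry is left out.

```elisp
(require 'subr-x) ; `string-trim' on older Emacs versions

;; Hypothetical reverse function for the title.
(defun my/newentry-title-reverse (rest-of-line)
  "Extract the text between braces from REST-OF-LINE."
  (when (string-match "{\\([^}]*\\)}" rest-of-line)
    (string-trim (match-string 1 rest-of-line))))

;; Hypothetical reverse function for the keywords.
(defun my/newentry-keywords-reverse (rest-of-line)
  "Extract the bracketed, comma-separated tags from REST-OF-LINE."
  (when (string-match "\\`[ \t]*\\[\\([^]]*\\)\\]" rest-of-line)
    (split-string (match-string 1 rest-of-line) "," t "[ \t]+")))
```

Since only a single line reaches these functions, a title that continues on the next line is cut off, which is the limitation noted above.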

As an aside, is there any reason to include id, date and tags in the front matter (since this info is already in the filename)? I see from the link you provided that removing id and date is sanctioned with an example (though I am unsure if this is best practice) but it seems that denote expects the tags to be in the front matter in any case.

protesilaos commented 7 months ago

From: Abdul-Lateef Haji-Ali @.***> Date: Wed, 1 May 2024 02:09:58 -0700

[... 28 lines elided]

> As an aside, is there any reason to include id, date and tags in the front matter (since this info is already in the filename)? I see from the link you provided that removing id and date is sanctioned with an example (though I am unsure if this is best practice) but it seems that denote expects the tags to be in the front matter in any case.

Denote works fine without front matter. The reason we add this data is because other programs can make use of it. For example, Org uses the #+filetags as tags for each heading inside the file. An HTML export backend can, for instance, generate tag pages and feeds based on this.

Taking a step back, I wonder if it makes sense to use an abnormal hook at the end of the 'denote' function. This hook would run its functions by passing to them the arguments that the 'denote' function evaluates. In other words, it would be exposing the processed values for the title, keywords, etc. as well as any other piece of data we may consider useful. So the user can then have a completely custom mechanism for what they do after a note is created.

-- Protesilaos Stavrou https://protesilaos.com
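The abnormal hook idea can be illustrated with Emacs's built-in `run-hook-with-args`. This is only a sketch of the mechanism: the hook variable, the stand-in creation function, and the argument list (title, keywords, id) are all hypothetical and say nothing about what Denote would actually expose.

```elisp
;; Hypothetical abnormal hook: each function receives TITLE, KEYWORDS, ID.
(defvar my/denote-after-new-note-functions nil
  "Functions called with the processed values after a note is created.")

;; Stand-in for the real `denote' command, for illustration only.
(defun my/denote-demo-create (title keywords)
  "Pretend to create a note, then run the hook with the final values."
  (let ((id (format-time-string "%Y%m%dT%H%M%S")))
    ;; ... the note file would be written here ...
    (run-hook-with-args 'my/denote-after-new-note-functions
                        title keywords id)
    id))

;; A user-side function that could, for example, insert custom TeX
;; front matter instead of merely logging.
(defun my/log-new-note (title keywords id)
  "Report the values received from the hook."
  (message "Created %s: %s %S" id title keywords))

(add-hook 'my/denote-after-new-note-functions #'my/log-new-note)
```

With a hook of this kind, a user who wants `\newentry[TAGS]{TITLE}` could generate it themselves from the processed values rather than going through the front-matter template.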