bastibe / org-journal

A simple org-mode based journaling mode
BSD 3-Clause "New" or "Revised" License
1.24k stars 124 forks source link

org-journal-list-dates fails if org-journal-file-format uses subdirectories #116

Closed TinaRussell closed 6 years ago

TinaRussell commented 6 years ago

I’ve realized it can be useful to use subdirectories in the org-journal-file-format option, like this: %Y/%m/%Y-%m-%d (for “year/month/year-month-day”) This way, journal entries won’t be all in one flat directory. It mostly works, but the main hiccup is here:

(defun org-journal-list-dates ()
  "Loads the list of files in the journal directory, and converts
  it into a list of calendar DATE elements"
  (org-journal-dir-check-or-create)
  (mapcar #'org-journal-file-name->calendar-date
          (directory-files org-journal-dir nil org-journal-file-pattern nil)))

directory-files only retrieves files in the base directory—it’s not recursive. org-journal-list-dates is used for things like marking journal entries in the calendar, and carrying over TODO items, so at present those don’t work if you have journal entries in subdirectories. I hoped to fix this by changing directory-files to directory-files-recursively (and removing the two nil arguments, which would no longer apply), but that still returns nil for me because directory-files-recursively only checks the regexp (generated from org-journal-file-path) against base file names, not full file names. So, directory-files-recursively will see a file name like 2018-06-23, check it against a regexp that’s looking for something like 2018/06/2018-06-23, and so it ignores the file and moves on. I’m not sure what the best way would be to fix this, so I’m documenting it here.

bastibe commented 6 years ago

If I understand this correctly, directory-files-recursively does the right thing, but org-journal-file-name->calendar-date only looks at the file name, right? So we would probably need to modify org-journal-file-name->calendar-date as well, and have it look at more than the file name if the pattern contains a /, or something like that.

Would you like to take a stab at this and create a pull request? I'll help you if you have questions.

TinaRussell commented 6 years ago

Thanks! I decided the simplest way would probably be to have org-journal-list-dates check for a path separator in org-journal-files-pattern, and if present, do a different routine that gets the file list from directory-files-recursive and then check it against the regexp. (This is because directory-files-recursive returns a list of absolute file names, but checks the regexp against relative file names, and there doesn’t seem to be a simple way of changing that behavior.) I got pretty far in this:

(defun org-journal-list-dates ()
  "Loads the list of files in the journal directory, and converts
  it into a list of calendar DATE elements"
  (org-journal-dir-check-or-create)
  ;; check if org-journal-file-pattern contains a path separator.
  (if (if (or (eq system-type 'ms-dos) (eq system-type 'windows-nt))
          ;; if we’re on DOS or Windows, check for two potential path
          ;; separators: the Microsoft-standard backslash, or the
          ;; Unix standard forward slash.
          (or (seq-contains org-journal-file-pattern 92)  ; 92 is \
              (seq-contains org-journal-file-pattern 47)) ; 47 is /
          ;; otherwise, just look for a forward slash
          (seq-contains org-journal-file-pattern 47))
      ;; grab the file list. We can’t use directory-files-recursively’s
      ;; regexp facility to filter it, because that only checks the
      ;; regexp against the files' relative names, and we need to check
      ;; the regexp against the absolute file names.
      (mapcar #'org-journal-file-name->calendar-date
              (directory-files-recursively org-journal-dir "\.*"))
      ;; if org-journal-file-pattern has no path separator, use the
      ;; old, simpler method of gathering the journal files
      (mapcar #'org-journal-file-name->calendar-date
              (directory-files org-journal-dir nil org-journal-file-pattern nil))))

The regexp argument to directory-files-recursively seems to be required, so I figured I’d put a regexp there (".*") that matches everything, and then enclose the whole directory-files-recursively call in a function that filters out items from a list based on a regexp. …But, I realized I don’t actually know how to do that. The usual seq-filter and cl-remove-if/cl-remove-if-not functions seem to require being sent a predicate function that takes a single argument, so predicates that test for a regexp are out, it seems (unless there’s some kinda special syntax to do this). (I suppose I could make a lambda function or some kind of locally-defined function to use as a predicate, but I don’t know what would be the best way to do this or how it would affect performance.) My brain is tired, now :wink:

TinaRussell commented 6 years ago

This seems to work:

(defun org-journal-list-dates ()
  "Loads the list of files in the journal directory, and converts
  it into a list of calendar DATE elements"
  (org-journal-dir-check-or-create)
  ;; check if org-journal-file-pattern contains a path separator.
  (if (if (or (eq system-type 'ms-dos) (eq system-type 'windows-nt))
          ;; if we’re on DOS or Windows, check for two potential path
          ;; separators: the Microsoft-standard backslash, or the
          ;; Unix standard forward slash.
          (or (seq-contains org-journal-file-pattern 92)  ; 92 is \
              (seq-contains org-journal-file-pattern 47)) ; 47 is /
        ;; otherwise, just look for a forward slash
        (seq-contains org-journal-file-pattern 47))
      ;; grab the file list. We can’t use directory-files-recursively’s
      ;; regexp facility to filter it, because that only checks the
      ;; regexp against the files' relative names, and we need to check
      ;; the regexp against the absolute file names.
      (mapcar #'org-journal-file-name->calendar-date
              (seq-filter (lambda (file-path)
                            (string-match-p org-journal-file-pattern
                                            (file-relative-name
                                             file-path
                                             org-journal-dir)))
                          (directory-files-recursively org-journal-dir "\.*")))
    ;; if org-journal-file-pattern has no path separator, use the
    ;; old, simpler method of gathering the journal files
    (mapcar #'org-journal-file-name->calendar-date
            (directory-files org-journal-dir nil org-journal-file-pattern nil))))

I’ve made the commit on my branch “allow-journal-subdirs.” Let me know if you want anything different before I make a pull-request.

bastibe commented 6 years ago

Cool! This will be a useful feature!

A few things:

TinaRussell commented 6 years ago

Well, the idea with the directory separator was that it’s testing whether or not org-journal-file-pattern contains a path separator—I figured that even if Elisp functions return Unix-style paths, a Windows user might use backslashes when setting org-journal-file-format (and thus org-journal-file-pattern). But, it’s a moot point anyway, if you think it would be better just to use my external filtering mechanism outright.

I get your comments about code readability! I think I was trying to grasp the underlying Zen of functional programming, nesting functions into functions forever, so it’s a good reminder that doctrinal purity needs to bend now and then to the needs of on-screen practicality. (And, everything was getting crazy indented up to the right margin…)

bastibe commented 6 years ago

a Windows user might use backslashes when setting org-journal-file-format

True, I hadn't though of that.

However, we would need to document that the pattern needs to use forward slashes instead of backslashes (or we just swap them out automatically).

I, too, have gone back and forth between functional purity and a more procedural style of programming. But at the end of the day, I just find those deeply-nested knots of parenthesis hard to parse. It helps to abstract stuff out into separate functions, but that's just a let in disguise, too. For elisp, I have gone back to a rather procedural flow, with variable declarations at the top, and frequent setq instead of recursive functions.

TinaRussell commented 6 years ago

How’s this?

(defun org-journal-list-dates ()
  "Loads the list of files in the journal directory, and converts
  it into a list of calendar DATE elements"
  (org-journal-dir-check-or-create)
  ;; grab the file list. We can’t use directory-files-recursively’s
  ;; regexp facility to filter it, because that only checks the
  ;; regexp against the base filenames, and we need to check it
  ;; against filenames relative to org-journal-dir.
  (let ((file-list (directory-files-recursively org-journal-dir "\.*"))
        (predicate (lambda (file-path)
                     (string-match-p
                      org-journal-file-pattern
                      (file-relative-name file-path org-journal-dir)))))
    (mapcar #'org-journal-file-name->calendar-date
            (seq-filter predicate file-list))))
TinaRussell commented 6 years ago

Okay, realized there’s a problem there: it’s passing absolute file names to org-journal-file-name->calendar-date, which expects relative file names. Because of that, it returns a long list of lists that are nothing but three zeroes each. So, maybe it should convert all the file names to be relative to org-journal-dir, then filter out the ones that don’t match org-journal-files-pattern, and then pass the resulting list to org-journal-file-name->calendar-date. I’ll have to figure this out later. Also, I went through the code and changed a bunch of instances of something like (file-name-nondirectory FILE-NAME-VAR) to something like (file-relative-name FILE-NAME-VAR org-journal-dir), in every instance where org-journal needs to use a filename to extract a date. That way it will work with subdirectories.

bastibe commented 6 years ago

Very nice! If you want to, you can open a pull request, and we can continue the discussion there. That way, I will be able to see your changes as you make them!

TinaRussell commented 6 years ago

Okay! BTW, I think I’ve fixed it:

(defun org-journal-list-dates ()
  "Loads the list of files in the journal directory, and converts
  it into a list of calendar DATE elements"
  (org-journal-dir-check-or-create)
  ;; grab the file list. We can’t use directory-files-recursively’s
  ;; regexp facility to filter it, because that only checks the
  ;; regexp against the base filenames, and we need to check it
  ;; against filenames relative to org-journal-dir.
  (let ((file-list (directory-files-recursively org-journal-dir "\.*"))
        (get-relative-paths (lambda (file-path)
                              (file-relative-name file-path org-journal-dir)))
        (predicate (lambda (file-path)
                     (string-match-p org-journal-file-pattern file-path))))
    (mapcar #'org-journal-file-name->calendar-date
            (seq-filter predicate (seq-map get-relative-paths file-list)))))
bastibe commented 6 years ago

Wow, this is really neat! Thank you for taking the time to do this, and find a really nice solution!

In your pull request, I noticed that we are apparently doing (org-journal-file-name->calendar-date (file-relative-name FILENAME org-journal-dir)) like half a dozen times. It might be useful to wrap this construct in its own function.

Would you like to do this? Otherwise, I'd be happy to merge your pull request as-is as well!

TinaRussell commented 6 years ago

I think we have three options:

  1. Create such a wrapper function (call it something like org-journal-file-name-full->calendar-date)
  2. Modify org-journal-file-name->calendar-date such that it always takes full filenames
  3. Modify org-journal-format-string->regex such that it (and thus org-journal-file-pattern) always provides a regex that can be tested against absolute file names.

I think option 3 is best, since it would simplify a lot of the code. Of course, we’d have to make sure that the code to generate org-journal-file-pattern is run whenever the user sets org-journal-dir (like it is with org-journal-file-format), and also when updating from an older version. Not sure how/if that last one is possible, I don’t really know anything about Emacs Lisp packaging.

bastibe commented 6 years ago

I hadn't though of that! But option 3 sounds really good.

The way it works is that line 220 simply sets org-journal-file-pattern at startup. It is not a perfect solution, since it is possible to modify org-journal-file-format at runtime without executing its associated setter, but it's the least bad solution I could find. Anyway, all of this should work without modification with a modified org-journal-format-string->regex.

TinaRussell commented 6 years ago

Okay, I’ve made some more changes.

  1. org-journal-file-pattern now matches against absolute file names. Because of this, setting org-journal-dir via Customize will now update org-journal-file-pattern.
  2. I realized there were places I missed in the code (such as in the org-agenda stuff) where it uses directory-files to retrieve files from org-journal-dir, with the assumption that they are all in the base directory; this has been fixed, by abstracting the finding/filtering mechanism in org-journal-list-dates out into the new function org-journal-list-files, which is used everywhere that a list of journal files is needed.
  3. The construct (file-truename (expand-file-name (file-name-as-directory org-journal-dir))) was part of code I removed from org-journal-update-auto-mode-alist as no longer needed. But, it seemed important, so it’s used by org-journal-format-string->regex and org-journal-list-files when they need to interpret org-journal-dir.
  4. (org-journal-file-name->calendar-date (file-relative-name FILENAME org-journal-dir)) is now (org-journal-file-name->calendar-date FILENAME), everywhere.

I’ve checked it over thoroughly and it all seems to work, though I suppose it should be tested by someone who uses the org-agenda functionality before all this ends up in a stable version.

One last quandary for the sake of coding style: when I changed the setter for org-journal-dir to add (setq org-journal-file-pattern (org-journal-format-string->regex org-journal-file-format)), I realized it’s a little weird. The org-journal-format-string->regex function is only used to set org-journal-file-pattern, with that exact construct, every time… except in the setter for org-journal-file-format, where the argument is value (i.e. the value just assigned to org-journal-file-pattern). Surely either it should accept two variables, which would inevitably be org-journal-dir (or value, for that variable’s setter) and org-journal-file-format (or value, for that variable’s setter)—I guess that would be the “Common Lisp” way; or, the whole function should be replaced with something like org-journal-update-file-pattern which takes no arguments, and sets org-journal-file-pattern according to the values of org-journal-dir and org-journal-file-format. Heh, I feel like a l33t programmer now :princess:

bastibe commented 6 years ago

One last quandary for the sake of coding style

Yes, I agree! The emacs way is probably to have the function have no arguments at all--much as I usually dislike this style of coding, it has some merits in a dynamically-bound language such as elisp. Have you looked into advice in elisp? There is nothing quite like it in other programming languages, particularly because you can advice around a function with a local let-binding that then overrides the global constants for this one function. It is a super power if I ever saw one. Stuff like this really makes me love this little language, with all its quirks!

TinaRussell commented 6 years ago

Okay, made the change! Yep, I know about advice. What really drew me to Emacs is the ability to muck around with the internals like that! I’m picky about how my software functions.

TinaRussell commented 6 years ago

Uh oh, don’t merge it yet—I’m getting errors saying that org-journal-file-format is void (even though it shouldn’t be, since it’s defined in the source and in my init.el), and this error persists unless I evaluate the relevant defcustom explicitly. Weird…

TinaRussell commented 6 years ago

Hmmm, I’m trying to figure out how that bug happened. My guess is that it has something to do with either the order of the definitions, or which definitions have the magic autoload comment; what’s the rhyme/reason behind which definitions come first and which have the autoload comment? (This assumes I’m not barking up the wrong tree, of course!)

TinaRussell commented 6 years ago

I think the problem is solved, now; the problem was, as far as I can tell, that the setters for org-journal-dir and org-journal-file-format each use the other variable’s value, which is a problem when the other variable hasn’t been bound yet. So, I put in some let statements to assign a default value when the required variable is unbound. I’ve pushed the commit.

bastibe commented 6 years ago

This is awesome! Thank you very much for this contribution!

And beyond that, thank you very much for caring about the code and how it looks and works. You truly made org-journal a more approachable code base, all the while adding features. That is no mean feat!

TinaRussell commented 6 years ago

:princess: Awww, thanks! You’re very welcome!