emacs-citar / citar

Emacs package to quickly find and act on bibliographic references, and edit org, markdown, and latex academic documents.
GNU General Public License v3.0
501 stars 54 forks source link

file indicator incorrectly indicating #776

Open stefangroha opened 1 year ago

stefangroha commented 1 year ago

Describe the bug I use Spacemacs and Mendeley and I am trying to directly access the pdfs of files through citar, however when I try to set

(setq citar-file-parser-functions
  '(citar-file-parser-default
    citar-file-parser-triplet))

as described in the documentation, I get the following error:

citar--check-configuration: ‘citar-file-parser-functions’ should be a list of functions: '(citar-file-parser-default citar-file-parser-triplet)

If I remove this snippet I do not see the link to the file when clicking on citar links, however when trying to insert links it does show me that a file is there.

To Reproduce Steps to reproduce the behavior: My citar configuration is (except for the above snippet) is

;; citar
(use-package citar
    :no-require
    :custom
        (org-cite-global-bibliography '("~/library.bib"))
        (org-cite-insert-processor 'citar)
        (org-cite-follow-processor 'citar)
        (org-cite-activate-processor 'citar)
        (citar-bibliography org-cite-global-bibliography)
    :bind
         (:map org-mode-map :package org ("C-c b" . #'org-cite-insert)))

(use-package citar-org-roam
        :after (citar org-roam)
        :config (citar-org-roam-mode))

(use-package citar-embark
       :after citar embark
       :no-require
       :config (citar-embark-mode))
  1. insert citation
  2. click citation
  3. error message

Expected behavior A buffer should open with the choice for weblinks, notes and direct links.

Emacs version: 28.2

bdarcus commented 1 year ago

Something is weird. What you're trying to set is the default value?

stefangroha commented 1 year ago

Maybe I am confused by the documentation, I want to mainly set the triplet value, but I just copy and pasted this in case I want to use zotero as well.

bdarcus commented 1 year ago

You shouldn't have to set that; I never do, and just use the default.

The way the parsing code works is it runs through the list of functions until it gets a result. So there's no harm in having both default.

I mostly use Doom, but have a small experimental init over here, where it should all work correctly (see init-biblio.el in particular).

https://gitlab.com/bdarcus/emacs-elp

stefangroha commented 1 year ago

Without setting the variable I can see that there is a file associated with the bibliography (when trying to insert), however when clicking the citation I only see the weblink and option to create a note and no option to open the pdf. As the file field in the bib-file is set in Mendeley format (:/path/to/pdf:pdf), I thought this part of the documentation would help.

bdarcus commented 1 year ago

Let's start from the beginning ...

OK, I see a problem now :-)

So the parser code and at-point functionality is probably confusing the issue.

This is what I see:

If I do M-x citar-open, and then filter the candidates to those that should have files associated with them, I see:

image

If I select the first, I see this:

image

I think that's what you're seeing also?

So somehow a bug crept in somewhere.

Two possibilities, which I guess aren't mutually exclusive:

  1. the indicators are wrong, incorrectly indicating there are files, when there are not
  2. the functions that list the files in the second step is wrong, incorrectly omitting them

You are saying it's 2, but we'd want to confirm it is, and it's not 1, because it's different code.

bdarcus commented 1 year ago

Looking at the biblatex file for that entry, it has this as the value of file:

/home/bruce/Zotero/storage/WZB4FH9E/1405ba54-7f4e-11ea-8013-1b6da0e4a2b7_story.html

The reason that file is not showing up is because the file doesn't actually exist (long story)! So the problem there is 1 above.

But now I need to figure out why the indicator is wrong.

stefangroha commented 1 year ago

That is exactly what I am seeing, however for me in the biblatex file I do see a link to a pdf:

file = {:Users/smg/Library/Application Support/Mendeley Desktop/Downloaded/Heinz et al. - 2022 - Liver Colonization by Colorectal Cancer Metastases Requires YAP-Controlled Plasticity at the Micrometastatic Stage.pdf:pdf}

which indeed exists at this location.

bdarcus commented 1 year ago

Can you confirm the file is actually there, though, with that precise path? The code won't list it if it can't find it.

Not sure what OS you're on (Mac OS?), but this is what I see on Linux from the console:

❯ ls /home/bruce/Zotero/storage/WZB4FH9E/1405ba54-7f4e-11ea-8013-1b6da0e4a2b7_story.html
ls: cannot access '/home/bruce/Zotero/storage/WZB4FH9E/1405ba54-7f4e-11ea-8013-1b6da0e4a2b7_story.html': No such file or directory
stefangroha commented 1 year ago

Ah, thank you so much, the file path from the Mendeley export did not have the initial "/' (before Users), which was of course needed. It does work now, if I change the bib file to the correct link.

bdarcus commented 1 year ago

I'm going to reopen this, then, as in that case, the UI should not indicate it as present.

bdarcus commented 1 year ago

@roshanshariff - before I dig into fixing this, anything I should know?

Basically, right now, citar-has-files (well, really citar-file--has-file-field) does not check if a file exists, but the open commands do.

So one can get an the indicator for an entry say it has a file, but it actually doesn't.

Is that perhaps for performance reasons, or is this an actual bug/oversight?

I can modify that function something like this:

  (when citar-file-variable
    (lambda (citekey)
      (and (citar-file--parse-file-field
            (or (citar-get-value citar-file-variable citekey) "") nil) t))))

... but that would also require changes in citar-file--parse-file-field.

roshanshariff commented 1 year ago

@bdarcus, this was an intentional change for performance reasons, so that showing the indicator doesn't require parsing the file fields, checking if files exist, and so on. Since this function is called for every bibliography item every time you call any citar command, that would be prohibitively slow.

Instead, it just checks if each entry has a file field. I remember that we discussed this in the relevant PR, but can't find the link right now. I think we also agreed that it's a reasonable user experience, since the item is "supposed" to have an attached file, and if you get an error when you open it you can investigate why it's not there.

bdarcus commented 1 year ago

I thought that might be the case; sounds familiar.

Except the file isn't listed in the related resources UI, so you don't get the error; just a discrepancy between the two UI views.

So maybe we should list them even if in error?

And do not the files in directories get checked?

roshanshariff commented 1 year ago

So maybe we should list them even if in error?

Yes, I think all the files in the file field should be listed even if they don't exist. This is going to take some thinking, because currently the file field parser functions use the non-existence of the file to filter out bad parses (e.g. treating a triplet-style file field as a plain filename). But perhaps we can refine the parsing functions so we don't have to do this.

And do not the files in directories get checked?

Yes, for files that are found in the library path (i.e. not listed in the file field in a bibliography) the indicator does reflect whether the file actually exists.

bdarcus commented 1 year ago

Yes, for files that are found in the library path (i.e. not listed in the file field in a bibliography) the indicator does reflect whether the file actually exists.

Is that faster than splitting a string and checking the path(s)?

I guess the confusing part is also some files are checked differently than other files; that the indicator means different things depending.

roshanshariff commented 1 year ago

The difference is that, for library files, we can just scan the directory once and figure out which keys have files, because each filename starts with the key. Not having a file for a key is not an error; it just means you don't have an attached file. On the other hand, explicitly listing a non-existent file in the bibliography seems like a data error. Perhaps we could handle it by having a linting function that checks for non-existent files and offers to remove them, which you would run when needed?

From a performance point of view, the files listed in file fields needn't be in any particular location, so we can't scan a directory beforehand. So we take the existence of a file field as a signal that the item has attached files.

bdarcus commented 1 year ago

On the other hand, explicitly listing a non-existent file in the bibliography seems like a data error. Perhaps we could handle it by having a linting function that checks for non-existent files and offers to remove them, which you would run when needed?

Makes sense.

I was also wondering about using faces to distinguish files that are confirmed to exist, and those that are confirmed to not exist, or unconfirmed?

But maybe that gets too complicated.

roshanshariff commented 1 year ago

If I remember correctly, annotation and affixation functions are only called for visible candidates, not all candidates. So you could do (relatively) expensive work in those functions. But they can only change the visible representation of each candidate (like icons and other prefixes/suffixes, but not faces). They can't modify the actual candidate text like the searchable has:files tag.

And you're right, doing all this adds complexity which we'd have to weigh against the usefulness of the feature.

bdarcus commented 1 year ago

Yes, that also makes sense. Annotations or affixations might be appropriate for that case.