emacs-citar / citar

Emacs package to quickly find and act on bibliographic references, and edit org, markdown, and latex academic documents.
GNU General Public License v3.0

Citar fails to display files containing commas #819

Open wenzlawski opened 4 months ago

wenzlawski commented 4 months ago

Describe the bug: Citar fails to find files whose names contain commas when offering them in completing-read. Strangely, everything works fine when only one file is associated with an entry. However, when two files are present and one has to be selected for display, it shows the following error:

None of the files for ‘MarkFisher1272’ exist; check ‘citar-library-paths’ and ‘citar-file-parser-functions’

To Reproduce

Steps to reproduce the behavior:

  1. Make a .bib file with an entry whose file field lists two files with commas in their filenames
  2. Load the .bib file in citar
  3. Call citar-open and select the entry
  4. None of the files appear

Alternatively, call citar-open-files, select the entry and the above error message appears.

Emacs version: 29.2

bdarcus commented 4 months ago

It's been a while since I've looked at that code, but it's tricky to generalize file path parsing of this sort.

Edit: a quick test in ielm shows:

ELISP> (setq my/f1 "one,two.md;three.md")
"one,two.md;three.md"
ELISP> (citar-file--parser-default my/f1)
("one,two.md" "three.md")
ELISP> (citar-file--parser-triplet my/f1)
nil

So that example would work as is.

Per Roshan, please share a bib fragment that fails.

roshanshariff commented 4 months ago

@wenzlawski, could you paste the bib entry that's causing you the issue?

wenzlawski commented 3 months ago

I failed to realize that the file field is not in the standard format, but in the one exported by Calibre catalog. Their export format is a bit strange, and I don't think there is a way to change it. The output of that command has the form :/tmp/file1.pdf:PDF, :/tmp/file2.pdf:PDF

So when calling (citar-file--parser-triplet ":/tmp/pdf1.pdf:PDF, :/tmp/pdf2.pdf:PDF") the output is:

("/tmp/pdf1.pdf:PDF, :/tmp/pdf2.pdf" "/tmp/pdf1.pdf:PDF, :/tmp/pdf2.pdf" "/tmp/pdf1.pdf" "/tmp/pdf1.pdf" "/tmp/pdf2.pdf" "/tmp/pdf2.pdf")

But when calling (citar-file--parser-triplet ":/tmp/pdf,1.pdf:PDF, :/tmp/pdf,2.pdf:PDF") the output is:

("/tmp/pdf,1.pdf:PDF, :/tmp/pdf,2.pdf" "/tmp/pdf,1.pdf:PDF, :/tmp/pdf,2.pdf")

wenzlawski commented 3 months ago

@roshanshariff the following is a minimal failing bib. Remove the commas from the files and it works as expected.

@book{ testbib123,
    title = "test bib",
    file = ":/tmp/pdf,1.pdf:PDF, :/tmp/pdf,2.pdf:PDF"}

roshanshariff commented 3 months ago

This seems like a variant of #454. You should be able to escape the commas within the filenames by putting a backslash before them. Citar will split the file field at unescaped commas, and then replace the escape sequences in the filenames.
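
As a sketch of that suggestion applied to the minimal failing entry above (untested here; it assumes the backslash escape behaves as described and that nothing else in the Calibre export needs changing), the entry would look something like:

```bibtex
@book{ testbib123,
    title = "test bib",
    file = ":/tmp/pdf\,1.pdf:PDF, :/tmp/pdf\,2.pdf:PDF"}
```

Only the commas inside the filenames are escaped. The comma between the two entries stays unescaped, since that is the one the file field should be split at.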