joostkremers / parsebib

Elisp library for reading .bib files
BSD 3-Clause "New" or "Revised" License

direct parsing of files, csl-json feedback #12

Closed bdarcus closed 3 years ago

bdarcus commented 3 years ago

Why is parsing restricted to the buffer?

Is it impractical to allow direct file parsing?
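
One way to get file-level parsing on top of a buffer-based parser is to load the file into a temporary buffer and run the parser there. The following is a minimal sketch of that pattern only; `my/call-with-file-contents` is a hypothetical helper, not part of parsebib, and it takes the parser as an argument so that it makes no assumption about any particular parsebib function.

```elisp
(defun my/call-with-file-contents (file fn)
  "Insert FILE into a temporary buffer and call FN there.
FN is called with no arguments, with point at the start of the
buffer.  The return value of FN is returned.
Hypothetical helper for illustration; not part of parsebib."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (funcall fn)))

;; Usage, assuming FN is some buffer-based parser that can be
;; called without arguments:
;;   (my/call-with-file-contents "refs.bib" #'some-buffer-parser)
```

The temporary buffer is discarded when the call returns, so the file never has to be visited in a buffer the user can see.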

bdarcus commented 3 years ago

My impression is that we may need separate csl-completion and biblatex-completion modules that can be plugged in elsewhere.

Yeah, that's probably reasonable.

In the end, though, we need the front-ends to be able to easily support the three, without the user having to worry about it.

That's what I was hoping this code would allow.

Do you think that's feasible?

If not, it's not a huge deal. Citeproc-el recently got much fuller biblatex support, so I can recommend that to users for whom bibtex alone isn't sufficient.

No need to answer now; just writing this down while it occurs to me.

Interestingly, I realized recently that the open-entry function works correctly with csl json files!

joostkremers commented 3 years ago

Any estimate of when you can get back to this and merge it @joostkremers?

With org-cite now merged, would be great to get json support in bibtex-completion et al.

I could merge the wip/csl branch right now. I just don't know how well-tested it is. It parses my .bib files without problems, but since it's used by bibtex-completion, I'd like to be sure there are no immediate issues there (I don't use bibtex-completion myself). Has anyone here been using the wip/csl branch with bibtex-completion?

bdarcus commented 3 years ago

Has anyone here been using the wip/csl branch with bibtex-completion?

I just did basic testing now:

  1. loading a library
  2. running the "open" and "insert" commands against a candidate

Nothing unexpected so far.

Oh, and this seems to work well!

ELISP> (defvar my/bibtest (parsebib-parse (org-cite-list-bibliography-files)))
my/bibtest
ELISP> (gethash "low2001" my/bibtest)
(("subtitle" . "gated communities and the discourse of urban fear")
 ("title" . "The Edge and the Center")
 ("number" . "1")
 ("langid" . "english")
 ("urldate" . "2016-02-18")
 ("doi" . "10.1525/aa.2001.103.1.45")
 ("issn" . "1548-1433")
 ("pages" . "45--58")
 ("volume" . "103")
 ("journaltitle" . "American Anthropologist")
 ("date" . "2001-03-01")
 ("author" . "Low, Setha M.")
 ("shorttitle" . "The Edge and the Center")
 ("=type=" . "article")
 ("=key=" . "low2001"))

The convenience field mapping function we've discussed here could certainly be a separate enhancement.
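
As the session above shows, each entry comes back as an alist of (FIELD . VALUE) pairs stored in a hash table keyed by citation key. A minimal sketch of reading one field from such a table follows; `my/demo-entries` is a stand-in table holding a shortened copy of the entry above, and `my/entry-field` is a hypothetical helper, not a parsebib function.

```elisp
;; Stand-in for the table returned in the session above.  String keys
;; need an `equal' test.
(defvar my/demo-entries (make-hash-table :test #'equal))

(puthash "low2001"
         '(("title" . "The Edge and the Center")
           ("date" . "2001-03-01")
           ("=type=" . "article")
           ("=key=" . "low2001"))
         my/demo-entries)

(defun my/entry-field (key field table)
  "Return the value of FIELD for the entry KEY in TABLE, or nil.
FIELD is matched case-insensitively.  Hypothetical helper."
  (cdr (assoc-string field (gethash key table) 'case-fold)))

(my/entry-field "low2001" "Title" my/demo-entries)
;; → "The Edge and the Center"
```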

joostkremers commented 3 years ago

Has anyone here been using the wip/csl branch with bibtex-completion?

I just did basic testing now:

Alright then, here goes nothing! :smile:

(I know, I should have tests...)

The convenience field mapping function we've discussed here could certainly be a separate enhancement.

Ah yes, there was that, too... (I recently started a new job, so I don't have much time to work on this right now... :disappointed: )

bdarcus commented 3 years ago

Ah yes, there was that, too... (I recently started a new job, so I don't have much time to work on this right now... )

So if I wanted to experiment with this, building on this, the simplest start is something like:

(defun bibtex-actions-get-value (item field &optional alt)
  "Return FIELD value for ITEM, with optional ALT formatting."
  ;; Do we want item or key here?
  (assoc-string field item 'case-fold))

... where we then need:

  1. some logic to test if the field is present, and if not look up alternates
  2. a way to get alternate rendering back (year from date, shortened authors, etc.)

Does that make sense?

Any tips?
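
For the second point, "year from date" can be sketched as a small string helper. This assumes the date value starts with a four-digit year, as in the "2001-03-01" example earlier in the thread; `my/year-from-date` is a hypothetical name, and other date shapes (ranges, CSL-JSON date structures) are not handled.

```elisp
(defun my/year-from-date (date)
  "Return the four-digit year at the start of DATE, or nil.
DATE is expected to be a string such as \"2001-03-01\".
Hypothetical helper for illustration."
  (when (and (stringp date)
             (string-match "\\`\\([0-9]\\{4\\}\\)" date))
    (match-string 1 date)))

(my/year-from-date "2001-03-01")
;; → "2001"
```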

Not sure of the best way to do this, but here's an iteration:

(defun foo-get-value (item field)
  "Return biblatex FIELD value for ITEM."
  ;; Do we want item or key here?
  (or (assoc-string field item 'case-fold)
      (pcase field
        ("date"
         (or (assoc-string "issued" item 'case-fold)
             (assoc-string "year" item 'case-fold)))
        ("booktitle"
         (assoc-string "container-title" item 'case-fold))
        ("author"
         (assoc-string "editor" item 'case-fold)))))

Or maybe just have a flat list of mappings like the one below (though I can't get seq-filter to correctly extract the alternative fields from it):

(defvar foo-field-map
  '(("date" "year" "issued")
    ("author" "editor")
    ("booktitle" "container-title")))

joostkremers commented 3 years ago

@bdarcus That would be the general idea, yes. I would definitely use a variable to define the mappings; that makes it more flexible (and customisable, if so desired).

I'm not sure what problem you're having with seq-filter, but the one thing I keep forgetting about that function is that it returns the matching elements unchanged. (Instead, I expect it to first filter and then map...)

Note, though, that with your defvar, a simple alist-get looks like a better choice.
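
To illustrate both points, here is a small sketch. `my/field-map-demo` is a copy of the defvar above under a hypothetical name, and `my/field-alternates` is likewise hypothetical. Because the keys are strings, `alist-get` needs `equal` as its TESTFN argument; its default test is `eq`, which does not match separate string objects.

```elisp
(require 'seq)

;; Copy of the mapping from the comment above, renamed for the demo.
(defvar my/field-map-demo
  '(("date" "year" "issued")
    ("author" "editor")
    ("booktitle" "container-title")))

;; `seq-filter' returns the matching elements unchanged, so the
;; result is still a list of whole (FIELD ALTERNATES...) entries.
(seq-filter (lambda (entry) (equal (car entry) "date"))
            my/field-map-demo)
;; → (("date" "year" "issued"))

(defun my/field-alternates (field)
  "Return the list of alternate field names for FIELD, or nil."
  (alist-get field my/field-map-demo nil nil #'equal))

(my/field-alternates "date")
;; → ("year" "issued")
```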

bdarcus commented 3 years ago

Not exactly right, but closer.

The idea is that biblatex is the best mapping format here, as bibtex is less rich and csl json has fields that are not unique (notably "container-title").

(defvar parsebib-field-map
  '(("date" "year" "issued")
    ("author" "editor") ; this is actually for the ivy/helm-bibtex/bibtex-actions case
    ("booktitle" "container-title")
    ("journaltitle" "journal" "container-title")
    ("number" "issue")))

(defun parsebib-get-value (item field)
  "Return biblatex FIELD value for ITEM."
  (or (cdr (assoc-string field item 'case-fold))
      (catch 'success
        (dolist (fm (cdr (assoc field parsebib-field-map)))
          (when fm
            (throw 'success (cdr (assoc-string fm item 'case-fold))))))))
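
One reason the function above is "not exactly right": `fm` is always non-nil inside the loop, so it throws on the first alternate even when that alternate is missing from the item, and later alternates are never tried. A minimal sketch of one possible fix uses `seq-some`, which returns the first non-nil result. The names `my/field-map` and `my/get-value` are hypothetical, chosen so as not to shadow the definitions above.

```elisp
(require 'seq)

;; Copy of the mapping above under a hypothetical name.
(defvar my/field-map
  '(("date" "year" "issued")
    ("author" "editor")
    ("booktitle" "container-title")
    ("journaltitle" "journal" "container-title")
    ("number" "issue")))

(defun my/get-value (item field)
  "Return the value of biblatex FIELD in ITEM, or of its first present alternate.
Alternates are looked up in `my/field-map'.  Returns nil if
neither FIELD nor any alternate is present."
  (or (cdr (assoc-string field item 'case-fold))
      ;; Try each alternate in order; stop at the first with a value.
      (seq-some (lambda (alt) (cdr (assoc-string alt item 'case-fold)))
                (cdr (assoc field my/field-map)))))

;; "date" is absent and so is "year", but "issued" is present:
(my/get-value '(("issued" . "2001")) "date")
;; → "2001"
```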

bdarcus commented 3 years ago

@joostkremers - did you bump the version number with these changes? E.g., does 3.0 include them?

bdarcus commented 3 years ago

BTW, there was minor breakage with oc-csl, because it relied on parsebib-parse-buffer.

But Andras has already fixed it, and will be submitting a patch, along with other improvements.

joostkremers commented 3 years ago

@joostkremers - did you bump the version number with these changes? E.g., does 3.0 include them?

Yes, 3.0 is the version that reads CSL-JSON.

bdarcus commented 3 years ago

@joostkremers I got this hooked up and running with bibtex-actions fairly easily. Here's the function I'm using to do the mapping:

(defun bibtex-actions-get-value (field item &optional _default)
  "Return biblatex FIELD value for ITEM."
  (or (cdr (assoc-string field item 'case-fold))
      (cl-loop for fname in (cdr (assoc field bibtex-actions-field-map))
               when (cdr (assoc-string fname item 'case-fold))
               return (cdr (assoc-string fname item 'case-fold)))
      ""))

Last thing is how to get short author names (sans given names). Is that something I need to do myself (or maybe not worry about it)? I do see parsebib-json-name-field-template, but it's a general option.

EDIT: I just added a simple shorten-names function.

joostkremers commented 3 years ago

Last thing is how to get short author names (sans given names). Is that something I need to do myself (or maybe not worry about it)? I do see parsebib-json-name-field-template, but it's a general option.

EDIT: I just added a simple shorten-names function.

Yeah, that's the only option right now. You can modify parsebib-json-name-field-template and take out the {given} part, but that only works for CSL-JSON data that actually uses the family and given properties. If the name is just given literally, à la bib(la)tex, then it won't work.
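
For names given literally, a shortening function has to work on the string itself. The following is a minimal sketch under stated assumptions, not the function bdarcus added (which is not shown in this thread): names are separated by "and", and each name is either "Family, Given" or "Given Family". `my/shorten-names` is a hypothetical name; name particles such as "van der" and brace-protected names are not handled.

```elisp
(require 'subr-x)  ; `string-trim' on older Emacs versions

(defun my/shorten-names (names)
  "Return NAMES with given names dropped, joined by \", \".
NAMES is a BibTeX-style string such as
\"Low, Setha M. and Smith, John\".  Hypothetical helper."
  (mapconcat
   (lambda (name)
     (if (string-match-p "," name)
         ;; "Family, Given": keep the part before the first comma.
         (string-trim (car (split-string name ",")))
       ;; "Given Family": keep the last whitespace-separated word.
       (car (last (split-string name "[ \t\n]+" t)))))
   ;; Split the list of names on the word "and".
   (split-string names "[ \t\n]+and[ \t\n]+" t)
   ", "))

(my/shorten-names "Low, Setha M. and Smith, John")
;; → "Low, Smith"
```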