Open bdarcus opened 2 years ago
I think it would be preferable to have a similar function for CSL-JSON rather than combining both into a single function, but yeah, doing it for both formats makes sense.
I've now pushed updates to both parsebib and Ebib that move `ebib-TeX-markup-replace-alist` and related functions to parsebib.
`parsebib-parse-bib-buffer` now has an extra argument `replace-TeX`: if non-nil, all replacements in `parsebib-TeX-markup-replace-alist` are applied to field values. `parsebib-parse` applies these as well (unless you call it with its `display` argument set to `nil`).
Note that right now, if you set `replace-TeX` to `t`, all field values are passed through `parsebib-clean-TeX-markup`. This seems the right thing to do (there may be TeX markup in the `journaltitle` field, for example, or in `author` or `editor` fields), but let me know if that should be more flexible.
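For readers who haven't looked at the code, here is a minimal sketch of the general technique: an alist of (REGEXP . REPLACEMENT) pairs applied in order to a field value. The names `my/markup-replace-alist` and `my/clean-markup` and the three rules are made up for illustration; they are not parsebib's actual alist or implementation.

```elisp
;; Sketch only: hypothetical names and rules, not parsebib's real ones.
(defvar my/markup-replace-alist
  '(("\\\\textit{\\([^}]*\\)}" . "\\1") ; \textit{foo} -> foo
    ("\\\\&" . "&")                     ; \&           -> &
    ("[{}]" . ""))                      ; drop any remaining braces
  "Alist of (REGEXP . REPLACEMENT) pairs, applied in order.")

(defun my/clean-markup (string)
  "Return STRING with every rule in `my/markup-replace-alist' applied."
  (dolist (pair my/markup-replace-alist string)
    (setq string
          (replace-regexp-in-string (car pair) (cdr pair) string))))

;; Usage:
(my/clean-markup "\\textit{Journal} of {R} \\& D")
```

Order matters in a scheme like this: the brace-stripping rule comes last so that it does not destroy the braces the `\textit` rule needs to match.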
@Hugo-Heagren I haven't added a new user option to disable replacing TeX markup, as I suggested in https://github.com/joostkremers/parsebib/pull/21#issuecomment-1153590907, because I realised that you can customize `ebib-field-transformation-functions` if you want to disable it.
Many thanks, @joostkremers!
I will take a look, and integrate this.
On your question about flexibility, I'm not sure, so let me ask:
Here's a current defcustom, which is similar to `ebib-field-transformation-functions`:
```elisp
(defcustom citar-display-transform-functions
  '((t . citar-clean-string)
    (("author" "editor") . citar-shorten-names))
  "Configure transformation of field display values from raw values.
All functions that match a particular field are run in order."
  :group 'citar
  :type '(alist :key-type (choice (const t) (repeat string))
                :value-type function))
```
So the first entry says to clean the string (as with this function) regardless of the field, and the second says to run `citar-shorten-names` on "author" or "editor" fields only.
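To make "all functions that match a particular field are run in order" concrete, here is a rough sketch of how such an alist could be applied to one field value. `my/apply-field-transforms` is a hypothetical helper written for this comment, not the code citar actually uses, and the usage example substitutes built-in functions for the citar ones.

```elisp
;; Sketch only: `my/apply-field-transforms' is hypothetical.
(require 'seq)
(require 'subr-x) ; for `string-trim' on older Emacs versions

(defun my/apply-field-transforms (transforms field value)
  "Apply every function in TRANSFORMS that matches FIELD to VALUE.
TRANSFORMS is an alist whose keys are either t (match any field)
or a list of field names.  Matching functions run in alist order,
each receiving the result of the previous one."
  (seq-reduce
   (lambda (acc entry)
     (let ((key (car entry)))
       (if (or (eq key t) (member field key))
           (funcall (cdr entry) acc)
         acc)))
   transforms
   value))

;; Usage with stand-in functions: trim everything, upcase names only.
(my/apply-field-transforms
 '((t . string-trim)
   (("author" "editor") . upcase))
 "author" "  Doe, Jane ")
```

Because the functions are chained, reordering the alist entries is enough to change which transformation sees the raw value first.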
Per this issue, we need to swap that order, since we need to preserve organizational authors and such:
https://github.com/emacs-citar/citar/issues/532
But beyond that, with this change, what do you recommend?
Actually, never mind. When I find some time, I'll integrate, and let you know if I run into any issues.
> On your question about flexibility, I'm not sure, so let me ask:
> Here's a current defcustom, which is similar to `ebib-field-transformation-functions`:
By flexibility I meant that you might want to be able to specify which fields `parsebib-clean-TeX-markup` should be applied to. But your `citar-display-transform-functions` is more general, because it's not limited to a single function.
It would actually make sense to build that into parsebib, I think. The idea would be that you can then pass the value of `citar-display-transform-functions` to `parsebib-parse` and parsebib would do the rest. Since both packages are loaded, it wouldn't matter whether the functions to be applied come from parsebib (`parsebib-clean-TeX-markup`) or from citar (`citar-shorten-names`).
We'd just have to think about what to do with bib(la)tex vs. CSL-JSON. It would make sense to generalise the transformations that parsebib already does for CSL-JSON in the same way, but we'd probably need to keep the two types separate. At best it would be a waste of CPU cycles to apply transformations for bibtex to CSL-JSON data and vice versa, at worst it would wreak havoc.
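One way to keep the two types separate, sketched under my own assumptions: hold one transformation alist per format and pick it from the file extension, so bibtex rules never touch CSL-JSON data. All names here (`my/bibtex-transforms`, `my/csl-json-transforms`, `my/strip-braces`, `my/transforms-for-file`) are hypothetical and nothing like this exists in parsebib as far as this thread shows.

```elisp
;; Sketch only: every name below is hypothetical.
(defun my/strip-braces (string)
  "Return STRING without TeX grouping braces."
  (replace-regexp-in-string "[{}]" "" string))

(defvar my/bibtex-transforms '((t . my/strip-braces))
  "Transformations that only make sense for bib(la)tex data.")

(defvar my/csl-json-transforms nil
  "Transformations for CSL-JSON data (none in this sketch).")

(defun my/transforms-for-file (file)
  "Return the transformation alist matching FILE's extension, or nil."
  (pcase (downcase (or (file-name-extension file) ""))
    ("bib" my/bibtex-transforms)
    ("json" my/csl-json-transforms)
    (_ nil)))

;; Usage:
(my/transforms-for-file "references.bib")
```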
> It would actually make sense to build that into parsebib, I think.
That's indeed what I was wondering.
@bdarcus Does citar support both biblatex and CSL-JSON? Do you have transform functions for the latter?
> @bdarcus Does citar support both biblatex and CSL-JSON?

Yes; it's among the reasons I dropped the bibtex-completion dependency and use parsebib directly.

> Do you have transform functions for the latter?
Not ATM.
The only transform function I have, other than to strip TeX markup, is the shorten-names one, which isn't very general, but seems to work with both formats, without modification.
```elisp
(defun citar-shorten-names (names)
  "Return a list of family names from a list of full NAMES.
To better accommodate corporate names, this will only shorten
personal names of the form 'family, given'."
  (when (stringp names)
    (mapconcat
     (lambda (name)
       (if (eq 1 (length name))
           (cdr (split-string name " "))
         (car (split-string name ", "))))
     (split-string names " and ") ", ")))
```
EDIT: but it doesn't currently have logic to handle corporate names; e.g. those in brackets.
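A possible direction for the bracket case, as a sketch rather than a proposed patch: treat a name that is entirely wrapped in braces as a corporate name and keep it whole. `my/shorten-names` is a hypothetical variant, not the function above, and it assumes bibtex-style input where names are joined by " and ".

```elisp
;; Sketch only: `my/shorten-names' is a hypothetical variant.
(defun my/shorten-names (names)
  "Return family names from NAMES, a string of names joined by \" and \".
A name fully wrapped in braces is treated as a corporate name and
kept whole, minus the braces."
  (when (stringp names)
    (mapconcat
     (lambda (name)
       (cond
        ;; {World Health Organization} -> World Health Organization
        ((string-match "\\`{\\(.*\\)}\\'" name)
         (match-string 1 name))
        ;; "Family, Given" -> "Family"
        ((string-match-p ", " name)
         (car (split-string name ", ")))
        ;; "Given Family" -> "Family" (last word)
        (t (car (last (split-string name " " t))))))
     (split-string names " and ")
     ", ")))

;; Usage:
(my/shorten-names "Doe, Jane and {World Health Organization} and John Smith")
```

A known gap: a corporate name that itself contains " and " (for example a braced "Food and Drug Administration") would still be split before the brace check runs, so a real version would need a brace-aware splitter.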
If something like this would be valuable in parsebib, feel free to adapt it as you like.
Thinking a bit more, maybe there could be format-independent transformation functions that call format-specific ones?
Like `parsebib-shorten-names` vs `parsebib--shorten-names-tex`.
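Roughly what I have in mind, as a sketch with made-up names (`my-shorten-names`, `my--shorten-names-tex`, `my--shorten-names-csl`): one public function that dispatches on a format symbol to private helpers. The CSL helper assumes names arrive as a list of alists with `family` or `literal` keys, following the CSL-JSON name fields; whether parsebib would hand them over in that shape is an assumption.

```elisp
;; Sketch only: all three functions are hypothetical.
(defun my--shorten-names-tex (names)
  "Shorten NAMES, a string like \"Doe, Jane and Roe, Richard\"."
  (mapconcat (lambda (name) (car (split-string name ", ")))
             (split-string names " and ")
             ", "))

(defun my--shorten-names-csl (names)
  "Shorten NAMES, a list of alists with `family' or `literal' keys."
  (mapconcat (lambda (name)
               (or (alist-get 'family name)
                   (alist-get 'literal name)
                   ""))
             names
             ", "))

(defun my-shorten-names (names format)
  "Shorten NAMES according to FORMAT, either `bibtex' or `csl-json'."
  (pcase format
    ('bibtex (my--shorten-names-tex names))
    ('csl-json (my--shorten-names-csl names))
    (_ (error "Unknown bibliography format: %S" format))))

;; Usage:
(my-shorten-names "Doe, Jane and Roe, Richard" 'bibtex)
(my-shorten-names '(((family . "Doe") (given . "Jane"))
                    ((literal . "World Health Organization")))
                  'csl-json)
```

Callers would then only ever see the format-independent entry point, and the format-specific helpers could change without affecting them.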
Follow-up to:
https://github.com/bdarcus/citar/issues/535
The function doesn't reference any other ebib functions, but it does rely on `ebib-TeX-markup-replace-alist`, so I assume that would need to be moved as well. But it seems a straightforward move-and-rename.
Alas, I'm not familiar enough with this codebase to know how best to then integrate it here.
Should probably ~be expanded to do~ also at some point add a parallel function that does the same for CSL JSON markup, though the use of markup there isn't really standardized ATM.