Closed bdarcus closed 2 years ago
JSYK, The ebib version of this function and associated variables changed a bit recently -- you might prefer to copy/alter that version (or you might not :shrug:).
OMG; that was a total oversight on my part; I wasn't paying attention to getting the current version.
Seems it changed more than "a bit" though!!
Edit: I guess a lot of that was test code though.
I'll let @joostkremers weigh on on what he wants to do.
I'll let @joostkremers weigh on on what he wants to do.
Heh, good question. Ideally, this stuff should only be implemented once, of course. So if we want parsebib
to be able to make use of it, it should be in parsebib
. I don't think there's any reason not to have it there, so in principle I'm not against it.
AFAICT it would be possible to simply move ebib-clean-TeX-markup
and related code to parsebib
, so that in Ebib we'd only need to change the function calls and not much else.
The question is just what to do with ebib-TeX-markup-replace-alist
, which is a user option.
Normally, I'd want to keep it a user option, which means duplicating it in Ebib and parsebib
and having Ebib pass its value to parsebib
. This is what I did with ebib-biblatex-inheritances
when I moved that functionality into parsebib
.
However, in this particular case, perhaps we don't need to do so. I mean, even though ebib-TeX-markup-replace-alist
is technically a user option, it's such a long and unwieldy list that I doubt any normal user will try and customise it.
So I would move ebib-TeX-markup-replace-alist
to parsebib
as well and turn it into a defvar
. A user that really wants to customise it can always redefine the variable in their init file.
I would like to have some customisability, though, so perhaps we can introduce an option ebib-clean-TeX-markup
(or some similar name), which for now is just a toggle. The (new) function parsebib-clean-TeX-markup
would then have to take an additional optional argument indicating whether it should actually do its magic or not. (Eventually, I'd like to turn this option into a list of replacement types, which would roughly correspond to the different groups indicated in the definition of ebib-TeX-markup-replacement-alist
, but that can wait.)
Another question is the integration of this new parsebib-clean-TeX-markup
into parsebib
itself. It seems to me that parsebib-parse-bib-buffer
should have an additional keyword argument to indicate whether it should clean TeX markup in field values and that parsebib-parse
should set this argument to the value of its own display
argument.
The actual clean-up should be done in parsebib--parse-bib-value
, if I'm not mistaken, which means that it should also get an additional optional argument, and so should parsebib-read-entry
and parsebib--parse-bibtex-field
, because they need to pass the toggle on to parsebib--parse-bib-value
.
Another question is the integration of this new
parsebib-clean-TeX-markup
intoparsebib
itself. It seems to me thatparsebib-parse-bib-buffer
should have an additional keyword argument to indicate whether it should clean TeX markup in field values and thatparsebib-parse
should set this argument to the value of its owndisplay
argument.The actual clean-up should be done in
parsebib--parse-bib-value
, if I'm not mistaken, which means that it should also get an additional optional argument, and so shouldparsebib-read-entry
andparsebib--parse-bibtex-field
, because they need to pass the toggle on toparsebib--parse-bib-value
.
I support @bdarcus original motivation for including this function in parsebib (as opposed to ebib). I'm less sure about the stuff quoted above. Parsebib is a parsing library -- it's principle function is to get information written in a structured file and return it in a structured lisp object. And the information in the file has the markup in it, not the cleaned up text. joostkremers/ebib#238 and joostkremers/ebib#248 were intended to make presenting that data prettier. I'm worried that including cosmetic functionality in the parsing functions might might make it harder to access downstream.
This is a long winded way of saying that if there's a boolean option to clean up when parsing, it should be off by default.
I support @bdarcus original motivation for including this function in parsebib (as opposed to ebib). I'm less sure about the stuff quoted above. Parsebib is a parsing library -- it's principle function is to get information written in a structured file and return it in a structured lisp object. And the information in the file has the markup in it, not the cleaned up text. joostkremers/ebib#238 and joostkremers/ebib#248 were intended to make presenting that data prettier. I'm worried that including cosmetic functionality in the parsing functions might might make it harder to access downstream.
But parsebib already includes functionality to make the data more presentable: Returning entries for display. This is basically just an extension of that.
Ebib doesn't use any of this cleanup functionality, because it needs the exact contents of the .bib
file to work with. Other packages that use parsebib don't, though, they just present the contents of a .bib
file to the user. For such packages, it's more convenient to be able to pass an argument to parsebib-parse-bib-buffer
and get nice display-ready results.
This is a long winded way of saying that if there's a boolean option to clean up when parsing, it should be off by default.
Well, it's up to the caller to decide which kind of cleanup parsebib should do, if any. Obviously, when Ebib opens a .bib
file, it shouldn't use this option.
But parsebib already includes functionality to make the data more presentable: Returning entries for display. This is basically just an extension of that.
Exactly.
For context, here's the original citar issue that led to this https://github.com/emacs-citar/citar/issues/535.
The sequence is:
So it naturally seems the best place to put this code here, so that each project should rely on the same function, rather than reimplement.
But as you can tell, I didn't realize the additional complexity :-)
PS - I should also note https://github.com/emacs-citar/citar/issues/532, on name-parsing and formatting. Not sure where that would have to happen to work correctly, but probably here.
BTW, I find parsebib really useful, since I don't have to worry about low-level formatting details, and just call fields to display.
Also, we currently have a citar-display-transform-functions
defcustom, which allows one to easily configure which string manipulation functions are run against which fields. I guess something like that could be here, or we could just use this function as one of them.
But as you can tell, I didn't realize the additional complexity :-)
I don't think it's that complex. I started implementing it, should have something testable in a few days.
PS - I should also note emacs-citar/citar#532, on name-parsing and formatting. Not sure where that would have to happen to work correctly, but probably here.
It would at least be in line with the "prettify for display" functionality that parsebib already has. For JSON name fields there is already something similar: https://github.com/joostkremers/parsebib#parsebib-json-name-field-template.
I don't think it's that complex.
I just mean for someone not familiar with the codebase :-)
I started implementing it, should have something testable tomorrow.
Great! In that case, I'll close this.
It would at least be in line with the "prettify for display" functionality that parsebib already has. For JSON name fields there is already something similar: https://github.com/joostkremers/parsebib#parsebib-json-name-field-template.
A difference, however, is CSL JSON name representation is a structured object, while bibtex is a string.
... along with the associated parsebib-TeX-markup-replace-alist, though converted to a defvar, since this isn't a user-facing package.
Close #19
I didn't touch ebib; figure you can remove that independently, as you wish.