joostkremers / parsebib

Elisp library for reading .bib files
BSD 3-Clause "New" or "Revised" License

direct parsing of files, csl-json feedback #12

Closed bdarcus closed 3 years ago

bdarcus commented 3 years ago

Why is parsing restricted to the buffer?

Is it impractical to allow direct file parsing?

joostkremers commented 3 years ago

Why is parsing restricted to the buffer?

Well, parsebib used to be part of Ebib. Since Ebib uses the low-level API (because I want to be able to report errors and continue parsing), I only spun off the buffer-parsing part.

Is it impractical to allow direct file parsing?

No, it would just involve writing a wrapper function that takes a file path, creates a temp buffer, inserts the file with insert-file-contents and then parses the buffer. It could even take multiple file paths and insert all files into a single buffer before parsing. (Using a single buffer would allow the function to resolve @Strings and crossrefs in one go. It would probably also involve changing the way parsebib deals with errors, BTW.)
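A minimal sketch of the wrapper described above, assuming nothing about parsebib's own API: `my/parse-files` and its `parse-fn` argument are hypothetical names, and `parse-fn` stands in for whichever buffer-parsing function is actually used.

```elisp
(defun my/parse-files (files parse-fn)
  "Insert FILES into one temporary buffer and call PARSE-FN there.
FILES is a file name or a list of file names.  PARSE-FN is a
hypothetical stand-in for a buffer-parsing function; it is called
with no arguments, with point at the start of the buffer."
  (with-temp-buffer
    (dolist (file (if (listp files) files (list files)))
      ;; `insert-file-contents' inserts at point without moving it,
      ;; so go to the end first to keep the files in order.
      (goto-char (point-max))
      (insert-file-contents file)
      (goto-char (point-max))
      ;; Make sure the next file starts on its own line.
      (unless (bolp) (insert "\n")))
    (goto-char (point-min))
    (funcall parse-fn)))
```

Because every file ends up in the same buffer, a parser that resolves @Strings and crossrefs sees all definitions at once.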

bdarcus commented 3 years ago

OK, cool.

Consider this a "would be nice to add at some point if you get to it" (or if someone submits a PR) feature request then :-)

joostkremers commented 3 years ago

Consider this a "would be nice to add at some point if you get to it" (or if someone submits a PR) feature request then :-)

Done. :-)

joostkremers commented 3 years ago

The wip/csl branch now has such a function, parsebib-parse. It takes a file name or a list of files and returns all entries in them in a single hash table.

It accepts a mix of .bib and .json files and is mainly meant for packages that want to display the contents of the entries to an end user, such as bibtex-completion and the packages based on it. It doesn't return the @Preamble or @Comments in a .bib file, but I assume that's not your use case anyway. (If it is, it can easily be added, though.)
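As a rough illustration of consuming such a result, here is a sketch that pulls one field out of every entry. It assumes the hash table maps entry keys to alists of fields; `my/collect-field` is a hypothetical helper, not part of parsebib.

```elisp
(defun my/collect-field (entries field)
  "Return an alist of (KEY . VALUE) for FIELD across ENTRIES.
ENTRIES is assumed to be a hash table mapping entry keys to alists
of fields.  Entries that lack FIELD are skipped."
  (let (result)
    (maphash (lambda (key entry)
               ;; `assoc-string' accepts strings or symbols on either side.
               (let ((cell (assoc-string field entry 'case-fold)))
                 (when cell
                   (push (cons key (cdr cell)) result))))
             entries)
    (nreverse result)))
```

With the function described above, a call might look like `(my/collect-field (parsebib-parse '("refs.bib" "refs.json")) 'title)`, where the file names are placeholders.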

bdarcus commented 3 years ago

Awesome!

I'll copy @tmalsburg, as it seems this could simplify the bibtex-completion code and also add the CSL json import.

tmalsburg commented 3 years ago

I'll copy @tmalsburg, as it seems this could simplify the bibtex-completion code and also add the CSL json import.

Thanks. Joost and I have already discussed how this could be used. Indeed, there's great potential for simplification in bibtex-completion.

bdarcus commented 3 years ago

I just tested a json file and bib file together.

How are you thinking about dealing with the different key names in csl-json vs bib(la)tex @joostkremers?

Aside from the issue you've already raised about strings vs symbols, the other obvious one is things like "journal" vs "container-title".

Just say that's not parsebib's responsibility (which would be totally reasonable)?

And what have you decided about the strings vs symbols issue?

bdarcus commented 3 years ago

PS - I did just notice a bug; somehow tags got appended to the doi on the bib import:

 (("doi" . "10.1080/13602004.2019.1575022tagsme,student")

Is that possible? I don't think that's my error, as the bib file just has the doi.

joostkremers commented 3 years ago

I just tested a json file and bib file together.

How are you thinking about dealing with the different key names in csl-json vs bib(la)tex @joostkremers?

Which keys are we talking about exactly?

Aside from the issue you've already raised about strings vs symbols, the other obvious one is things like "journal" vs "container-title".

Just say that's not parsebib's responsibility (which would be totally reasonable)?

For the moment, I decided to leave it out, yes. It could eventually make sense to offer some sort of unified structure, but for me, supporting CSL-JSON in Ebib currently has priority.

Note also that any conversion that parsebib undertakes on the data slows down parsing. So in general it might be better to do something like:

(or (assoc-string "year" entry 'case-fold)
    (assoc-string 'issued entry))

Although that might start to get ugly really quickly...

I am open to suggestions to handle it better. :slightly_smiling_face:

And what have you decided about the strings vs symbols issue?

Entries are represented as alists, and luckily enough Elisp has assoc-string, which also accepts symbols, but converts them to strings before comparing. It can also take a case-fold argument. So:

(assoc-string 'author '(("Author" . "Jane Doe") ("Title" . "Some Title")) 'case-fold)

returns ("Author" . "Jane Doe"), and similarly:

(assoc-string "Author" '((author . "Jane Doe") (title . "Some Title")) 'case-fold)

returns (author . "Jane Doe").

So while it would still be necessary to make sure you're asking for the right fields, at least you don't have to worry about passing in the correct type.

bdarcus commented 3 years ago

... luckily enough Elisp has assoc-string, which also accepts symbols, but converts them to strings before comparing. It can also take a case-fold argument ...

Great; that will go a long way.

I am open to suggestions to handle it better.

Edit: I could imagine a helper function (maybe an adapted bibtex-completion-get-value?) that wraps assoc-string and does the mapping when needed, so that one could do (bibtex-completion-get-value "author") or (bibtex-completion-get-value "issued") and it would return the right string regardless of source.

Or something similar in parsebib where, now that I think about it, it seems to make more sense?

joostkremers commented 3 years ago

PS - I did just notice a bug; somehow tags got appended to the doi on the bib import:

 (("doi" . "10.1080/13602004.2019.1575022tagsme,student")

Is that possible? I don't think that's my error, as the bib file just has the doi.

That's weird. Are the tags anywhere in the bib file? Would it be possible to send me the file or the entry where this happens?

bdarcus commented 3 years ago

Would it be possible to send me the file or the entry where this happens?

Actually, in the process of narrowing this down, I realized there was a missing comma; syntax error.

So my fault ;-)

joostkremers commented 3 years ago

Actually, in the process of narrowing this down, I realized there was a missing comma; syntax error.

So my fault ;-)

:relieved:

joostkremers commented 3 years ago

Edit: I could imagine a helper function (maybe an adapted bibtex-completion-get-value?) that wraps assoc-string and does the mapping when needed, so that one could do (bibtex-completion-get-value "author") or (bibtex-completion-get-value "issued") and it would return the right string regardless of source.

Or something similar in parsebib where, now that I think about it, it seems to make more sense?

I thought about this, but I haven't made up my mind yet.

There's a related question: Denis explained to me that Zotero uses its own set of fields, which are mapped internally to CSL-JSON fields. So the JSON fields are never visible to the user. A similar strategy might make sense for Ebib, because the JSON fields aren't always that descriptive (cf. container-title vs. journal, which was already mentioned above).

So we'd end up with two mappings: one from biblatex fields to csl-json fields, to allow packages using parsebib to access field values without having to know the format of the underlying file, and another from UI fields to csl-json fields. Defining these in parsebib would make it easy to unify them across the Emacs ecosystem. Then again, I'm not sure if it would even be useful to have a set of UI fields for bibtex-completion.

BTW, if parsebib would define an access function to get the value of a field, one could even go a step further and have that function do @String expansion and cross-reference resolution as well. Then this wouldn't have to be done during parsing.

Here, too, though, I'm not sure if it would make sense for bibtex-completion. For Ebib, it would; in fact I already do this (at least for cross-references; @Strings aren't expanded).

bdarcus commented 3 years ago

Yeah, the CSL model is a bit more abstract, and also aimed at output formatting. Hence names like "container-title".

The bibtex-completion front-ends don't actually include any labels in the UIs, so the names are also less important there I think.

The exception is a user wanting to configure the display using the templates. There one uses the field names directly.

With that, there is arguably some advantage with, for example, "container-title" (because it effectively means "journal" or "incollection").

So we'd end up with two mappings: one from biblatex fields to csl-json fields, to allow packages using parsebib to access field values without having to know the format of the underlying file, and another from UI fields to csl-json fields. Defining these in parsebib would make it easy to unify them across the Emacs ecosystem. Then again, I'm not sure if it would even be useful to have a set of UI fields for bibtex-completion.

I guess I'd have to see the details to know for sure.

You would need such a UI mapping regardless for ebib, right? So just a question of where to put it?

If yes, you could start with it here and see what feedback you get?

Up to you, but do you want to open a new issue for this? You already closed this narrow request :-)

joostkremers commented 3 years ago

The exception is a user wanting to configure the display using the templates. There one uses the field names directly.

But I guess in that case you want to see the actual field names and you don't want someone's idea of a useful UI to get in the way, right?

I guess I'd have to see the details to know for sure. You would need such a UI mapping regardless for ebib, right?

Yes.

So just a question of where to put it? If yes, you could start with it here and see what feedback you get?

The idea I had for Ebib was to basically copy Zotero's field names and mappings. It has the advantage of not having to come up with a mapping myself, plus people may be familiar with it.

Up to you, but do you want to open a new issue for this?

I probably should. :slightly_smiling_face: But ATM it seems that UI mappings are only going to be used in Ebib, so I'm leaning towards including the mapping there.

tmalsburg commented 3 years ago

Hi both, I wasn't aware that the field names are different for CSL json. This complicates matters a bit. I was hoping that people would be able to use bibtex and json sources side by side but the current design of bibtex-completion assumes one relevant set of field names. In the past this has already been a problem for people who're using biblatex format (e.g. date instead of year). I think bibtex-completion will need a complete redesign in order to support all three formats simultaneously.

It would be relatively easy to support just CSL json or just biblatex, but I doubt that this is going to be a satisfying solution for anyone. For instance, even people who are personally using biblatex sometimes need to work with bibtex because many journals require it. Hm ...

tmalsburg commented 3 years ago

Perhaps we'll need separate biblatex-completion and csljson-completion. Users of Helm can easily fuse these together at the UI level. Not sure it will be possible with ivy and completing-read UIs.

bdarcus commented 3 years ago

I do indeed think that ideally one can mix-and-match sources without hassle, for users and developers alike.

That year/date issue has bitten me. With CSL, you also get issued etc.

So we somehow need a mechanism to do this mapping, in an easy and performant way.

tmalsburg commented 3 years ago

So we somehow need a mechanism to do this mapping, in an easy and performant way.

One stupid simple solution would be to convert the CSL json to BibTeX (on disk) and just use that for bibtex-completion purposes. This is what we currently do with org-bibtex. Not pretty but it works and keeps things manageable. I guess any user of CSL json probably wants a BibTeX version anyway for LaTeX authoring?

joostkremers commented 3 years ago

So we somehow need a mechanism to do this mapping, in an easy and performant way.

One stupid simple solution would be to convert the CSL json to BibTeX (on disk) and just use that for bibtex-completion purposes. This is what we currently do with org-bibtex.

What do you use for conversion? Or am I misunderstanding and isn't it the case that org-bibtex can convert CSL-JSON to BibTeX?

Not pretty but it works and keeps things manageable. I guess any user of CSL json probably wants a BibTeX version anyway for LaTeX authoring?

I wouldn't assume that. Pandoc makes it possible to author publications without going through LaTeX (LaTeX isn't even needed for PDF output, though it's still an option), and with org-cite, Org mode will, as well.

bdarcus commented 3 years ago

@tmalsburg - correct me if I'm wrong, but the key places where you use the field names to pull data are with the bibtex-completion-get-value calls?

So what if he added a parsebib-get-value analog, which handled that mapping?

It might need to settle on one set of field names (edit: at least as fallback?), say biblatex, but then you could do:

(parsebib-get-value "date")

... and it would pull a "year" value from bibtex, or an issued from csljson.

joostkremers commented 3 years ago

So we somehow need a mechanism to do this mapping, in an easy and performant way.

Suggestions? The thing is, I'm not even sure how bibtex-completion works, exactly. Personally, I see two ways to convert CSL-JSON fields to BibTeX / biblatex:

  1. Convert fields while parsing.
  2. Convert fields on the fly when needed.

Option 1. means that all CSL-JSON fields are converted (optionally only the ones explicitly requested) and their new field names stored back in the individual entry alists. I suspect that would be a highly expensive operation, performance-wise.

Option 2. can probably be implemented more economically, even if it needs to be done for the entire database, because the alists themselves do not need to be modified. But it depends on how bibtex-completion accesses the data, which, as I mentioned, I do not know...
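The two options can be contrasted in a small sketch. Everything here is hypothetical: the mapping table is deliberately tiny (it only uses the journal/container-title and year/issued pairs mentioned in this thread), and neither function is part of parsebib or bibtex-completion.

```elisp
(defvar my/csl-to-bibtex-fields
  '(("container-title" . "journal")
    ("issued" . "year"))
  "Hypothetical, deliberately tiny CSL-JSON to BibTeX field mapping.")

(defun my/convert-entry (entry)
  "Option 1: return a copy of ENTRY with CSL-JSON field names renamed.
Every field of every entry is visited, whether it is needed or not."
  (mapcar (lambda (field)
            (let* ((name (format "%s" (car field)))
                   (new (cdr (assoc-string name my/csl-to-bibtex-fields
                                           'case-fold))))
              (cons (or new name) (cdr field))))
          entry))

(defun my/lookup (field entry)
  "Option 2: look up BibTeX-style FIELD in ENTRY, translating on the fly.
ENTRY itself is left untouched; only the requested name is mapped."
  (let* ((name (format "%s" field))
         (csl (car (rassoc name my/csl-to-bibtex-fields))))
    (cdr (or (assoc-string name entry 'case-fold)
             (and csl (assoc-string csl entry 'case-fold))))))
```

Option 1 pays the renaming cost for all fields up front, while option 2 pays a small cost per lookup, which is why the access pattern of the consumer matters.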

joostkremers commented 3 years ago

@tmalsburg - correct me if I'm wrong, but the key places where you use the field names to pull data are with the bibtex-completion-get-value calls?

So what if he added a parsebib-get-value analog, which handled that mapping?

It might need to settle on one set of field names, say biblatex, but then you could do:

(parsebib-get-value "date")

... and it would pull a "year" value from bibtex, or an "issued" from csljson.

Yes, that's basically what I mean. :slightly_smiling_face:

tmalsburg commented 3 years ago

What do you use for conversion? Or am I misunderstanding and isn't it the case that org-bibtex can convert CSL-JSON to BibTeX?

The conversion is the responsibility of the user. bibtex-completion just assumes that there is an analogous .bib for every .org. How exactly users convert to BibTeX I don't know since I'm not using org-bibtex myself.

I wouldn't assume that. Pandoc makes it possible to author publications without going through LaTeX

Sorry, I should have said "most users". Even though it's cool that org is gaining citation capabilities that don't rely on LaTeX, my suspicion is that LaTeX is going to remain the primary way to handle citations for most. I may be totally wrong of course but LaTeX is pretty deeply engrained in the academic publishing system.

tmalsburg commented 3 years ago

So what if he added a parsebib-get-value analog, which handled that mapping?

The problem is that the mapping may not be 1-to-1. An example is biblatex date vs. BibTeX day, month, year. Similar issues may arise with CSL (which I'm not familiar with yet).
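To make the one-to-many case concrete, here is a sketch that splits a biblatex-style date into BibTeX-style fields. `my/date-to-bibtex-fields` is a hypothetical helper; it only handles plain YYYY, YYYY-MM and YYYY-MM-DD strings, not date ranges or other extended forms.

```elisp
(defun my/date-to-bibtex-fields (date)
  "Split a DATE string such as \"2021-05-18\" into BibTeX-style fields.
Return an alist with \"year\", \"month\" and \"day\", omitting parts
that are absent.  Ranges and other extended forms are not handled."
  (let ((parts (split-string date "-" t))
        (names '("year" "month" "day"))
        result)
    ;; Pair each date part with the next field name, in order.
    (while (and parts names)
      (push (cons (pop names) (pop parts)) result))
    (nreverse result)))
```

Going the other way (joining year, month and day into one date) is equally mechanical, but a single `get-value` call can then no longer be a plain rename.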

bdarcus commented 3 years ago

It's true data like names and dates are more complicated than simple strings. But it shouldn't be hard to address those.

joostkremers commented 3 years ago

It's true data like names and dates are more complicated than simple strings. But it shouldn't be hard to address those.

Are there other fields where conversion would be problematic? My impression is that there aren't. And I agree it shouldn't be hard to come up with something.

I could come up with a parsebib-get-value that assumes biblatex fields and translates them when necessary. I'd start small, with the most important fields, and then add more mappings when they come up.

tmalsburg commented 3 years ago

Are there other fields where conversion would be problematic?

I don't know. You're probably in a better position to tell since you know CSL and I don't.

I could come up with a parsebib-get-value that assumes biblatex fields and translates them when necessary.

But wouldn't it be confusing for a user with a CSL bibliography if they had to specify formatting strings using bibtex field names? Equally, I'd find it annoying if I had to learn biblatex terminology even though my bibliography is in bibtex. It may be the best solution anyway, but it's not pretty.

By the way, I would use BibTeX field names as the default, not biblatex. My experience is that the majority of users are using the BibTeX format (e.g. year instead of date) even if they're using biber/biblatex in their LaTeX workflow. Plus, Crossref and basically all journals export classic BibTeX.

tmalsburg commented 3 years ago

Option 2. can probably be implemented more economically, even if it needs to be done for the entire database, because the alists themselves do not need to be modified. But it depends on how bibtex-completion accesses the data, which, as I mentioned, I do not know...

Performance doesn't just depend on bibtex-completion, but also on the UI frontend. Helm is pretty clever in only formatting entries that actually show up on the screen, whereas Ivy (I think) formats all entries (last time I checked). Not sure about the completing-read UI.

bdarcus commented 3 years ago

Are there other fields where conversion would be problematic?

I don't know. You're probably in a better position to tell since you know CSL and I don't.

In the future, probably titles.

I could come up with a parsebib-get-value that assumes biblatex fields and translates them when necessary.

But wouldn't it be confusing for a user with a CSL bibliography if they had to specify formatting strings using bibtex field names?

  1. I don't think a user should typically have to worry much about field names at all.
  2. If such a user does, to change their display config, it's likely they know little about the csl field names either, so I doubt it would be an issue.

Equally, I'd find it annoying if I had to learn biblatex terminology even though my bibliography is in bibtex. It may be the best solution anyway, but it's not pretty.

By the way, I would use BibTeX field names as the default, not biblatex. My experience is that the majority of users are using the BibTeX format (e.g. year instead of date) even if they're using biber/biblatex in their LaTeX workflow. Plus, Crossref and basically all journals export classic BibTeX.

This is what I was hinting at above with my note on "fallback".

Bibtex is the older, more limited, format.

But this mechanism could include both. I guess performance could become an issue, depending on the details ...

Performance doesn't just depend on bibtex-completion, but also on the UI frontend. Helm is pretty clever in only formatting entries that actually show up on the screen, whereas Ivy (I think) formats all entries (last time I checked). Not sure about the completing-read UI.

In bibtex-actions, I can't use bibtex-completion-candidates without sacrificing things that users want (like match highlighting), so have to recreate my own pre-formatted candidates from that.

It might in theory be better for me to just use parsebib-parse, etc for this directly, but I do need to format the full candidate list upfront, so this would likely be a bottleneck.

joostkremers commented 3 years ago

Are there other fields where conversion would be problematic?

I don't know. You're probably in a better position to tell since you know CSL and I don't.

Well, I've looked at the spec, but that's about it. :slightly_smiling_face:

But wouldn't it be confusing for a user with a CSL bibliography if they had to specify formatting strings using bibtex field names?

It would still be possible to access the data using the json field names, at least if conversion is done on the fly, not in the database itself. So you could do (parsebib-get-value 'year <json-entry>) and get the value of the issued field, but you could also do (parsebib-get-value 'issued <json-entry>). The only thing that would not be possible is (parsebib-get-value 'issued <bib-entry>). (Unless I also add a mapping from CSL-JSON fields to biblatex fields, of course.)

Equally, I'd find it annoying if I had to learn biblatex terminology even though my bibliography is in bibtex. It may be the best solution anyway, but it's not pretty.

From what I can tell by looking at the helm-bibtex readme, users don't normally have to deal with the field names at all, except when they want to customise the search display. And in that case it's probably safe to assume they know the underlying format well enough.

It would mainly be a convenience for you, so that you don't have to write things like:

(or (bibtex-completion-get-value 'year entry)
    (bibtex-completion-get-value 'date entry)
    (bibtex-completion-get-value 'issued entry))

Instead, you could write

(parsebib-get-value 'year entry)

and parsebib would make sure you get the right value, regardless of the format of entry.

joostkremers commented 3 years ago

By the way, I would use BibTeX field names as the default, not biblatex.

Since biblatex is the more expressive format, I would prefer those, because it should in theory be easier to go from biblatex field to BibTeX field than vice versa. Though in practice it might not matter that much.

bdarcus commented 3 years ago

You do have the stringify functions, where you can handle the more complex fields.

How would that interact with this; say if you wanted to specify a year or month for a date, main title from a title, last names for authors, etc.?

tmalsburg commented 3 years ago

It would still be possible to access the data using the json field names,

Wouldn't this create room for ambiguity? Say BibTeX has field A that maps to field B in CSL, but CSL also has a field A, then it's not clear which A is being requested. Not sure whether such a scenario will arise, perhaps not, but it's at least technically possible.

joostkremers commented 3 years ago

It would still be possible to access the data using the json field names,

Wouldn't this create room for ambiguity? Say BibTeX has field A that maps to field B in CSL, but CSL also has a field A, then it's not clear which A is being requested. Not sure whether such a scenario will arise, perhaps not, but it's at least technically possible.

True. I don't think there are many occasions, but at least the type field comes to mind, which biblatex uses to record subtypes of certain entry types (e.g., Thesis with type = "PhD Thesis"), while CSL-JSON uses it to record the entry type.

Perhaps it'd be possible to check for which fields this risk arises and handle them specially.

It's kinda up to you if you want such a mapping or not. :slightly_smiling_face: In Ebib, I already distinguish between BibTeX and biblatex files with their different sets of entry types and fields, so it won't be much of a problem to add a third database format. So ATM I don't think I'd be using this mapping.

joostkremers commented 3 years ago

You do have the stringify functions, where you can handle the more complex fields.

How would that interact with this; say if you wanted to specify a year or month for a date, main title from a title, last names for authors, etc.?

Not sure what you mean... What scenario do you have in mind, exactly?

bdarcus commented 3 years ago

Not sure what you mean... What scenario do you have in mind, exactly?

Well, bottom line, these are the two templates I have with their defaults.

So of note, in this line:

'((t . "${author:20}   ${title:48}   ${year:4}"))

... "author" is actually formatted with some bibtex-completion function that prints a list of author last names, while "year" obviously will pull bibtex "year", but also (via some other bibtex-completion code) biblatex "date", and so the "4" just pulls the first four characters.

So bibtex-completion already does some mapping and data formatting.

I was just wondering how something similar could work with a parsebib-get-value function.

Perhaps "year" would become "date", but template would otherwise stay the same?
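For illustration, a sketch of how a `${field:width}` template could be expanded against an entry alist. This is not bibtex-completion's implementation: `my/format-entry` is a hypothetical helper that only truncates or pads raw field values, without the author formatting or year/date mapping described above.

```elisp
(defun my/format-entry (template entry)
  "Expand ${FIELD:WIDTH} placeholders in TEMPLATE using ENTRY.
ENTRY is an alist of fields.  Each value is truncated or padded with
spaces to exactly WIDTH columns; missing fields become blanks."
  (replace-regexp-in-string
   "\\${\\([^}:]+\\):\\([0-9]+\\)}"
   (lambda (match)
     (let ((field (match-string 1 match))
           (width (string-to-number (match-string 2 match))))
       (truncate-string-to-width
        (or (cdr (assoc-string field entry 'case-fold)) "")
        width 0 ?\s)))
   template
   ;; FIXEDCASE and LITERAL: insert the values exactly as returned.
   t t))
```

With this sketch, `${year:4}` applied to a value such as "2021-05-18" keeps only the first four characters, which is the effect described for the "4" above.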

joostkremers commented 3 years ago

I was just wondering how something similar could work with a parsebib-get-value function.

Perhaps "year" would become "date", but template would otherwise stay the same?

My thinking now is that parsebib-get-value would take an optional argument that controls this behaviour. If nil, it would just call assoc-string and return the value. If non-nil, it would also try alternative fields, based on some schema.

I would probably keep year, but if that field doesn't exist, check date as well. If that doesn't yield anything, issued would be tried next.

Note that this wouldn't necessarily just be to accommodate the different formats. If author yields nil, one may well want to get the editor field instead.
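A minimal sketch of that idea, under stated assumptions: `my/get-value` and `my/field-alternatives` are hypothetical names, and the fallback schema contains only the two chains mentioned here (year, date, issued and author, editor). It is not the function parsebib ships.

```elisp
(defvar my/field-alternatives
  '(("year" "date" "issued")
    ("author" "editor"))
  "Hypothetical fallback schema.
Each element is a field name followed by the fields to try next.")

(defun my/get-value (field entry &optional alternatives)
  "Return the value of FIELD in ENTRY, or nil if it is absent.
FIELD may be a string or a symbol.  If ALTERNATIVES is non-nil,
also try the fallbacks listed in `my/field-alternatives'."
  (let* ((name (format "%s" field))
         (candidates (or (and alternatives
                              (assoc-string name my/field-alternatives
                                            'case-fold))
                         (list name)))
         value)
    ;; Try each candidate in order until one yields a value.
    (while (and candidates (not value))
      (setq value (cdr (assoc-string (pop candidates) entry 'case-fold))))
    value))
```

Called as `(my/get-value 'year entry)` it behaves like a plain `assoc-string` lookup; called as `(my/get-value 'year entry 'alternatives)` it falls through to date and then issued.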

bdarcus commented 3 years ago

I was just wondering how something similar could work with a parsebib-get-value function. Perhaps "year" would become "date", but template would otherwise stay the same?

My thinking now is that parsebib-get-value would take an optional argument that controls this behaviour. If nil, it would just call assoc-string and return the value. If non-nil, it would also try alternative fields, based on some schema.

Yeah, was thinking the same. Something like:

(parsebib-get-value 'author entry 'short)

... so we can get rendering like:

[Image: Screenshot from 2021-05-18 09-09-07]

Edit: could also do:

(parsebib-get-value 'title entry 'short)

... which now could pull csl-json title-short if available, or in the future the main title; could split a full title, etc.

Templates could be adapted to support that with something like:

{author:15/short}

I would probably keep year, but if that field doesn't exist, check date as well. If that doesn't yield anything, issued would be tried next.

Note that this wouldn't necessarily just be to accommodate the different formats. If author yields nil, one may well want to get the editor field instead.

Right!

tmalsburg commented 3 years ago

Hi both. I'm moving later this week. Lots of boxes to pack. I will catch up with this thread next week.

bdarcus commented 3 years ago

FYI, @joostkremers, I've opened a linked issue for how I'd adapt bibtex-actions to this.

Happy to experiment once you have parsebib-get-value working.

I'd think a similar approach would work for bibtex-completion.

Caveat: some bibtex-completion functions, which bibtex-actions depends on, currently depend on its parsing code. See, for example, bibtex-completion-show-entry. But it looks like those should be easy enough to adapt to use parsebib-get-value there.

I also retitled this issue.

Good luck with the moving @tmalsburg!

joostkremers commented 3 years ago

FYI, @joostkremers, I've opened a linked issue for how I'd adapt bibtex-actions to this.

Cool. I've subscribed so I'll be kept up-to-date.

Happy to experiment once you have parsebib-get-value working.

About that: I'm not sure we're exactly on the same page here... :slightly_smiling_face: You mention being able to do:

(parsebib-get-value 'author entry 'short)

But I'm not sure what short should mean. My idea was to have something like:

(parsebib-get-value 'author entry 'alternatives)

where alternatives indicates that if author doesn't exist in entry, it should try editor next; and if you pass year:

(parsebib-get-value 'year entry 'alternatives)

you'll get the value of date if year does not exist, and of issued in case date also does not exist.

If you need parsebib-get-value to do more, feel free to let me know.

Good luck with the moving @tmalsburg!

Hear, hear!

bdarcus commented 3 years ago

I misread/got ahead of you earlier.

To clarify, the 'alternatives idea is a good one, and useful.

What I was referring to there is obviously different, if related, but hopefully self-explanatory.

The author string you have currently for "display", for example, includes the full names, and so something like that example would just ask for the short names; a display variant, if you will.

Not sure it's needed, but are you thinking to have those alternatives configurable somehow?
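As an illustration of such a display variant, here is a sketch that reduces a BibTeX-style name list to last names. `my/short-names` is a hypothetical helper; it assumes names are separated by " and " and ignores complications such as braces or name particles.

```elisp
(defun my/short-names (names)
  "Return the last names in NAMES, joined with commas.
NAMES is assumed to be a BibTeX-style string in which individual
names are separated by \" and \", each either \"Last, First\" or
\"First Last\".  Braces and name particles are not handled."
  (mapconcat
   (lambda (name)
     (if (string-match "," name)
         ;; \"Last, First\": keep everything before the comma.
         (car (split-string (substring name 0 (match-beginning 0)) nil t))
       ;; \"First Last\": keep the final word.
       (car (last (split-string name)))))
   (split-string names " and " t)
   ", "))
```

Whether this belongs in a lookup function or in parsing is exactly the open question in this exchange; the sketch only shows that the transformation itself is small.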

joostkremers commented 3 years ago

The author string you have currently for "display", for example, includes the full names, and so something like that example would just ask for the short names; a display variant, if you will.

That would be possible, of course. It could also be done during parsing, BTW. I don't know which option would be better.

Not sure it's needed, but are you thinking to have those alternatives configurable somehow?

The way JSON name fields are stringified during parsing is configurable with the variable parsebib-json-name-field-template. That could be generalised.

bdarcus commented 3 years ago

The way JSON name fields are stringified during parsing is configurable with the variable parsebib-json-name-field-template.

That's perfect, and is the main thing I need for this.

That could be generalised.

So basically just add a new defvar as needed? You have one for names and another for date, which is really all we need ATM I think.

Now reading the docs again, and this section in particular, is the idea that you are normalizing on EDTF for dates? That sentence that begins "Date fields (as defined by parsebib--json-date-fields) are converted" is a little confusing to me.

joostkremers commented 3 years ago

Now reading the docs again, and this section in particular, is the idea that you are normalizing on EDTF for dates? That sentence that begins "Date fields (as defined by parsebib--json-date-fields) are converted" is a little confusing to me.

The code might be clearer... :worried: The variable parsebib--json-date-fields holds a list of fields that are date fields. If such a date field's value is a string, it is not modified. If it is an object, it is converted to a string using the template "{circa }{season }{start-date}{/end-date}{literal}{raw}". Unlike name fields, however, that template isn't let-bindable, because it doesn't apply to the fields in the object directly.

The details are in the function parsebib--json-stringify-date-field, but basically, if a date field just contains a date or a year, the resulting string has the form "2021-4-22" or "2021". If season or circa are present, it may also be "Summer 2012" or "ca. 2000", etc.

parsebib--json-stringify-date-field has an extra argument short, which, if t, returns just the year, which I guess is what you need.
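A simplified sketch of this kind of conversion, not parsebib's actual code: `my/stringify-date` is a hypothetical function, it assumes the JSON object was parsed into an alist with vectors for arrays, and it leaves out the circa and season parts of the template.

```elisp
(defun my/stringify-date (date &optional short)
  "Turn a CSL-JSON DATE value into a string.
DATE is either a string, which is returned unchanged, or an alist
such as ((date-parts . [[2021 4 22]])), an assumed parse of the
JSON object {\"date-parts\": [[2021, 4, 22]]}.  With non-nil SHORT,
return only the year of the start date."
  (if (stringp date)
      date
    (let* ((ranges (append (alist-get 'date-parts date) nil)) ; vector -> list
           (start (append (car ranges) nil))
           (end (append (cadr ranges) nil))
           ;; Date parts may be numbers or strings, so use `format'.
           (join (lambda (parts)
                   (mapconcat (lambda (p) (format "%s" p)) parts "-"))))
      (cond
       ((null start)
        (or (alist-get 'literal date) (alist-get 'raw date) ""))
       (short (format "%s" (car start)))
       (end (concat (funcall join start) "/" (funcall join end)))
       (t (funcall join start))))))
```

The output forms follow the examples given here ("2021-4-22", "2021"), with a date range rendered as start and end separated by a slash.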

bdarcus commented 3 years ago

is the idea that you are normalizing on EDTF for dates?

If it is an object, it is converted to a string using the template "{circa }{season }{start-date}{/end-date}{literal}{raw}".

So that template is similar to EDTF.

bdarcus commented 3 years ago

Any estimate of when you can get back to and merge this @joostkremers?

With org-cite now merged, would be great to get json support in bibtex-completion et al.

tmalsburg commented 3 years ago

With org-cite now merged, would be great to get json support in bibtex-completion et al.

I'm not sure that bibtex-completion can accommodate CSL. CSL seems to differ in too many respects and breaks too many assumptions we're making in bibtex-completion. Bibtex-completion has difficulties accommodating even the biblatex dialect, which is not terribly different from bibtex. If we force biblatex and CSL into bibtex-completion, my worry is that the code becomes buggy and impossible to understand and maintain. My impression is that we may need separate csl-completion and biblatex-completion modules that can be plugged in elsewhere. For compatibility, their interfaces should mirror the API of bibtex-completion, and there is perhaps also some code that can be shared. I think this would give us much better support for each individual format, more flexibility, and more reliable / correct code.