citation-style-language / schema

Citation Style Language schema
https://citationstyles.org/
MIT License

Add abbreviation list mechanism #321

Open denismaier opened 4 years ago

denismaier commented 4 years ago

The question of abbreviations, I suggest, should be handled by adopting lists of abbreviations for events and container titles. Zotero currently generates journal abbreviations automatically from the MEDLINE list and has some hack-y methods to specify a different list. We should discuss whether a formally specified abbreviation list mechanism makes sense. I think it does.

Originally posted by @bwiernik in https://github.com/citation-style-language/schema/pull/268#issuecomment-659661693

I think we should absolutely add a mechanism for abbreviations. CSL 1.0.2 will improve the citation-label feature and add a classic item type. For this an abbreviation list mechanism would be useful.

bdarcus commented 4 years ago

Frank disagrees here.

bwiernik commented 4 years ago

This is sort of similar to the situation of multi-section bibliographies. We might include some hinting in the style (e.g., noting what abbreviations list is used), but it can probably stay at the calling application level for supplying the abbreviations.

denismaier commented 4 years ago

Yes. Currently there seems to exist an unofficial abbreviation list format supported by citeproc-js and pandoc-citeproc. Could be a good idea to formalize this so calling applications can supply them in this format.

bdarcus commented 4 years ago

We could define a simple JSON/YAML schema? What is that format they use?

denismaier commented 4 years ago

Here's how it looks:

{
  "default": {
    "container-title": {
      "Lloyd's Law Reports": "Lloyd's Rep",
      "Estates Gazette": "EG",
      "Scots Law Times": "SLT"
    }
  }
}

See https://github.com/jgm/pandoc-citeproc/blob/master/man/pandoc-citeproc.1.md
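For anyone who wants to experiment with this format, here is a minimal Python sketch of loading such a file and looking up one entry. The `lookup` helper and its `group` parameter are hypothetical names chosen for illustration; they are not part of citeproc-js or pandoc-citeproc.

```python
import json

# The abbreviation list shown above, embedded as a JSON string.
ABBREVIATIONS_JSON = """
{
  "default": {
    "container-title": {
      "Lloyd's Law Reports": "Lloyd's Rep",
      "Estates Gazette": "EG",
      "Scots Law Times": "SLT"
    }
  }
}
"""


def lookup(abbreviations, variable, value, group="default"):
    """Return the abbreviation listed for `value`, or `value` itself if none exists.

    `group` is the top-level key of the file ("default" in the example above).
    """
    return abbreviations.get(group, {}).get(variable, {}).get(value, value)


abbreviations = json.loads(ABBREVIATIONS_JSON)
short_title = lookup(abbreviations, "container-title", "Estates Gazette")
```

Returning the unabbreviated value when no entry matches is a design choice of this sketch, so that an incomplete list never produces empty output.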

I'm sure @fbennett has more info somewhere, but I couldn't find it at the moment...

Also, since we now have a citation-key this could be extended.

bwiernik commented 4 years ago

This is Zotero's MEDLINE abbreviations JSON file: https://raw.githubusercontent.com/zotero/zotero/master/resource/schema/abbreviations.json

bdarcus commented 4 years ago

What's the significance of "default" there? What would be non-default?

bwiernik commented 4 years ago

I believe that is indicating item type, with default being "all others".

Edit: Actually, no, it appears to be a jurisdiction/locale indicator: https://github.com/Juris-M/citeproc-js/blob/7310d8400166221dd3e6d24767938aff283f7f32/src/util_transform.js#L104
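If the top-level key is a jurisdiction, a lookup would presumably try the specific jurisdiction first and fall back to "default". The Python sketch below illustrates that idea only. The colon-separated jurisdiction codes, the walk from most to least specific, and the "gb:sc" entry are assumptions made for the example, not a description of what citeproc-js actually does.

```python
def lookup_with_fallback(abbreviations, jurisdiction, variable, value):
    """Look up `value`, trying the most specific jurisdiction first.

    Assumption: jurisdictions are colon-separated codes such as "gb:sc",
    and "default" is the last resort. Returns None when nothing matches.
    """
    parts = jurisdiction.split(":") if jurisdiction else []
    # "gb:sc" -> ["gb:sc", "gb"], then "default" is appended.
    candidates = [":".join(parts[:i]) for i in range(len(parts), 0, -1)]
    candidates.append("default")
    for key in candidates:
        abbreviation = abbreviations.get(key, {}).get(variable, {}).get(value)
        if abbreviation is not None:
            return abbreviation
    return None


# Hypothetical data: the "gb:sc" override is invented for this example.
example = {
    "default": {"container-title": {"Scots Law Times": "SLT"}},
    "gb:sc": {"container-title": {"Scots Law Times": "S.L.T."}},
}
scottish = lookup_with_fallback(example, "gb:sc", "container-title", "Scots Law Times")
generic = lookup_with_fallback(example, "gb", "container-title", "Scots Law Times")
```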

bwiernik commented 4 years ago

Here is where Zotero loads the abbreviations file to pass to the processor: https://github.com/zotero/zotero/blob/91ca6d2ba6cbe6256a0adc97f3d3de05a9bcb833/chrome/content/zotero/xpcom/cite.js#L435

denismaier commented 4 years ago

In addition to the schema question, I think this should address these scenarios:

a) an abbreviation file on a per-document basis
b) an abbreviation file on a per-style basis
c) an abbreviation file on a more general basis, say per discipline
d) a combination of a), b), and c)

Multiple abbreviation lists should be possible, of course.
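One way a calling application could combine several lists is to merge them in order of precedence. The Python sketch below assumes that a per-document list overrides a per-style list, which in turn overrides a per-discipline list. That ordering, the function name, and the sample entries are assumptions for illustration; nothing in CSL defines them.

```python
def merge_abbreviation_lists(*lists):
    """Merge abbreviation lists given from lowest to highest precedence.

    Later lists override earlier ones entry by entry. The input lists
    are not modified; a new nested dict is returned.
    """
    merged = {}
    for abbreviation_list in lists:
        for group, variables in abbreviation_list.items():
            for variable, entries in variables.items():
                merged.setdefault(group, {}).setdefault(variable, {}).update(entries)
    return merged


# Hypothetical lists for scenarios c), b), and a).
discipline_list = {"default": {"container-title": {"Scots Law Times": "SLT",
                                                   "Estates Gazette": "EG"}}}
style_list = {"default": {"container-title": {"Estates Gazette": "Est. Gaz."}}}
document_list = {"default": {"container-title": {"Scots Law Times": "S.L.T."}}}

combined = merge_abbreviation_lists(discipline_list, style_list, document_list)
```

Merging entry by entry, rather than replacing whole lists, lets a document override a single title while still inheriting everything else.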

bwiernik commented 4 years ago

The gist of what citeproc-js/pandoc-citeproc currently do:

  1. Take an abbreviations file
  2. Do abbreviation replacement based on that file when form="short"
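The two steps above can be sketched in a few lines of Python. This is a simplification under stated assumptions: `render_variable` is a hypothetical name, only the "default" group is consulted, and the item's own short-form fields (which a real processor also takes into account) are ignored.

```python
def render_variable(item, variable, form, abbreviations):
    """Return the text to print for `variable` of `item`.

    When form is "short", the full value is looked up in the abbreviation
    list; if there is no entry, the full value is printed unchanged.
    """
    full_value = item.get(variable, "")
    if form != "short":
        return full_value
    entries = abbreviations.get("default", {}).get(variable, {})
    return entries.get(full_value, full_value)


abbreviations = {"default": {"container-title": {"Scots Law Times": "SLT"}}}
item = {"id": "item-1", "container-title": "Scots Law Times"}

short_output = render_variable(item, "container-title", "short", abbreviations)
long_output = render_variable(item, "container-title", "long", abbreviations)
```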

The extent of support I think might be reasonable:

  1. Formalize this structure:
    1. The existing structure of abbreviation files
    2. The logic for replacement based on current behavior
  2. Provide an informational element in the style info indicating what abbreviations list a style uses.
    1. This wouldn't have to be used by the citation processor, but a calling application might use it to supply different abbreviation lists, e.g., MEDLINE for medical journals but the Society of Biblical Literature's own list for the SBL style.

denismaier commented 4 years ago

> Also, since we now have a citation-key this could be extended.

Here, I was thinking about something like this (using yaml for simplicity):

citation-label:
  - id: item-1
    citation-label: XYZ

So, this would add citation-label for a specific item based on the citation-key.
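A rough Python sketch of how a calling application might apply such per-document overrides, assuming the YAML above has already been parsed into a list of `id` / `citation-label` pairs. The function name and the sample items are hypothetical, and the library data is left untouched so that the label only applies within this document.

```python
def apply_citation_labels(items, overrides):
    """Return copies of `items` with per-document citation labels applied.

    `overrides` mirrors the YAML above: a list of mappings with an "id"
    and a "citation-label". Items without an override are passed through.
    """
    labels = {entry["id"]: entry["citation-label"] for entry in overrides}
    return [
        {**item, "citation-label": labels[item["id"]]} if item["id"] in labels else item
        for item in items
    ]


# Hypothetical library items and the override from the YAML example.
library_items = [
    {"id": "item-1", "title": "First example title"},
    {"id": "item-2", "title": "Second example title"},
]
document_overrides = [{"id": "item-1", "citation-label": "XYZ"}]

document_items = apply_citation_labels(library_items, document_overrides)
```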

bwiernik commented 4 years ago

I don't understand what that would be for? Why not just add citation-label to the item data? Can you give a real example?

denismaier commented 4 years ago

> Why not just add citation-label to the item data?

Because citation-label may be used only in this document. Say you're writing an essay on Freud's "Das Unbehagen in der Kultur" and want to use the label "UK" for this text. You will use this label only in this document, or maybe in a book on Freud, but not in any other document.

denismaier commented 4 years ago

Just to add to this: This is mainly a problem for GUI apps where you'll build a huge, career-long database. That's much less of an issue if you're using pandoc and the like.

bwiernik commented 4 years ago

Hmm, three thoughts:

  1. That is all about populating the citation-label field. That’s a fairly different case from automatically abbreviating journal or conference titles, which is about looking up the form="short" content for a field (container-title) from its full-length content.
  2. As you say, this is more an issue for GUI citation managers rather than a system like pandoc.

With that in mind, I’d suggest that citation-label is a problem for the calling application to solve. Either the user can enter it into the data, or the application can provide a way to fill the citation-label field for the item in the document (e.g., in Zotero, if a style calls citation-label, a text entry box could be added to the locator/suffix/prefix screen).

denismaier commented 4 years ago

Ok. Then this means we formalize the current structure.

> With that in mind, I’d suggest that citation-label is a problem for the calling application to solve. Either the user can enter it into the data, or the application can provide a way to fill the citation-label field for the item in the document (e.g., in Zotero, if a style calls citation-label, a text entry box could be added to the locator/suffix/prefix screen).

How do you think we should/could communicate this to calling application devs? Like: "CSL has a special variable citation-label that can be used in X and Y ways. For optimal UX please provide ways to do Z."