jgm / pandoc-citeproc

Library and executable for using citeproc with pandoc
BSD 3-Clause "New" or "Revised" License

Individual locales for entries #294

Closed kugelblitz closed 7 years ago

kugelblitz commented 7 years ago

Hello,

In a far, big, cold country called Russia, there is a national standard that requires bibliographical references to be displayed in a certain way.

In particular, according to that standard, references should use English locale settings (e.g. "Ed.", "pp.", "Vol.", etc.) when their titles are in English, and Russian locale settings ("под ред.", "С.", "Т.") when their titles are in Russian.

Since English and Russian references can be in the same document, one global "default locale" option will not do the trick.

I found no way of setting an individual locale per entry, be it in a .bibtex file or via direct YAML input in the source Markdown file. The most obvious candidate would be the "language" entry in the .bibtex file; however, as I see in the pandoc-citeproc source, it is treated as the hyphenation mode, not as the locale.

How can one get individual locale settings?

Thanks

jgm commented 7 years ago

I have a vague memory that something like this is possible, but I don't recall the details. Hoping @nickbart can help.

njbart commented 7 years ago

I tend to think this is not possible in standard CSL.

It seems to be available in Multilingual Zotero (MLZ) (which uses an extension of the CSL standard called CSL-M, documented here). The cs:layout extension, which as far as I understand can do what Russian conventions require, is described here.

citeproc-js implements both CSL and CSL-M, so I’d imagine pandoc-citeproc could do the same, though I realize of course this would involve substantial effort.

kugelblitz commented 7 years ago

There is something I do not understand, most probably because I am new to CSL. What does the cs:layout extension exist for? The linked page reads:

In the example above, an item with en, es or de (or de-AT) set in the language variable will be render by the layout-citation-roman macro, with locale terms set to the appropriate language.

Why are the macros needed when we already have these nice tags with an attribute to check against the language setting? In this code:

  <locale xml:lang="en">
    <terms>
      <term name="editor" form="verb-short">ed.</term>
    </terms>
  </locale>
  <locale xml:lang="ru">
    <terms>
      <term name="editor" form="verb-short">под ред.</term>
    </terms>
  </locale>

I see nothing that prevents the citation processor from checking xml:lang (or whatever else) against the contents of the bibliography file, since it already respects the default-locale setting. Am I missing some XML transformation limitation, or have I got the whole idea wrong?

On another note, do you know whether citeproc-js can be used as a filter for pandoc?

Many thanks for your help.

Regards,

Dmitry

njbart commented 7 years ago

My understanding is that the current CSL specs and practices do not allow processors to test for the content of CSL variables. (One of the few exceptions is testing whether the language variable starts with “en” or is empty, but this is built into the processors and not accessible via CSL style files.) The CSL folks insist this is a feature not a bug, so I understand the only possible workaround is to use an otherwise unused variable, e.g., “Extra”, populate it with some content for Russian entries (and nothing for all others), and test in a CSL style file for this variable’s presence or absence (see, e.g., https://forums.zotero.org/discussion/57223/preferred-method-fo-indicating-non-latin-language).

kugelblitz commented 7 years ago

Nick,

Thank you very much for the explanation. You raise two important points.

  1. Testing if language field starts with "en". That would be enough for my purposes, since I have only English and Russian sources, with very few exceptions. But from what I read in the CSL docs, I gather that I can not really test that in the CSL. The language field affects only the casing of the sentences and nothing else.

  2. The solution with the "Extra" field would be OK for me. I can manage to insert that field into my bibliography entries. I do not know how to test a presence of a variable in the CSL, but I can study and learn that.

But first, it will be important to know if it is worth trying with the present state of pandoc-citeproc, and here I beg for your advice again. Will pandoc-citeproc pass my "Extra" field from a bibtex entry to the CSL processor? Or from YAML source?

Best regards,

Dmitry


njbart commented 7 years ago

Testing if language field starts with "en". […] But from what I read in the CSL docs, I gather that I can not really test that in the CSL.

Correct.

Will pandoc-citeproc pass my "Extra" field from a bibtex entry to the CSL processor? Or from YAML source?

Yes. Note that “Extra” is the name of the field in Zotero. In bibtex, biblatex, CSL JSON and CSL YAML the field/variable name is “note”. Actually, you could use any CSL variable listed in the specs that is not already in use in your CSL style file. On the bibtex side, any field that pandoc-citeproc can map to a suitable CSL variable could be used. (Test this with pandoc-citeproc -y myfile.bib.)

I do not know how to test a presence of a variable in the CSL, but I can study and learn that.

<choose>
  <if variable="note">
    <!-- insert Russian stuff -->
  </if>
  <else>
    <!-- insert non-Russian stuff -->
  </else>
</choose>

should do the trick. The Zotero forum is a good place to ask for advice on CSL style file issues, too.
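As a rough illustration of the pattern above, here is a minimal sketch of a macro that switches a page label on the presence of "note". The macro name pages-label and the literal "С." are assumptions made for this example and are not taken from any existing style; the else branch uses the standard CSL "page" term.

```xml
<!-- Sketch only: "pages-label" is a hypothetical macro name. -->
<macro name="pages-label">
  <choose>
    <if variable="note">
      <!-- Entry flagged as Russian via a non-empty "note": emit the label literally -->
      <text value="С."/>
    </if>
    <else>
      <!-- Unflagged entry: fall back to the locale's regular short "page" term -->
      <text term="page" form="short" text-case="capitalize-first"/>
    </else>
  </choose>
</macro>
```

A layout would then call it with <text macro="pages-label"/> wherever the label is needed. Note that this only tests whether "note" is present, so any entry carrying a note for other reasons would also be treated as Russian.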

kugelblitz commented 7 years ago

Nick,

I have obtained what I need, using the "note" field for Russian sources and the <choose> tag. What I have done is a very dirty hack, though, because <choose> works only for rendering elements, not for <locale> <terms> settings. So I had to keep a single locale and insert <choose> all over the place. I had to make my own Russian variants of terms, like "pages-ru", because there is no option to get the plain "pages" term from a different locale. As a consequence, I could not use <label>, because it assumes that the variables' names match the terms' names.
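The custom-term part of this hack might look roughly like the sketch below. It assumes the processor accepts term names outside the standard CSL list (which seems to have been the case for pandoc-citeproc, given that the hack worked). The name "volume-ru" follows the same naming scheme as "pages-ru", and "Т." is the Russian abbreviation mentioned at the top of this thread.

```xml
<!-- Sketch only: a style-level locale override that adds a non-standard term. -->
<locale>
  <terms>
    <!-- "volume-ru" is not a standard CSL term; a strict processor may reject it -->
    <term name="volume-ru" form="short">Т.</term>
  </terms>
</locale>

<!-- Elsewhere in the style: pick the term by hand instead of switching locales -->
<choose>
  <if variable="note">
    <text term="volume-ru" form="short"/>
  </if>
  <else>
    <text term="volume" form="short" text-case="capitalize-first"/>
  </else>
</choose>
```

Every localized label needs its own <choose> like this, which is why the approach scales poorly.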

Anyway, I have what I need for now, thank you very much!

Regards,

Dmitry

ghost commented 7 years ago

Dmitry, can you put an example of such hacked CSL, please? Thanks

kugelblitz commented 7 years ago

Here: gost-r-7-0-5-2008-numeric-mod.csl.txt

The comments there are the original author's (or those of someone who came after the original author; it is hard to say).

Mine are the clumsy fragments like this:

          <if variable="note">
            <text term="volume-ru" form="short" text-case="sentence"/>
            <text variable="volume"/>
          </if>
          <else>
            <text term="volume" form="short" text-case="sentence" plural="false"/>
            <text variable="volume"/>
          </else>

ghost commented 7 years ago

Thank you very much!

kugelblitz commented 6 years ago

It seems that I cannot localise "et al." with this <choose> hack, because the "et al." handling is buried inside the Haskell code.