inukshuk / citeproc-ruby

A Citation Style Language (CSL) Cite Processor

Problems with umlauts in author names #8

Closed (gladwig closed this issue 10 years ago)

gladwig commented 12 years ago

Hi,

I tried formatting some BibTeX files with bibtex-ruby and citeproc-ruby and encountered a problem when author names contain umlauts (ä, ö, ü). The problem seems to lie with citeproc-ruby (note the family name with an umlaut):

book = {
  'author' => [{ 'given' => 'Edgar Allen', 'family' => 'Müller' }],
  'title' => 'Poetry, Tales, and Selected Essays',
  'type' => 'book',
  'issued' => { 'date-parts' => [[1996]] },
  'editor' => [{ 'family' => 'Quinn', 'given' => 'Patrick F.'}, { 'family' => 'Thompson', 'given' => 'G.R.' }],
  'publisher' => 'Library of America',
  'publisher-place' => 'New York'
}
>> CiteProc.process(book)
=> "M\303\274llerEdgar. (1996). Poetry, Tales, and Selected Essays.  (P. F. Quinn & G. R. Thompson, Eds., , Tran.). New York: Library of America."

As you can see the family and given name are mangled. Without the umlaut everything is fine:

book = {
  'author' => [{ 'given' => 'Edgar Allen', 'family' => 'Mueller' }],
  'title' => 'Poetry, Tales, and Selected Essays',
  'type' => 'book',
  'issued' => { 'date-parts' => [[1996]] },
  'editor' => [{ 'family' => 'Quinn', 'given' => 'Patrick F.'}, { 'family' => 'Thompson', 'given' => 'G.R.' }],
  'publisher' => 'Library of America',
  'publisher-place' => 'New York'
}
>> CiteProc.process(book)
=> "Mueller, E. A. (1996). Poetry, Tales, and Selected Essays.  (P. F. Quinn & G. R. Thompson, Eds., , Tran.). New York: Library of America."

inukshuk commented 12 years ago

Which Ruby version did you use for the tests? I just ran this on 1.9.3 and the umlaut works as expected.

gladwig commented 12 years ago

Ah, sorry for not mentioning that. This is the Mac OS X Lion default, I think:

ruby 1.8.7 (2010-01-10 patchlevel 249) [universal-darwin11.0]
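
For context, the `\303\274` in the mangled output is the octal-escaped UTF-8 encoding of `ü`: Ruby 1.8 treats a string as a plain sequence of bytes, while 1.9 and later track the encoding and count characters. The sketch below runs on Ruby 1.9.3 or later and only illustrates that difference; it is not code from citeproc-ruby, and `octal_escape` is a hypothetical helper written for this example.

```ruby
# Illustration only: byte view vs. character view of a name with an umlaut.
# "M\u00FCller" is "Müller", written with an escape so the source file
# encoding does not matter.
name = "M\u00FCller"

# Character view (Ruby 1.9+): six characters.
char_count = name.length

# Byte view (what Ruby 1.8 works with): seven bytes, because "ü" is
# encoded as the two bytes 0xC3 0xBC in UTF-8.
byte_count = name.bytesize

# Hypothetical helper: print non-ASCII bytes as octal escapes, similar to
# how the string appears in the Ruby 1.8 output quoted in this issue.
def octal_escape(str)
  str.each_byte.map { |b| b < 128 ? b.chr : format('\\%o', b) }.join
end

escaped = octal_escape(name)

# Cutting the string at a byte offset can split the umlaut in half and
# leave an invalid UTF-8 fragment.
broken = name.byteslice(0, 2)

puts char_count
puts byte_count
puts escaped
puts broken.valid_encoding?
```

Any name handling that works on byte offsets rather than characters could split a multi-byte letter in this way. Whether that is what happens inside citeproc-ruby on 1.8 is an assumption, not something confirmed in this thread.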

inukshuk commented 12 years ago

Yes, it's very likely that there are still Unicode issues in 1.8. Do you have the ActiveSupport gem installed? That may help. Otherwise, there is not much we can do at the moment, because citeproc-ruby is in the middle of a rewrite. However, there are options: