inukshuk / citeproc-ruby

A Citation Style Language (CSL) Cite Processor

Weird chars in pages for chicago citation #50

Closed dazza-codes closed 6 years ago

dazza-codes commented 6 years ago

Using citeproc-ruby (1.1.8)

Method used to generate a citation:

    def generate_csl_citation(csl_citation_data, csl_style)
      item = CiteProc::CitationItem.new(id: 'sulpub')
      item.data = CiteProc::Item.new(csl_citation_data)
      csl_renderer = CiteProc::Ruby::Renderer.new(format: 'html')
      csl_renderer.render item, csl_style.bibliography
    end

example data:

conference_pub_in_journal_hash
=> {:title=>"My test title",
 :type=>"paper-conference",
 :articlenumber=>33,
 :pages=>"33-56",
 :author=>[{:name=>"Smith, Jack", :role=>"editor"}, {:name=>"Sprat, Jill", :role=>"editor"}, {:name=>"Jones, P. L."}, {:firstname=>"Alan", :middlename=>"T", :lastname=>"Jackson"}],
 :year=>"1987",
 :supplement=>"33",
 :publisher=>"Some Publisher",
 :journal=>{:name=>"Some Journal Name", :volume=>33, :issue=>32, :year=>1999},
 :conference=>{:name=>"The Big Conference", :year=>2345, :number=>33, :location=>"Knoxville, TN", :city=>"Knoxville", :statecountry=>"TN"}}

# same data mapped into a CSL doc
csl_doc
=> {"id"=>"sulpub",
 "type"=>"article-journal",
 "author"=>[{"family"=>"Jones", "given"=>"  P. L."}, {"family"=>"Jackson", "given"=>" Alan T."}],
 "title"=>"My test title",
 "chapter-number"=>33,
 "page"=>"33-56",
 "publisher"=>"Some Publisher",
 "container-title"=>"Some Journal Name",
 "volume"=>33,
 "issue"=>32,
 "issued"=>{"date-parts"=>[["1987"]]},
 "number"=>33,
 "event"=>"The Big Conference",
 "event-date"=>{"date-parts"=>[[2345]]},
 "event-place"=>"Knoxville,TN"}

Weird page characters

# page data in the source hash (before mapping into the CSL doc)
conference_pub_in_journal_hash[:pages]
=> "33-56"
conference_pub_in_journal_hash[:pages].encoding
=> #<Encoding:UTF-8>
conference_pub_in_journal_hash[:pages].bytes
=> [51, 51, 45, 53, 54]
# same chars in the csl_doc
csl_doc['page'].bytes
=> [51, 51, 45, 53, 54]

# the result of mapping into a Chicago citation using the method above with
# CSL::Style.load('chicago-author-date')
chicago_citation
=> "Jones,   P. L., and Alan T. Jackson. 1987. “My Test Title.” <i>Some Journal Name</i> 33 (32). Some Publisher: 33–56."
chicago_citation.encoding
=> #<Encoding:UTF-8>

# the pages have some weird characters in the page separator
chicago_citation.bytes[-8..-2]
=> [51, 51, 226, 128, 147, 53, 54]

dazza-codes commented 6 years ago

Maybe this is not a bug because it's supposed to be an en-dash? I.e., Chicago style translates a regular hyphen char (code 45) into an en-dash (U+2013, UTF-8 bytes 226, 128, 147).
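The three bytes seen between "33" and "56" in the rendered citation can be checked with plain Ruby. This is a minimal sketch using only core String/Array methods; the variable names are illustrative and not part of citeproc-ruby.

```ruby
# Bytes observed between "33" and "56" in the rendered citation.
separator = [226, 128, 147].pack('C*').force_encoding(Encoding::UTF_8)

# A well-formed UTF-8 sequence that decodes to a single code point.
separator_valid      = separator.valid_encoding?
separator_codepoints = separator.codepoints

# U+2013 is EN DASH; compare against the escaped literal.
is_en_dash = (separator == "\u2013")

# The input separator, for contrast: a single ASCII hyphen-minus byte.
hyphen_bytes = '-'.bytes
```

If `is_en_dash` is true, the "weird" bytes are one valid multi-byte character rather than an encoding error.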

inukshuk commented 6 years ago

Yes, en-dash is the default page range delimiter specified by CSL and it is the one set by the (default) English locale. You can override it in your style or in your locale.
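If changing the style or locale is not practical, one fallback is to post-process the rendered string. The sketch below is not the override described above and does not use any citeproc-ruby API; `ascii_page_ranges` is a hypothetical helper name, and it assumes that an en dash sitting directly between two digits is a range delimiter.

```ruby
# Hypothetical helper: replace an en dash (U+2013) that sits directly
# between two digits with an ASCII hyphen. Other en dashes are untouched.
def ascii_page_ranges(citation)
  citation.gsub(/(?<=\d)\u2013(?=\d)/, '-')
end

rendered = "Some Publisher: 33\u201356."
plain    = ascii_page_ranges(rendered)

# An en dash surrounded by spaces is left as it is.
untouched = ascii_page_ranges("Part A \u2013 Part B")
```

Because this rewrites every digit-to-digit en dash, it would also affect year ranges and similar spans, so the style or locale setting is the cleaner option when available.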