lsegal / yard

YARD is a Ruby Documentation tool. The Y stands for "Yay!"
http://yardoc.org
MIT License

1.9.3 can not handle non-ascii characters #514

Closed: posativ closed this issue 12 years ago

posativ commented 12 years ago

With Ruby 1.9.3-p0 (and -p125) and yardoc 0.7.5, an mdash (aka –) results in three question marks, while with Ruby 1.8.7 everything is fine (using OS X 10.7 and rvm).

lsegal commented 12 years ago

Are you running with --charset utf-8? Ruby 1.9 is much more "encoding-aware" than 1.8 is, so you definitely need to care when your file encodings differ. If this is in an rb file, you'd need # encoding: utf-8 in the header, which should solve it. If it's in a readme/extra file, you can add a # @encoding utf-8 line to the top of the file as well. Of course, using --charset will switch everything over to Unicode, so if that's okay with you, you should do that.
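
For illustration, here is a minimal sketch of what the # encoding: utf-8 header changes for a string literal. The method names sample_dash and literal_info are hypothetical, made up for this example, and the dash is simply the character as it appears in the report above. It also hints at why three question marks might show up: the dash occupies three bytes in UTF-8, none of which is valid US-ASCII. That reading is an assumption, not something confirmed in this thread.

```ruby
# encoding: utf-8
# The magic comment must be on the first line (or the second, after a shebang).
# On Ruby 1.9 it makes literals in this file UTF-8 instead of US-ASCII.

# Hypothetical helper: returns the dash typed directly into the source.
def sample_dash
  "–"
end

# Hypothetical helper: summarizes how Ruby sees a string.
def literal_info(str)
  {
    encoding: str.encoding.name,          # encoding tag attached to the string
    bytes: str.bytesize,                  # number of raw bytes
    valid_utf8: str.valid_encoding?,      # are the bytes valid for that tag?
    # Re-tag a copy as US-ASCII to see whether the same bytes would be valid there.
    valid_ascii: str.dup.force_encoding("US-ASCII").valid_encoding?
  }
end

p literal_info(sample_dash)
```

Running this file on its own only inspects the literal. It does not exercise yardoc itself, so treat it as a way to check the file's source encoding rather than as a test of the --charset option.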

posativ commented 12 years ago

Using # encoding: utf-8 worked, thanks for the hint. Although I thought these were the default settings...

lsegal commented 12 years ago

Nope, in Ruby 1.9, US-ASCII is the default encoding for file data, for compatibility reasons. Though I agree, it's a silly default :)

String data in Ruby will use your ENV settings to get the default encoding though, namely the LANG environment variable, so that one is configurable.
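
As a rough sketch of the distinction above, the snippet below reports the encoding Ruby derived from the environment and reads a file with an explicit encoding instead of relying on that default. The method names encoding_report and read_as_utf8 are hypothetical. It assumes Encoding.default_external is the setting meant by "default encoding" here, which this thread does not spell out.

```ruby
# Hypothetical helper: shows what Ruby picked up from the environment.
def encoding_report
  internal = Encoding.default_internal
  {
    lang: ENV["LANG"],                                 # locale variable mentioned above
    default_external: Encoding.default_external.name,  # default for data read from IO
    default_internal: internal && internal.name        # nil unless explicitly configured
  }
end

# Hypothetical helper: reads a file as UTF-8 regardless of the locale default.
def read_as_utf8(path)
  File.read(path, encoding: "UTF-8")
end

p encoding_report
```

Note that this default applies to data read at runtime. The encoding of literals inside an rb file is decided separately, by the # encoding header, which would be consistent with a UTF-8 LANG not being enough on its own.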

posativ commented 12 years ago

My LANG ends with "UTF-8", that works with Ruby 1.8 but not with 1.9. I'm quite happy with the # encoding solution!