rails / thor

Thor is a toolkit for building powerful command-line interfaces.
http://whatisthor.com/
MIT License

Encoding::CompatibilityError caused by non-latin characters in ERB template #481

Open kinkou opened 9 years ago

kinkou commented 9 years ago

Hi, I'm using Thor as a generator in Rails. Some of my templates contain non-Latin (Cyrillic) characters, and this causes the template method to fail. Here's the backtrace (I omitted some lines and shortened the paths):

(erb):14:in `concat': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
    from (erb):14:in `template'
    from .../ruby-2.1.4/lib/ruby/2.1.0/erb.rb:850:in `eval'
    from .../ruby-2.1.4/lib/ruby/2.1.0/erb.rb:850:in `result'
    ...
    from .../gems/thor-0.19.1/lib/thor/actions/file_manipulation.rb:115:in `template'
    ...
    from .../gems/railties-4.1.8/lib/rails/generators/named_base.rb:25:in `template'
    from .../lib/generators/my_generator/my_generator.rb:23:in `generate_my_stuff'

The reason must be that the template is read as ASCII-8BIT:

# lib/thor/actions/file_manipulation.rb:116
content = ERB.new(::File.binread(source), nil, "-", "@output_buffer").result(context)

I think so because removing the non-Latin characters from the template fixes the issue, as does changing binread to read (I wonder why binread had to be used here).

What's at fault here, Thor or ERB?
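For reference, the mismatch can be reproduced without Thor or ERB. The following is a minimal sketch, assuming a current Ruby. The sample string and the helper names read_encodings and concat_error are made up for illustration and are not part of Thor:

```ruby
require "tempfile"

# Sample text: a Cyrillic word, written with \u escapes so the literal is
# UTF-8 regardless of the source file's encoding.
SAMPLE = "\u041F\u0440\u0438\u0432\u0435\u0442\n"

# Hypothetical helper: encodings of the strings returned by
# File.binread and File.read for the same file.
def read_encodings(path)
  [File.binread(path).encoding, File.read(path, encoding: "UTF-8").encoding]
end

# Hypothetical helper: appends +piece+ to a copy of +buffer+ and returns
# the exception class if the encodings are incompatible, nil otherwise.
def concat_error(buffer, piece)
  buffer.dup.concat(piece)
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

Tempfile.create(["sample", ".erb"]) do |file|
  file.binmode
  file.write(SAMPLE)
  file.flush

  binread_enc, read_enc = read_encodings(file.path)
  puts "binread: #{binread_enc}, read: #{read_enc}"

  # Same bytes, different encoding tags: appending one to the other fails.
  utf8_text = File.read(file.path, encoding: "UTF-8")
  binary_text = File.binread(file.path)
  puts concat_error(utf8_text, binary_text).inspect
end
```

The concatenation only fails when both strings contain non-ASCII bytes, which would explain why stripping the Cyrillic characters from the template makes the error go away.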

jpgeek commented 8 years ago

Same issue here, using a customized template with the Rails default scaffold view generator:

(erb):4:in `concat': incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
        from (erb):4:in `template'
        ruby/2.2.0/erb.rb:863:in `eval'
        ruby/2.2.0/erb.rb:863:in `result'
        thor-0.19.1/lib/thor/actions/file_manipulation.rb:116:in `block in template'
        thor-0.19.1/lib/thor/actions/create_file.rb:53:in `call'
        thor-0.19.1/lib/thor/actions/create_file.rb:53:in `render'
        thor-0.19.1/lib/thor/actions/create_file.rb:46:in `identical?'
        thor-0.19.1/lib/thor/actions/create_file.rb:72:in `on_conflict_behavior'
        thor-0.19.1/lib/thor/actions/empty_directory.rb:113:in `invoke_with_conflict_check'
        thor-0.19.1/lib/thor/actions/create_file.rb:60:in `invoke!'
        thor-0.19.1/lib/thor/actions.rb:94:in `action'
        thor-0.19.1/lib/thor/actions/create_file.rb:25:in `create_file'
        thor-0.19.1/lib/thor/actions/file_manipulation.rb:115:in `template'
        railties-4.2.5.2/lib/rails/generators/named_base.rb:26:in `block in template'

jpgeek commented 8 years ago

Digging deeper, this looks like a regression. It appears to have broken starting with Ruby 1.9.1, when IO.binread was introduced.

# lib/thor/core_ext/io_binary_read.rb

class IO #:nodoc:
  class << self
    def binread(file, *args)
      fail ArgumentError, "wrong number of arguments (#{1 + args.size} for 1..3)" unless args.size < 3
      File.open(file, "rb") do |f|
        f.read(*args)
      end
    end unless method_defined? :binread
  end
end

This fallback binread() returns a UTF-8 string (when no extra arguments are given). However, the native IO.binread in Ruby 1.9.2 and later opens the file as ASCII-8BIT:

static VALUE
rb_io_s_binread(int argc, VALUE *argv, VALUE io)
{
    VALUE offset;
    struct foreach_arg arg;

    rb_scan_args(argc, argv, "12", NULL, NULL, &offset);
    FilePathValue(argv[0]);
    arg.io = rb_io_open(argv[0], rb_str_new_cstr("rb:ASCII-8BIT"), Qnil, Qnil);
    if (NIL_P(arg.io)) return Qnil;
    arg.argv = argv+1;
    arg.argc = (argc > 1) ? 1 : 0;
    if (!NIL_P(offset)) {
        rb_io_seek(arg.io, offset, SEEK_SET);
    }
    return rb_ensure(io_s_read, (VALUE)&arg, rb_io_close, arg.io);
}
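
Whether the two code paths really disagree can be checked on whatever Ruby is in use. A small sketch follows; the helper names open_rb_read and compare_binread_encodings are hypothetical, and open_rb_read mirrors the body of the fallback shown earlier without monkey-patching IO:

```ruby
require "tempfile"

# Hypothetical helper mirroring the body of the fallback binread:
# open the file in "rb" mode and read all of it.
def open_rb_read(path)
  File.open(path, "rb") { |f| f.read }
end

# Returns the encodings reported by the fallback-style read and by the
# native IO.binread for the same file.
def compare_binread_encodings(path)
  {
    fallback_style: open_rb_read(path).encoding,
    native: IO.binread(path).encoding
  }
end

Tempfile.create("binread-check") do |file|
  file.binmode
  file.write("\u041F\u0440\u0438\u0432\u0435\u0442\n") # Cyrillic sample text
  file.flush
  p compare_binread_encodings(file.path)
end
```

If both entries report the same encoding, the fallback is not what changes the result on that Ruby. Because the fallback is only defined when binread is missing, the native method is the one that gets called wherever it exists.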

@sferik I am happy to write a test and patch for it, but I want to make sure I am not missing something and that there is a chance it will get pulled.
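
One possible direction for such a patch, sketched under assumptions: keep the binary read, but re-tag the bytes as UTF-8 before handing them to ERB. This is not Thor's actual fix, and read_template_source and render_template are hypothetical names. It also assumes templates are UTF-8 and a Ruby recent enough to support ERB's trim_mode keyword and result_with_hash:

```ruby
require "erb"
require "tempfile"

# Hypothetical helper, not Thor's API: read the template as bytes, then
# re-tag the bytes as UTF-8 when they form valid UTF-8. Otherwise return
# the binary string unchanged.
def read_template_source(path)
  bytes = File.binread(path)
  utf8 = bytes.dup.force_encoding(Encoding::UTF_8)
  utf8.valid_encoding? ? utf8 : bytes
end

# Hypothetical helper: render an ERB template file with the given locals.
def render_template(path, locals)
  ERB.new(read_template_source(path), trim_mode: "-").result_with_hash(locals)
end

Tempfile.create(["greeting", ".erb"]) do |file|
  file.binmode
  # Cyrillic greeting followed by an interpolated name.
  file.write("\u041F\u0440\u0438\u0432\u0435\u0442, <%= name %>!\n")
  file.flush
  puts render_template(file.path, name: "\u041C\u0438\u0440")
end
```

Re-tagging with force_encoding does not change any bytes, so line endings and binary content read by binread stay intact. Only the encoding label that ERB sees differs.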