square / fdoc

Documentation format and verification
Other
380 stars 59 forks source link

Fix encoding problem #49

Closed lazywei closed 10 years ago

lazywei commented 10 years ago

Hi

We encounter some encoding exception when we try to generate .fdoc file and convert it into .html.

Therefore, I add the following changes:

The above two changes fix the conversions error Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8. For more detail, please ref this SO answer

In addition, I replace JSON by Oj to fix the conversion error which occurs when I try to convert .fdoc into .html. Moreover, this gem is faster than the standard JSON library. Please also ref this SO answer.

This pull request is related to #27 and should fix that problem.

Thanks

lazywei commented 10 years ago

You can reproduce the problem via this yaml file dishes-POST.fdoc

photo = YAML.load_file("dishes-POST.fdoc")
JSON.dump photo

The above code will raise Encoding::UndefinedConversionError: "\xEF" from ASCII-8BIT to UTF-8

However, use Oj will make things right

photo = YAML.load_file("dishes-POST.fdoc")
Oj.dump photo
zachmargolis commented 10 years ago

Thanks for the repro steps!

How strongly do you feel about Oj vs regular JSON? I'd prefer to maintain JRuby compatibility, and by default JRuby is not built with extension support, which means installing Oj fails:

margymargs:~ margolis$ rvm use jruby Using /Users/margolis/.rvm/gems/jruby-1.7.4 margymargs:~ margolis$ gem install oj Fetching: oj-2.1.4.gem (100%) Building native extensions. This could take a while... ERROR: Error installing oj: ERROR: Failed to build gem native extension.

    /Users/margolis/.rvm/rubies/jruby-1.7.4/bin/jruby extconf.rb
lazywei commented 10 years ago

Thanks for your reply.

Sorry for that, I haven't thought about the JRuby compatibility... I use Oj just because it solve the encoding problem automatically and it is fast. I don't think we must use it. Therefore, if we need to maintain JRuby compatibility, I think I can use such workaround on that:

def deeply_to_utf8(obj)
  if obj.is_a?(Hash)
    clean_hsh = {}
    obj.each do |k, v|
      clean_hsh[k] = deeply_to_utf8(v)
    end
    clean_hsh
  elsif obj.is_a?(Array)
    obj.map {|x| deeply_to_utf8(x) }
  elsif obj.is_a?(String)
    obj.encode('UTF-8', {:invalid => :replace, :undef => :replace, :replace => '?'})
  else
    obj
  end
end

r = YAML.load_file("dishes-POST.fdoc")
puts deeply_to_utf8(r).to_json

If you think this is doable, I will commit the code into this branch and remove Oj.

lazywei commented 10 years ago

Just updated the indention and the encoding comments!

lazywei commented 10 years ago

The original reason that the UndefinedConversionError shows up is because I use

fixture_file_upload("/logo.png", "image/png")

to test file uploading.

This will create a Rack::Test::UploadedFile and pass it as parameters.

However, this Rack::Test::UploadedFile is somehow can not be convert into yaml successfully.

Therefore, I need some other workaround to fix this problem. Maybe it is this workaround that causes the UndefinedConversionError.

If we don't want to use this workaround ( Oj, encoding: utf-8 ...), then we need to fix this problem thoroughly.

The problem then becomes "why Rack::Test::UploadedFile can not be convert into yaml."

After some searching, I found this issue of psych, which is the default yaml engine used by ruby.

It says that psych can not deal with Delagator well, and that's why Rack::Test::UploadedFile can not be convert into yaml. Fortunately, there is already a pull request to fix this problem. By using that pull request, I don't need to use Oj anymore, which means Fdoc doesn't need Oj as dependency.

So.. would you like to use the current pull request opened by me or use psych's pull request to fix the problem (which means to add that psych as dependency)?

Sorry for my poor english, if there is any confusion please let me know. Thanks!

zachmargolis commented 10 years ago

Let's rely on the Psych pull request to fix this problem, it seems like an extreme case to change this much code for.

Thanks!