flexera-public / right_aws

RightScale Amazon Web Services Ruby Gems
MIT License

#<REXML::ParseException: #<Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string) #94

Open tomjoro opened 12 years ago

tomjoro commented 12 years ago

Hi, I had this issue with 2.1.0 and S3 when using European characters (São Paulo, for example). I was able to fix it by forcing the encoding in RightAWSParser.

A message with the S3 name was passed via SQS.

Wondering if I did something wrong? Is there some way to set the encoding, or is this an issue?

I fixed the problem by monkey patching RightAWSParser to force the encoding:

module RightAws
  class RightAWSParser
    def parse(xml_text, params={})
      ...
      xml_text.force_encoding 'utf-8' # This is the magic line...
      REXML::Document.parse_stream(xml_text, self)

This was the error:

/Users/tom/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/rexml/source.rb:212:in `match'
/Users/tom/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/rexml/source.rb:212:in `match'
/Users/tom/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/rexml/parsers/baseparser.rb:425:in `pull'
/Users/tom/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/rexml/parsers/streamparser.rb:16:in `parse'
/Users/tom/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/rexml/document.rb:204:in `parse_stream'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/awsbase/right_awsbase.rb:1098:in `parse'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/awsbase/right_awsbase.rb:536:in `block in request_info_impl'
/Users/tom/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/benchmark.rb:295:in `measure'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/awsbase/benchmark_fix.rb:30:in `add!'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/awsbase/right_awsbase.rb:536:in `request_info_impl'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/sqs/right_sqs_gen2_interface.rb:142:in `request_info'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/sqs/right_sqs_gen2_interface.rb:289:in `receive_message'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/sqs/right_sqs_gen2.rb:170:in `receive_messages'
/Users/tom/.rvm/gems/ruby-1.9.2-p290@dashwire/gems/right_aws-2.1.0/lib/sqs/right_sqs_gen2.rb:185:in `receive'
/Users/tom/git/sqs_util/lib/sqs_util/base.rb:75:in `get_one_message'
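For anyone trying to reproduce this outside right_aws, here is a minimal sketch of what the error in the issue title seems to describe: a UTF-8 regexp matched against a string that holds non-ASCII bytes but is tagged ASCII-8BIT. The sample string, the pattern, and the `try_match` helper are made up for illustration; the assumption is that REXML's internal regexps behave like the UTF-8 pattern here.

```ruby
# Bytes as they might arrive from a socket: valid UTF-8 content,
# but tagged as binary (ASCII-8BIT). The sample text is illustrative.
raw = "S\u00e3o Paulo".dup.force_encoding(Encoding::ASCII_8BIT)

# A regexp containing a \u escape is fixed to UTF-8.
pattern = /\u00e3/

# Hypothetical helper: reports whether the match worked or hit the
# encoding mismatch instead of letting the exception propagate.
def try_match(pattern, text)
  pattern.match(text) ? :matched : :no_match
rescue Encoding::CompatibilityError
  :incompatible
end

before = try_match(pattern, raw)

# force_encoding only relabels the existing bytes; it does not transcode.
fixed = raw.dup.force_encoding(Encoding::UTF_8)
after = try_match(pattern, fixed)

puts "binary-tagged string: #{before}"
puts "after force_encoding: #{after}"
```

Relabeling is only safe when the bytes really are UTF-8; `fixed.valid_encoding?` is a cheap way to check before parsing.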

tomjoro commented 12 years ago

I used the send_message method to generate the SQS message. I noticed that the header of the message looks like this: <?xml version="1.0"?>

Maybe it should be this in the header: <?xml version="1.0" encoding="utf-8" ?>

Maybe I just can't find the setting...??

kagminjeong commented 12 years ago

The default encoding in XML is UTF-8. http://www.opentag.com/xfaq_enc.htm#enc_default

My guess is the text should be explicitly decoded as utf-8 (?? - I don't use ruby 1.9).

It is also possible that the responsible party is REXML: if it receives a binary stream as input, it should attempt to figure out the encoding and then decode the stream appropriately.
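A rough sketch of that "figure out the encoding, then decode" idea, done by the caller before handing the text to a parser. `decode_xml_bytes` is a hypothetical helper, not part of REXML or right_aws. It only looks at an `encoding="..."` attribute in the XML declaration and otherwise assumes UTF-8; it does not handle byte order marks or UTF-16.

```ruby
# Hypothetical helper: turn raw response bytes into a UTF-8 String.
# Uses the encoding named in the XML declaration if there is one,
# otherwise falls back to UTF-8 (the XML default when no BOM is present).
def decode_xml_bytes(bytes)
  declared = bytes[/\A<\?xml[^>]*\bencoding=["']([\w.-]+)["']/, 1]
  text = bytes.dup.force_encoding(declared || "UTF-8")
  unless text.valid_encoding?
    raise ArgumentError, "bytes are not valid #{text.encoding}"
  end
  text.encode(Encoding::UTF_8)
end

# A response without an encoding declaration, tagged as binary.
raw = "<?xml version=\"1.0\"?><Name>S\u00e3o Paulo</Name>".b
decoded = decode_xml_bytes(raw)
puts decoded.encoding
```

An unknown encoding name makes `force_encoding` raise ArgumentError, so a caller would probably want to rescue that as well.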