patsplat / plist

All-purpose Property List manipulation library
http://www.rubydoc.info/gems/plist
MIT License

UTF-8 error #21

Closed: Larzo closed this issue 7 years ago

Larzo commented 10 years ago

I got this error: C:/Ruby193/lib/ruby/gems/1.9.1/gems/plist-3.1.0/lib/plist/parser.rb:93:in `scan': incompatible encoding regexp match (UTF-8 regexp with IBM437 string) (Encoding::CompatibilityError)

If I add this line to the StreamParser::parse() method, the error goes away: @xml = @xml.force_encoding("UTF-8")
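A minimal sketch of why relabeling helps, assuming the input bytes are already valid UTF-8 and only the encoding tag is wrong. The sample string and the pattern are made up for illustration and are not taken from the plist parser:

```ruby
# Bytes are valid UTF-8, but the string is tagged IBM437
# (assumption: this mimics the default external encoding of a Windows console).
mislabeled = "caf\u00E9".dup.force_encoding('IBM437')

# The /u flag fixes the regexp's encoding to UTF-8 (illustrative pattern only).
utf8_pattern = /caf./u

# Matching a UTF-8 regexp against a non-ASCII string with another tag raises.
error_class = begin
  mislabeled =~ utf8_pattern
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

# force_encoding changes only the tag; the bytes stay exactly the same.
relabeled = mislabeled.dup.force_encoding('UTF-8')
match_index = relabeled =~ utf8_pattern

puts error_class.inspect
puts match_index.inspect
```

Because force_encoding does not transcode, it is only a fix when the data really is UTF-8 to begin with.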

This happened while I was trying to access the podcasts in my library.

mmmries commented 10 years ago

I had a similar error where the call to File.read returns a string that is tagged as US-ASCII, but the XML is actually UTF-8 encoded.
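One way to sidestep the wrong tag is to state the encoding when reading, rather than relying on the environment's default. This is a sketch using a temporary file with made-up contents; the `encoding:` option of File.read is standard Ruby, the file and its content are assumptions:

```ruby
require 'tempfile'

xml = "<string>Caf\u00E9</string>" # hypothetical plist fragment
as_ascii = nil
as_utf8 = nil

Tempfile.create(['sample', '.plist']) do |file|
  file.binmode
  file.write(xml)
  file.flush

  # Simulates an environment whose default external encoding is US-ASCII.
  as_ascii = File.read(file.path, encoding: 'US-ASCII')
  # Explicitly tags the contents as UTF-8 when reading.
  as_utf8 = File.read(file.path, encoding: 'UTF-8')
end

puts as_ascii.encoding        # the tag claims US-ASCII
puts as_ascii.valid_encoding? # the bytes do not fit that tag
puts as_utf8.valid_encoding?
```

The string read with the explicit UTF-8 encoding can then be passed to Plist::parse_xml as usual.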

ANTARESXXI commented 9 years ago

Hi!

I have a problem with Plist::parse_xml when parsing a binary plist (plist: https://www.dropbox.com/s/so8h9nk0x3pemwq/Info.plist?dl=0)

/Library/Ruby/Gems/2.0.0/gems/plist-3.1.0/lib/plist/parser.rb:91:in `scan': invalid byte sequence in UTF-8 (ArgumentError)
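Binary plists are not XML, so their bytes will usually not be valid UTF-8. Binary plist files start with the ASCII magic `bplist`, so a caller can check for it before handing data to an XML parser. The helper name `binary_plist?` and the sample bytes below are hypothetical and not part of the plist gem:

```ruby
# Hypothetical helper: true when the data starts with the binary plist magic.
def binary_plist?(data)
  data.byteslice(0, 6).b == 'bplist'.b
end

# Made-up sample inputs for illustration.
binary_sample = "bplist00\xD1\x01\x02".b
xml_sample = '<?xml version="1.0" encoding="UTF-8"?><plist version="1.0"></plist>'

puts binary_plist?(binary_sample)
puts binary_plist?(xml_sample)

# Tagging the binary bytes as UTF-8 does not make them valid UTF-8.
puts binary_sample.dup.force_encoding('UTF-8').valid_encoding?
```

Data that passes this check would need to be converted to the XML format, or read by a binary plist reader, before parse_xml can handle it.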

jmej commented 8 years ago

I get the same issue parsing plist XML from an external API: "exception": Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)

Shpigford commented 8 years ago

Same issue here when trying to parse a file stored in an AWS S3 bucket.

Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)

The same file works fine locally.

Shpigford commented 8 years ago

Just figured out a workaround. If you read the file and then call force_encoding on the contents, it seems to work well.

Plist::parse_xml(open(file).read.force_encoding 'utf-8')
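A slightly more defensive variant of this workaround, as a sketch: relabel a copy of the data and fail early if the bytes are not valid UTF-8, so a bad download does not surface later as a parser error. The helper name `utf8_or_raise` is hypothetical:

```ruby
# Hypothetical helper: tag a copy of the data as UTF-8 and verify the bytes fit.
def utf8_or_raise(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'input is not valid UTF-8' unless text.valid_encoding?
  text
end

# Made-up body, tagged ASCII-8BIT as data from a network read might be.
body = "<string>Caf\u00E9</string>".b
text = utf8_or_raise(body)

puts text.encoding
puts body.encoding # the original string is left untouched
```

The result could then be used as `Plist::parse_xml(utf8_or_raise(body))`.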

mattbrictson commented 7 years ago

Is anyone still running into this issue? I recently joined the project as its maintainer and am willing to help find a fix. The first step would be to come up with a reproducible test case.

reitermarkus commented 7 years ago

@mattbrictson, yes, we just ran into this. Here's the relevant comment: https://github.com/caskroom/homebrew-cask/issues/32537#issuecomment-295424183

mattbrictson commented 7 years ago

@reitermarkus Thanks for the info! Do you think this is an issue with a particular plist file, or is it more of a Ruby environment problem? If it is a certain plist that causes the crash, could you share it via a Gist?

reitermarkus commented 7 years ago

@mattbrictson, I'm not sure. It only happens with plist files that have special characters (which are valid UTF-8, though).

Here's one example: https://gist.github.com/reitermarkus/4009defc6ab720af5d49b35d558cdedc#file-de-monospc-lightkey-pkg-app-export-plist-plist-L8726

You cannot see it, but there is the character \u0099 after the word Circus.
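A possible explanation for why only files with special characters fail, as a sketch: Ruby lets a UTF-8 regexp match a string with a different encoding tag as long as the string is pure ASCII, and raises once a non-ASCII byte is present. The pattern and strings below are illustrative and assume the data arrives tagged as ASCII-8BIT:

```ruby
pattern = /Circus/u # illustrative UTF-8 regexp

plain = 'Circus'.b          # ASCII only, tagged ASCII-8BIT
special = "Circus\u0099".b  # same, plus the invisible U+0099

# ASCII-only strings are compatible with a UTF-8 regexp regardless of tag.
plain_ok = !(plain =~ pattern).nil?

# A single non-ASCII character is enough to trigger the error.
special_error = begin
  special =~ pattern
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

# U+0099 is a C1 control character, encoded in UTF-8 as two bytes.
c1_bytes = "\u0099".bytes

puts plain_ok
puts special_error.inspect
puts c1_bytes.inspect
```

A string like `special` may also serve as a small reproducible test case, under the same assumption about the encoding tag.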