I had a similar error where the call to File.read returned a string tagged as US-ASCII, even though the XML is actually UTF-8 encoded.
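A minimal sketch of that situation, assuming the cause is the environment's default external encoding. The temporary file and its contents are made up for illustration; passing `encoding:` to File.read overrides whatever default Ruby would otherwise use.

```ruby
require "tempfile"

# Hypothetical sample file containing one non-ASCII character (U+0099).
file = Tempfile.new(["Info", ".plist"])
file.close
File.binwrite(file.path, "<string>Circus\u0099</string>")

# Simulate an environment whose default external encoding is US-ASCII.
ascii_tagged = File.read(file.path, encoding: "US-ASCII")
puts ascii_tagged.encoding        # US-ASCII
puts ascii_tagged.valid_encoding? # false: the bytes above 0x7F are not ASCII

# Stating the real encoding up front tags the string correctly.
utf8_tagged = File.read(file.path, encoding: "UTF-8")
puts utf8_tagged.encoding         # UTF-8
puts utf8_tagged.valid_encoding?  # true
```

The bytes on disk are identical in both reads; only the encoding label on the resulting string differs.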
Hi!
I have a problem with Plist::parse_xml when parsing a binary plist (plist: https://www.dropbox.com/s/so8h9nk0x3pemwq/Info.plist?dl=0)
/Library/Ruby/Gems/2.0.0/gems/plist-3.1.0/lib/plist/parser.rb:91:in `scan': invalid byte sequence in UTF-8 (ArgumentError)
I get the same issue parsing plist XML from an external API: "exception": Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
Same issue here when trying to parse a file stored in an AWS S3 bucket.
Encoding::CompatibilityError: incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
The same file works fine locally.
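For anyone trying to reproduce this, the error class can be triggered in plain Ruby without the gem. This is only a sketch: the `body` string and `pattern` below are invented stand-ins, on the assumption that the download arrives tagged as ASCII-8BIT and that the parser's regexps are UTF-8, as the error message suggests.

```ruby
# Bytes as they might arrive from a network client: valid UTF-8 content,
# but tagged ASCII-8BIT (binary).
body = "<string>Circus\u0099</string>".b

# The /u flag fixes the regexp's encoding to UTF-8 (illustrative pattern,
# not taken from the gem).
pattern = /<string>(.*?)<\/string>/u

error = nil
begin
  body.scan(pattern)
rescue Encoding::CompatibilityError => e
  error = e
  puts e.message
end

# Relabeling a copy of the bytes as UTF-8 lets the same scan succeed.
utf8_body = body.dup.force_encoding(Encoding::UTF_8)
matches = utf8_body.scan(pattern)
puts matches.inspect
```

A local file read in a UTF-8 locale is already tagged UTF-8, which would explain why the same file works locally.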
Just figured out the workaround. If you read the file and then force_encoding it, it seems to work well.
Plist::parse_xml(open(file).read.force_encoding 'utf-8')
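A slightly more defensive variant of the same workaround might look like the sketch below. `read_xml_as_utf8` is a hypothetical helper, not part of the plist gem. Since force_encoding only relabels bytes, the helper also checks that they really are valid UTF-8, which would catch a binary plist early.

```ruby
require "tempfile"

# Hypothetical helper: read the raw bytes and label them as UTF-8
# without transcoding.
def read_xml_as_utf8(path)
  xml = File.binread(path).force_encoding(Encoding::UTF_8)
  # force_encoding does not validate, so check explicitly.
  raise ArgumentError, "#{path} is not valid UTF-8" unless xml.valid_encoding?
  xml
end

# Demo with a throwaway file containing a non-ASCII character.
sample = Tempfile.new(["Info", ".plist"])
sample.close
File.binwrite(sample.path, "<plist><string>Circus\u0099</string></plist>")

xml = read_xml_as_utf8(sample.path)
puts xml.encoding # UTF-8

# Bytes that are not valid UTF-8 (for example a binary plist) are rejected.
binary = Tempfile.new(["Binary", ".plist"])
binary.close
File.binwrite(binary.path, "bplist00\xD1".b)
rejected = begin
  read_xml_as_utf8(binary.path)
  false
rescue ArgumentError
  true
end
puts rejected
```

With the gem loaded, the returned string could then be handed to Plist::parse_xml in place of the one-liner above.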
Is anyone still running into this issue? I recently joined the project as its maintainer and am willing to help find a fix. The first step would be to come up with a reproducible test case.
@mattbrictson, yes, we just ran into this. Here's the relevant comment: https://github.com/caskroom/homebrew-cask/issues/32537#issuecomment-295424183
@reitermarkus Thanks for the info! Do you think this is an issue with a particular plist file, or is it more of a Ruby environment problem? If a certain plist causes the crash, could you share it via a Gist?
@mattbrictson, I'm not sure. It only happens with plist files that have special characters (which are valid UTF-8, though).
Here's one example: https://gist.github.com/reitermarkus/4009defc6ab720af5d49b35d558cdedc#file-de-monospc-lightkey-pkg-app-export-plist-plist-L8726
You cannot see it, but there is the character \u0099 after the word Circus.
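To make the invisible character concrete, here is a small sketch in plain Ruby. The string is rebuilt by hand rather than read from the gist, so it only illustrates how U+0099 is encoded.

```ruby
text = "Circus\u0099"

# U+0099 is a C1 control character, so most editors render nothing for it,
# yet it is perfectly valid UTF-8.
puts text.valid_encoding?

# In UTF-8 it takes two bytes.
last_bytes = text.bytes.last(2).map { |b| format("%02X", b) }.join(" ")
puts last_bytes # C2 99

# Tagged as binary, the same data is one "character" longer, because the
# two bytes are no longer treated as a single code point.
raw = text.b
puts raw.length - text.length # 1
```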
I got this error: C:/Ruby193/lib/ruby/gems/1.9.1/gems/plist-3.1.0/lib/plist/parser.rb:93:in `scan': incompatible encoding regexp match (UTF-8 regexp with IBM437 string) (Encoding::CompatibilityError)
If I add this to the StreamParser::parse() method, it works: @xml = @xml.force_encoding("UTF-8")
This happened while trying to access my podcasts in the library.