3breadt / dd-plist

A Java library providing support for ASCII, XML and binary property lists.

PropertyListParser.determineType can only detect UTF-8 BOM #66

Closed jpstotz closed 2 years ago

jpstotz commented 3 years ago

The two methods com.dd.plist.PropertyListParser.determineType(InputStream, int) and com.dd.plist.PropertyListParser.determineType(byte[]) seem to be hard-coded to detect only the UTF-8 BOM EF BB BF.

However, the ASCIIPropertyListParser implementation, for example, can also handle UTF-16 and UTF-32 files. You never get that far if you try to read a UTF-16 ASCII file through one of the com.dd.plist.PropertyListParser.parse(..) methods: because of the BOM, determineType(String) does not work correctly, and you end up with a PropertyListFormatException.
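
To illustrate what broader BOM detection could look like, here is a minimal sketch. It is not the dd-plist implementation: the class BomSniffer, the enum Bom and the method names are hypothetical and exist only for this example. The byte signatures are the standard Unicode BOMs. Note that FF FE 00 00 is ambiguous (UTF-32LE BOM, or UTF-16LE BOM followed by a NUL character); the sketch assumes UTF-32LE, which is why that signature is checked before the UTF-16LE one.

```java
// Hypothetical helper, not part of dd-plist: recognizes the common Unicode BOMs.
final class BomSniffer {

    /** A BOM signature together with the encoding name it implies. */
    enum Bom {
        // Order matters: UTF-32LE (FF FE 00 00) starts with the UTF-16LE BOM (FF FE),
        // so the longer signature has to be tested first.
        UTF_32BE("UTF-32BE", 0x00, 0x00, 0xFE, 0xFF),
        UTF_32LE("UTF-32LE", 0xFF, 0xFE, 0x00, 0x00),
        UTF_8("UTF-8", 0xEF, 0xBB, 0xBF),
        UTF_16BE("UTF-16BE", 0xFE, 0xFF),
        UTF_16LE("UTF-16LE", 0xFF, 0xFE);

        final String charsetName;
        final byte[] signature;

        Bom(String charsetName, int... sig) {
            this.charsetName = charsetName;
            this.signature = new byte[sig.length];
            for (int i = 0; i < sig.length; i++) {
                this.signature[i] = (byte) sig[i];
            }
        }

        /** True if data begins with this BOM's byte signature. */
        boolean matches(byte[] data) {
            if (data.length < signature.length) {
                return false;
            }
            for (int i = 0; i < signature.length; i++) {
                if (data[i] != signature[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Returns the BOM found at the start of data, or null if there is none. */
    static Bom detect(byte[] data) {
        for (Bom bom : Bom.values()) {
            if (bom.matches(data)) {
                return bom;
            }
        }
        return null;
    }

    /** Number of leading bytes to skip before looking at the actual content. */
    static int bomLength(byte[] data) {
        Bom bom = detect(data);
        return bom == null ? 0 : bom.signature.length;
    }

    public static void main(String[] args) {
        // "<" encoded as UTF-16LE, preceded by the UTF-16LE BOM.
        byte[] sample = {(byte) 0xFF, (byte) 0xFE, '<', 0};
        System.out.println(detect(sample) + ", skip " + bomLength(sample) + " bytes");
    }
}
```

A type-detection routine could skip bomLength(data) bytes and decode the remainder with the detected encoding before inspecting the first significant character.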

3breadt commented 3 years ago

An ASCII property list may actually only contain ASCII characters; all other characters are supposed to be encoded using escape sequences.

But this is a bug that would affect the XML property list format, so it needs to be addressed.
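
As a rough illustration of the escape-sequence point, the sketch below rewrites every non-ASCII UTF-16 code unit as \U followed by four hex digits, the form commonly used in old-style ASCII property lists. The class and method names are hypothetical, and the sketch deliberately ignores the other escapes a real writer needs (quotes, backslashes, control characters).

```java
// Hypothetical helper, not part of dd-plist: escapes non-ASCII characters only.
final class AsciiPlistEscaper {

    /**
     * Replaces each UTF-16 code unit above 127 with \Uxxxx (four hex digits).
     * Characters outside the BMP are emitted as two escapes, one per surrogate.
     */
    static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 127) {
                String hex = Integer.toHexString(c);
                out.append("\\U");
                // Left-pad with zeros to exactly four hex digits.
                for (int pad = hex.length(); pad < 4; pad++) {
                    out.append('0');
                }
                out.append(hex);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("caf\u00e9"));
    }
}
```

With every such character escaped, the resulting file is pure ASCII and its bytes are the same whether it is written as ASCII or UTF-8 without a BOM.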

josh198181 commented 2 years ago

<!ENTITY % plistObject "(array | data | date | dict | real | integer | string | true | false )" >
<!ELEMENT plist %plistObject;>
<!ATTLIST plist version CDATA "1.0" >

<!ELEMENT array (%plistObject;)*>
<!ELEMENT dict (key, %plistObject;)*>
<!ELEMENT key (#PCDATA)>

<!ELEMENT string (#PCDATA)>
<!ELEMENT data (#PCDATA)>
<!ELEMENT date (#PCDATA)>

<!ELEMENT true EMPTY>
<!ELEMENT false EMPTY>
<!ELEMENT real (#PCDATA)>
<!ELEMENT integer (#PCDATA)>

3breadt commented 2 years ago

Finally got around to working on this issue and fixing it. A bugfix release is coming up.