Closed steven-joruk closed 3 months ago
Item 1 is the biggest issue - a well-formed UTF-8 XML document can start with a BOM, which we must support, and ideally we would also support XML plists that have whitespace before the leading `<` character.
Can the first character of a reasonable ASCII plist file be a `<`?
> Item 1 is the biggest issue - a well-formed UTF-8 XML document can start with a BOM, which we must support, and ideally we would also support XML plists that have whitespace before the leading `<` character.
I agree, I've pushed a fix. If there's any Unicode byte order mark, or if the first non-whitespace string is `<?xml`, then the input is considered XML.
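The rule described above could be sketched roughly as follows. This is not the code from the pushed fix: `Format`, `BOMS` and `detect_format` are hypothetical names made up for illustration, the sketch only checks the UTF-8 and UTF-16 byte order marks, and it ignores binary plists entirely.

```rust
/// Hypothetical classification used only in this sketch.
#[derive(Debug, PartialEq)]
enum Format {
    Xml,
    Ascii,
}

/// Byte order marks this sketch recognises: UTF-8, UTF-16 BE, UTF-16 LE.
const BOMS: [&[u8]; 3] = [&[0xEF, 0xBB, 0xBF], &[0xFE, 0xFF], &[0xFF, 0xFE]];

/// Guesses the format from the first bytes of the input.
fn detect_format(input: &[u8]) -> Format {
    // Any recognised BOM is taken to mean XML.
    if BOMS.iter().any(|&bom| input.starts_with(bom)) {
        return Format::Xml;
    }
    // Otherwise skip leading ASCII whitespace and look for the XML declaration.
    let start = input
        .iter()
        .position(|b| !b.is_ascii_whitespace())
        .unwrap_or(input.len());
    if input[start..].starts_with(b"<?xml") {
        Format::Xml
    } else {
        Format::Ascii
    }
}
```

With this sketch, `detect_format(b"  \n<?xml version=\"1.0\"?>")` yields `Format::Xml`, while input such as `{ key = value; }` falls through to `Format::Ascii`.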
This continues from #44
Some comments brought over from there:
- ~[…] `<?xml`, with no preceding whitespace, will be treated as ASCII, which might not be desirable.~
- ~[…] `swap_remove`.~
- […] `\` (`\\`) because the test that parses netnewswire.pbxproj fails without it.
- The fuzzer quickly found an infinite loop in handling block comments. I let it run for another 10 hours; it tried 510 million inputs without finding anything else.
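For context on the block-comment item, one way to make comment skipping terminate on every input is to search for the closing `*/` and report a missing terminator as a failure instead of continuing to scan. The function below is a hypothetical sketch under that assumption, not the reader's actual implementation.

```rust
/// Returns the index just past the closing `*/` of a block comment that
/// starts at `pos`, or `None` if no block comment starts there or the
/// comment is never closed.
fn skip_block_comment(input: &[u8], pos: usize) -> Option<usize> {
    // Only handle input that really opens a block comment at `pos`.
    if !input.get(pos..)?.starts_with(b"/*") {
        return None;
    }
    let body = pos + 2;
    // Search for the terminator after the opening `/*`. Running out of
    // input yields `None`, so the caller cannot spin on an unterminated
    // comment.
    input[body..]
        .windows(2)
        .position(|w| w == b"*/")
        .map(|offset| body + offset + 2)
}
```

Because the search starts after the opening `/*`, an input such as `/*/` is reported as unterminated and is not mistaken for a complete comment.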
The related issue (#42) contains a suggestion that it should be renamed to `OpenStepReader` or similar. I don't know the full history of the format (Wikipedia discusses it here). If I'm understanding it correctly, then NeXTSTEP read integers as strings, OpenStep supported integers and real numbers, and GNUstep supported NSValue and NSDate. This PR is missing support for floats.
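To illustrate the distinction between strings, integers and real numbers, here is a minimal sketch of how an unquoted token might be classified. `Scalar` and `classify_token` are hypothetical names, and the rules (try `i64`, then a restricted `f64` parse, else keep the string) are an assumption for illustration, not behaviour defined by any of the formats mentioned above or by this PR.

```rust
/// Hypothetical scalar value used only in this sketch.
#[derive(Debug, PartialEq)]
enum Scalar {
    Integer(i64),
    Real(f64),
    String(String),
}

/// Classifies an unquoted token as an integer, a real number, or a string.
fn classify_token(token: &str) -> Scalar {
    // Integers take priority over reals.
    if let Ok(i) = token.parse::<i64>() {
        return Scalar::Integer(i);
    }
    // Require at least one digit and only numeric-looking characters, so
    // that words `f64::from_str` accepts (such as "inf" or "nan") stay strings.
    let looks_numeric = token.bytes().any(|b| b.is_ascii_digit())
        && token
            .bytes()
            .all(|b| b.is_ascii_digit() || matches!(b, b'.' | b'-' | b'+' | b'e' | b'E'));
    if looks_numeric {
        if let Ok(r) = token.parse::<f64>() {
            return Scalar::Real(r);
        }
    }
    // Everything else is kept verbatim as a string.
    Scalar::String(token.to_owned())
}
```

A reader that follows the older behaviour of treating every token as a string would skip both numeric branches.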