SublimeText / PackageDev

Tools to ease the creation of snippets, syntax definitions, etc. for Sublime Text.
MIT License

Converting to YAML from PLIST error on comment or & sign #86

Closed aLagoG closed 7 years ago

aLagoG commented 7 years ago

Well, the title pretty much says it all, but I'm trying to convert a .tmLanguage file to YAML and it fails at the first commented line: Error parsing Property List "'C:\Users\<username>\AppData\Roaming\Sublime Text 3\Packages\User\C++.tmLanguage": not well-formed (invalid token), line 5, column 6

That line of the file contains the following: <!--------- File Type Support -----------> After removing that comment, it breaks on a regex that contains the & sign.

Does anyone know what is causing that problem?

FichteFoll commented 7 years ago

Please share the file so I can try to reproduce it.

aLagoG commented 7 years ago

So, apparently the regex part is my fault: XML doesn't like the & symbol by itself. But the comment part still stands. Here is the file (I had to zip it for GitHub to allow it): C++.zip

By the way, thank you for answering so fast.

FichteFoll commented 7 years ago

First off, Property Lists are XML.

So, the comment issue is a direct result of technically incorrect usage of hyphens within comments in XML. See also https://www.w3.org/TR/REC-xml/#sec-comments. Basically, the sequence -- must not occur inside a comment, so an XML comment has to end at the first -- encountered after <!-- (which must then be followed by > to form -->). Your comment contains extra hyphens, so the plist/XML parser fails and reports an error. The only reason this works in ST is that it uses a less strict parser.
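To see the comment rule in isolation, here is a minimal sketch using Python's stdlib xml.etree.ElementTree, which is built on the same expat parser as plistlib. The helper name is_well_formed and the two sample strings are made up for illustration; only the comment text is taken from the report above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the stdlib (expat-based) parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# The comment from the report: extra hyphens follow the opening <!--.
bad_comment = "<dict><!--------- File Type Support -----------></dict>"

# The same comment with no "--" between the delimiters.
good_comment = "<dict><!-- File Type Support --></dict>"

print(is_well_formed(bad_comment))
print(is_well_formed(good_comment))
```

The first string is rejected and the second is accepted, which is why rewriting or removing the decorative hyphens is enough to get past that line.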

The second issue is due to another aspect of XML, namely its entities. If you need to write a < or a & character in any XML tag's content, you must either write them as &lt; and &amp; respectively, like you do in HTML and similar, or wrap the tag's content in <![CDATA[...]]>, where you specifically do not have to escape them. With CDATA you'll need to take care of a potential ]]> sequence inside the string, however.

After fixing these two issues in the source (by removing comments and properly escaping & characters), conversion is successful.
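Both options can be tried against Python's stdlib plistlib. This is only a sketch: the template, the sample regex, and the helpers try_load and to_cdata are hypothetical names for illustration and not part of PackageDev. The to_cdata helper deals with an embedded ]]> by splitting it across two CDATA sections, which is a common trick.

```python
import plistlib
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape

# Hypothetical minimal plist with a single string value.
PLIST_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<plist version="1.0"><dict><key>match</key>'
    "<string>{}</string></dict></plist>"
)


def to_cdata(text):
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def try_load(xml_text):
    """Return the parsed plist, or None if the parser rejects it."""
    try:
        return plistlib.loads(xml_text.encode("utf-8"))
    except (ExpatError, plistlib.InvalidFileException):
        return None


regex = "&&|<<|]]>"  # made-up pattern containing every troublesome sequence

raw = PLIST_TEMPLATE.format(regex)              # unescaped: rejected
escaped = PLIST_TEMPLATE.format(escape(regex))  # &amp; / &lt; / &gt;
cdata = PLIST_TEMPLATE.format(to_cdata(regex))  # CDATA section(s)

for label, doc in (("raw", raw), ("escaped", escaped), ("cdata", cdata)):
    print(label, try_load(doc))
```

The escaped and CDATA variants both load back to the original pattern, while the raw one fails to parse.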
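If you would rather not do that cleanup by hand, something like the following could work as a rough pre-processing pass. It is a heuristic sketch, not what PackageDev does: sanitize_plist_text and both regular expressions are hypothetical, it assumes the file has no CDATA sections, and it will leave alone any text that merely looks like an entity (for example &foo;).

```python
import plistlib
import re

# Matches a whole comment, however many hyphens it contains.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# Matches an & that does not start a named or numeric entity reference.
BARE_AMP_RE = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def sanitize_plist_text(text):
    """Drop XML comments and escape bare ampersands (heuristic, no CDATA support)."""
    text = COMMENT_RE.sub("", text)
    return BARE_AMP_RE.sub("&amp;", text)


# Made-up sample that reproduces both problems from this issue.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<plist version="1.0">\n'
    "<dict>\n"
    "<!--------- File Type Support ----------->\n"
    "<key>match</key>\n"
    "<string>a && b &amp; c</string>\n"
    "</dict>\n"
    "</plist>\n"
)

cleaned = sanitize_plist_text(sample)
print(plistlib.loads(cleaned.encode("utf-8")))
```

Check the cleaned output before converting, since a regex-based pass like this cannot know whether an ampersand in a pattern was meant literally.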


All in all, both errors arise because ST uses a less strict Property List parser than the one provided in Python's stdlib. I'm not going to work around either of them, because the stdlib parser is Doing The Correct Thing™ and going down the workaround road could lead to more and more issues that require workarounds. XML is an old and established standard and I don't see a reason to ignore it.