cgross / grunt-dom-munger

Grunt task to read and manipulate HTML with CSS selectors.
MIT License

Adhere to Cheerio's XML mode when xmlMode option is specified #22

Open inta opened 10 years ago

inta commented 10 years ago

Just the small addition of xmlMode/$.xml as mentioned in issue 4.

cgross commented 10 years ago

Thanks for the pull request. There's a lot more that would be needed here before a release can be made. There's obviously documentation (otherwise nobody but you, me, and anyone who reads this PR would know about it). Also, at least one unit test.

Could you make those changes?

inta commented 10 years ago

Yeah, I can write some docs.

What do you expect the test to cover? Just one test with an XHTML/XML file (to ensure the output is correct XML), or running all existing tests again with xmlMode enabled?

cgross commented 10 years ago

One test with an XHTML/XML file would be fine. Just please make sure it has the tags that cheerio was previously changing (like meta and br). Thanks!

inta commented 10 years ago

Humph, this seems not to be that easy. The xml function rewrites empty elements as self-closing ones. That is valid XML, but not XHTML. E.g. <script src="…"></script> ends up as <script src="…"/>, which is useless: an HTML parser ignores the trailing slash and treats the script element as still open.
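
To make the problem concrete, here is a minimal sketch of one conceivable workaround: post-processing the serialized XML string so that only HTML void elements stay self-closing. This is not something the PR, Cheerio, or htmlparser2 provides. The helper name expandNonVoidElements, the void-element list, and the sample src value are assumptions made for illustration, and the regex approach assumes attribute values contain no ">" and that no comments or CDATA sections contain tag-like text.

```javascript
// Hypothetical helper, not part of grunt-dom-munger, Cheerio, or htmlparser2.
// HTML void elements: the only elements that should remain as <tag/>.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'param', 'source', 'track', 'wbr'
]);

// Expand <tag .../> into <tag ...></tag> for every non-void element.
// Regex-based, so it is only a sketch: it assumes no ">" inside attribute
// values and no tag-like text inside comments or CDATA sections.
function expandNonVoidElements(xml) {
  const selfClosing = /<([A-Za-z][\w:-]*)((?:\s[^<>]*?)?)\s*\/>/g;
  return xml.replace(selfClosing, (match, name, attrs) => {
    if (VOID_ELEMENTS.has(name.toLowerCase())) {
      return match; // e.g. <br/> and <meta .../> stay self-closing
    }
    return '<' + name + attrs + '></' + name + '>';
  });
}

// Illustrative input; "app.js" is a made-up file name.
const input = '<head><meta charset="utf-8"/><script src="app.js"/></head>';
console.log(expandNonVoidElements(input));
// prints: <head><meta charset="utf-8"/><script src="app.js"></script></head>
```

A string-level pass like this is fragile compared with a serializer that knows about XHTML, so it only shows the shape of the output that would be needed.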

cgross commented 10 years ago

Isn't that still valid XHTML?

inta commented 10 years ago

That is not valid XHTML 1 and definitely not valid for polyglot documents either. I do not really know about XHTML5, but the polyglot problem applies there as well.

I do not think that can be fixed here. I do not know whether Cheerio or htmlparser2 offers anything to handle XHTML.