Swaagie / minimize

Minimize HTML
MIT License

self-closing tags #23

Open HarryBurns opened 9 years ago

HarryBurns commented 9 years ago

Hi!

I noticed that minimize (v0.9.0) processes self-closing tags incorrectly. Before:

<div />
<h1>Hello world</h1>

After:

<div><h1>Hello world</h1></div>

A self-closing div is not valid HTML, but it's just an example. I use Angular, and the same issue occurs for any custom tag.

Swaagie commented 9 years ago

The underlying issue here is the way minimize outputs the processed HTML again. Hopefully htmlparser2 indicates whether an element is self-closing; otherwise this is a hard fix.

Not closing a tag based on the premise of an element not being regular HTML is not possible as that would create ambiguous output.

Swaagie commented 9 years ago

Just checked it: there is no flag on the parsed elements that indicates whether an element is inline or not. You could try opening an issue at https://github.com/fb55/htmlparser2, but frankly I think this is beyond the scope of both projects, as the aim is to parse/minify HTML, not custom elements implemented by a higher-level module/library.

A possible solution would be to create a custom Angular parser derived from https://github.com/fb55/domhandler, as that is what htmlparser2 (and thus minimize) uses by default. I wouldn't mind building in optional support for custom handlers and special flags. However, I do not have the time to write a custom Angular parser myself (I haven't checked whether one already exists), so community input would be welcome here.
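
To make the proposal above concrete, here is a minimal, dependency-free sketch of what "a special flag" could look like. It is a toy model, not the code of minimize, htmlparser2, or domhandler: the `tokenize` and `minify` functions and the `honorSelfClosing` option are hypothetical names invented for illustration. The tokenizer records whether a tag was written as `<tag />`, and the output step only uses that information when the option is set.

```javascript
// Toy model only: NOT the real minimize/htmlparser2 implementation.
// Void elements never take a closing tag in HTML.
var VOID = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'source', 'track', 'wbr'];

// Split markup into open/close/text tokens, remembering whether an
// opening tag was written in self-closing form (<tag />).
function tokenize(html) {
  var re = /<(\/?)([a-zA-Z][\w-]*)([^>]*?)(\/?)>|([^<]+)/g;
  var tokens = [];
  var m;
  while ((m = re.exec(html)) !== null) {
    if (m[5] !== undefined) {
      tokens.push({ type: 'text', data: m[5] });
    } else if (m[1] === '/') {
      tokens.push({ type: 'close', name: m[2] });
    } else {
      tokens.push({
        type: 'open',
        name: m[2],
        attrs: m[3].trim(),
        selfClosing: m[4] === '/' // the hypothetical "special flag"
      });
    }
  }
  return tokens;
}

// Re-serialize the tokens. `honorSelfClosing` is a hypothetical option.
function minify(html, options) {
  var honor = Boolean(options && options.honorSelfClosing);
  var stack = [];
  var out = '';
  tokenize(html).forEach(function (t) {
    if (t.type === 'text') {
      out += t.data.trim(); // drop whitespace around text
      return;
    }
    if (t.type === 'close') {
      var idx = stack.lastIndexOf(t.name);
      if (idx === -1) return; // stray closing tag: ignore it
      while (stack.length > idx) out += '</' + stack.pop() + '>';
      return;
    }
    out += '<' + t.name + (t.attrs ? ' ' + t.attrs : '') + '>';
    if (VOID.indexOf(t.name) !== -1) return; // no closing tag needed
    if (honor && t.selfClosing) {
      out += '</' + t.name + '>'; // close immediately, keep siblings as siblings
      return;
    }
    stack.push(t.name); // otherwise the element stays open
  });
  while (stack.length) out += '</' + stack.pop() + '>';
  return out;
}

console.log(minify('<div />\n<h1>Hello world</h1>'));
console.log(minify('<div />\n<h1>Hello world</h1>', { honorSelfClosing: true }));
```

Without the option, the sketch reproduces the nesting reported in the original post; with it, the `<h1>` stays a sibling of the `<div>`. A real fix would need the equivalent information to be exposed by the parser or by a custom handler.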

jlas commented 9 years ago

This is not only an issue for custom tags; some official HTML tags are also "self-closing" (see https://developer.mozilla.org/en-US/docs/Glossary/empty_element).

Swaagie commented 9 years ago

@jlas as far as I'm aware, these are correctly parsed and output (code reference); this issue was specifically about custom elements that are not part of the inline list. However, if you have encountered a specific problem, I'd like to know.

Also, since this issue got revisited: this could potentially be solved with a custom plugin, since minimize supports plugins now.

jlas commented 9 years ago

@Swaagie here is an example I was having trouble with:

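var Minimize = require('minimize');
var snippet = '<button><input type="hidden"/></button>';
var minimize = new Minimize(); // declared with var to avoid an implicit global
minimize.parse(snippet, function (err, data) {
  if (err) throw err;
  console.log(data); // the minified HTML
});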

This outputs:

<button></button><input type=hidden>

Now, as far as I know, <button> should allow <input> as a child element, according to MDN. But it seems like htmlparser2 is not allowing that (I think here)...

Anyway, I created a PR (#43) that allows passing options to htmlparser2. In my case, passing the xmlMode: true option makes my example above work. Either way, feel free to reject it, as I have found another fix: enclosing the <input> in a <div> instead of a <button>.
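
The behaviour described in this comment can be modelled with a small, dependency-free sketch. It is a toy, not htmlparser2's code: the `reserialize` function and the `OPEN_IMPLIES_CLOSE` table are hypothetical, and the table is an assumed subset chosen only to mirror the result reported above. The idea is that, in HTML mode, opening certain tags implicitly closes an open form-related parent, while an XML-style mode skips that rule and honors `<tag/>`. Attribute-quote removal (`type=hidden`) is not modelled.

```javascript
// Toy model only: NOT htmlparser2's real implementation.
// Both tables are hypothetical subsets, assumed for illustration.
var VOID = ['br', 'hr', 'img', 'input', 'link', 'meta'];
var OPEN_IMPLIES_CLOSE = { input: ['button', 'input', 'select', 'textarea'] };

function reserialize(html, options) {
  var xmlMode = Boolean(options && options.xmlMode);
  var re = /<(\/?)([a-zA-Z][\w-]*)([^>]*?)(\/?)>|([^<]+)/g;
  var stack = [];
  var out = '';
  var m;
  while ((m = re.exec(html)) !== null) {
    if (m[5] !== undefined) { // text node
      out += m[5].trim();
      continue;
    }
    var name = m[2];
    if (m[1] === '/') { // closing tag
      var idx = stack.lastIndexOf(name);
      if (idx === -1) continue; // parent was already closed implicitly
      while (stack.length > idx) out += '</' + stack.pop() + '>';
      continue;
    }
    // HTML mode only: opening this tag may implicitly close open parents.
    var implied = xmlMode ? undefined : OPEN_IMPLIES_CLOSE[name];
    while (implied && stack.length &&
           implied.indexOf(stack[stack.length - 1]) !== -1) {
      out += '</' + stack.pop() + '>';
    }
    var attrs = m[3].trim();
    out += '<' + name + (attrs ? ' ' + attrs : '') + '>';
    if (VOID.indexOf(name) !== -1) continue; // void: never stays open
    if (xmlMode && m[4] === '/') { // XML mode honors <tag/>
      out += '</' + name + '>';
      continue;
    }
    stack.push(name);
  }
  while (stack.length) out += '</' + stack.pop() + '>';
  return out;
}

var snippet = '<button><input type="hidden"/></button>';
console.log(reserialize(snippet));                    // input hoisted out of button
console.log(reserialize(snippet, { xmlMode: true })); // input stays inside button
```

In this model the workaround mentioned above also falls out naturally: a `<div>` parent is not in the implied-close list, so the `<input>` stays nested even in HTML mode.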