WyriHaximus / HtmlCompress

MIT License

HTML Compress invalid #70

Open Muetze42 opened 5 years ago

Muetze42 commented 5 years ago

The HTML compressor generates non-existent closing input tags.

The original code produces no errors or warnings in the W3C validator, while the compressed version contains the error with </input>.

undcompressed.txt compressed.txt

WyriHaximus commented 5 years ago

It's probably doing that because the input isn't closed with />. /cc @voku

Muetze42 commented 5 years ago

Even with a closed tag, it happens. Besides, should this tag be closed in HTML5 at all, given that the W3C standard for HTML5 does not provide for it?

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input

WyriHaximus commented 5 years ago

Good question, tbh I have no clue. I outsourced the actual HTML compressing to @voku's https://github.com/voku/HtmlMin; this package just orchestrates sending the HTML there and the CSS and JS to other packages for compressing. I'll add an edge case soon to ensure this is handled.

voku commented 5 years ago

@WyriHaximus the bug was in a regex that could be rewritten without regex :) (the fixed version is 4.0.6)

"Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems." :+1:

Muetze42 commented 5 years ago

Ah. Before you update, I have one more: &#13; is written into the compressed output. These are not needed (at least I don't see any effect when I remove them) and are also reported as errors by the W3C validator.

voku commented 5 years ago

@MuetzeOfficial can you reformat your "one more" example?

Muetze42 commented 5 years ago

W3C Validator: Error: A numeric character reference expanded to carriage return. 
 The ";" is marked....

I removed the "&#13;" in my version without any visible problem....

uncompressed.txt compressed.txt

Muetze42 commented 5 years ago

I edited the post. With the code formatting, you can see it now....

voku commented 5 years ago

I think the carriage return is correct there; for reference see e.g. https://stackoverflow.com/posts/20528324/revisions

... but you can easily use \n instead of \r\n with e.g. \str_replace("\r\n", "\n", $html) before minifying.
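A minimal sketch of that suggestion, assuming the &#13; entities come from carriage returns in the input. The helper name normalizeLineEndings and the sample textarea markup are hypothetical and only for illustration; the actual minifier call is left out.

```php
<?php
// Hypothetical helper: converts Windows line endings (CRLF) to LF so that
// no carriage return is left over to be encoded as &#13; later on.
function normalizeLineEndings(string $html): string
{
    return \str_replace("\r\n", "\n", $html);
}

// Hypothetical sample input containing a CRLF line break.
$html = "<textarea>line one\r\nline two</textarea>";

// Normalize first, then pass $normalized to the minifier.
$normalized = normalizeLineEndings($html);
```

This only touches the \r\n pairs mentioned above; a lone \r would be left in place.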

Muetze42 commented 5 years ago

Currently I simply remove </input> and &#13; at the end, before output. From now on only the </input>, until the update. ^^
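A stopgap like the one described could look roughly like this. The function name stripMinifierArtifacts is hypothetical, and the sketch assumes the already minified HTML is available as a string right before it is sent out.

```php
<?php
// Hypothetical post-processing step: removes the stray </input> closing tags
// and the &#13; entities from already minified HTML.
function stripMinifierArtifacts(string $minified): string
{
    return \str_replace(['</input>', '&#13;'], '', $minified);
}

// Hypothetical minified output showing both problems.
$minified = '<input type="text" name="q"></input><p>first&#13;second</p>';
$clean = stripMinifierArtifacts($minified);
```

Note that this is a blunt replacement: it would also strip these strings where they appear on purpose, for example in an escaped code sample, so it only makes sense as a temporary workaround.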