Open chrispy-snps opened 1 month ago
This is a more specific follow-up to #182.
When the `&lt;` escape sequence is processed, it is incorrectly converted to a literal `<` instead of being kept as-is:
    >>> import minify_html_onepass
    >>> print(minify_html_onepass.minify("&lt;"))
    &lt;
    >>> print(minify_html_onepass.minify("&lt;faketag"))
    <faketag
    >>> print(minify_html_onepass.minify("&lt;faketag>"))
    <faketag>
Strangely, a bare `&lt;` by itself is processed correctly. It is only when it is followed by content that it breaks.
The issue occurs in both `minify_html` and `minify_html_onepass`.
We are able to work around it as follows:
    import minify_html

    html = html.replace("&lt;", "AMP_LT_WORKAROUND")
    html_minified = minify_html.minify(html)
    html_minified = html_minified.replace("AMP_LT_WORKAROUND", "&lt;")
but a proper fix would be better (and more efficient, as we process tens of thousands of HTML files at a time).
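For anyone who needs the workaround in the meantime, here is a minimal sketch of it wrapped in a reusable helper. The function name `minify_preserving_lt` and the sentinel check are illustrative assumptions, not part of either library; the minifier is passed in as a callable so the helper makes no assumptions about its signature beyond "takes a string, returns a string".

```python
from typing import Callable

# Placeholder from the workaround above; assumed never to occur in real input.
SENTINEL = "AMP_LT_WORKAROUND"


def minify_preserving_lt(html: str, minify: Callable[[str], str]) -> str:
    """Hide &lt; escapes from the minifier, then restore them afterwards.

    `minify` is any str -> str callable (hypothetical parameter name).
    """
    # Refuse to run if the placeholder already appears in the input,
    # since restoring it would then corrupt unrelated text.
    if SENTINEL in html:
        raise ValueError("sentinel string already present in input")

    protected = html.replace("&lt;", SENTINEL)
    minified = minify(protected)
    # Restore the escapes on the minified output, not on the original input.
    return minified.replace(SENTINEL, "&lt;")
```

It could be called as `minify_preserving_lt(html, minify_html.minify)`. This still costs two extra passes over every document, so it does not remove the need for a proper fix.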
Hi @chrispy-snps, thank you for the workaround.