Closed runar-rkmedia closed 6 years ago
Odd. minify
simply creates a Minifier
and feeds it a single bunch of input. Minifier.minify
does essentially the same thing, but first resets the internal state of the parser before running.
Can you get a copy of the HTML that causes the problem? Preferably trim it down to the minimum amount of content that still causes the problem to occur.
What I noticed so far:
1) Checking the HTML did not reveal missing tags.
2) One site is using htmlmin
directly, another site is using it via Flask-HTMLmin
. Both create a Minifier()
and make use of its minify()
function but only the Flask site produces this issue.
3) I tried feeding the identical HTML into minify()
and it did not reliably reproduce the error. It seems to be related to signing in and out. I am not using OAuth.
4) Signing in and out produces redirects, therefore I introduced a redirect into a simple route of the Flask app and it reliably reproduces the error.
I am trying to build a minimal example, but so far I have not been able to reproduce it.
Looking at the affected Flask app, I thought this would produce the error, but it does not:
from flask import Flask
from flask import redirect
from flask import url_for
from flask_htmlmin import HTMLMIN
app = Flask(__name__)
app.config['MINIFY_PAGE'] = True
HTMLMIN(app)
@app.route('/')
def hello_world():
return '<p> <a href="{}">Hello, World!</a> </p>'.format(
url_for('failure'))
@app.route('/failure/')
def failure():
return redirect(url_for('hello_world'))
I also got the problem when using Flask-HTMLmin, have not tried other libraries, or using this library directly. I will have a go at it when I get some time.
I tried printing the html
to console, and add the tag as part of the message while raising OpenTagNotFoundError
. Very strange result. The bad tag seems to be a p
-tag, but I have no p
-tags in the html at all. I tried with an almost empty html
-file, only <html><body></body></html>
and got no errors. I added a single div, like this <html><body><div></div></body></html>
and the error was consistently back.
I am unable to repro this problem with htmlmin directly:
import htmlmin
mini = htmlmin.Minifier()
mini.minify('<html><body><div></div></body></html>')
# u'<html><body><div></div></body></html>'
mini.minify('<html><body><div></div></body></html>')
# u'<html><body><div></div></body></html>'
mini.minify('<html><body><div></div></body></html>')
# u'<html><body><div></div></body></html>'
I am happy to work on this if you can get the problem repro'ed, but perhaps the problem is with Flask-HTMLMin?
The problem has nothing do with Flask-HTMLMin test.py:
from htmlmin import Minifier
default_options = {
'remove_comments': True,
'reduce_empty_attributes': True,
'remove_optional_attribute_quotes': False
}
html_minify = Minifier(
**default_options)
html1="""
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="http://localhost:8000/page/">http://localhost:8000/page/</a>. If not click the link.
"""
html2="""
<!DOCTYPE html>
<html>
<head>
<title>test...</title>
</head>
<body>
<p>test</p>
</body>
</html>
"""
html_minify.minify(html1)
html_minify.minify(html2) # take htmlmin.parser.OpenTagNotFoundError
html1 is returned when redirected: https://github.com/pallets/werkzeug/blob/4397e61daf66ad43bd5668741c5d876de116a71f/werkzeug/utils.py#L373
html2 landing page code
Thanks for a repro case! I will look at this soon.
@mankyd Thanks for fixing this issue! How about releasing a new version on PyPI?
Done
I ran into an odd issue here. Earlier, I used just the
minify(html)
-function, and it worked great. Lately, I've changed toMinifier.minify
, and ran into problems. On pages involving OAuth from Google, I getOpenTagNotFoundError
-errors from the parser. However, I tried removing almost all of the code that I sent to minification, but it still outputted an error. Switching back to only usingminify(html)
solves it.I have tried to debug this, but I am unable to figure it out. I tried sending only very basic html, of which I am certain is correct, and all tags are opened and closed in the correct order.
I see in the code that
minify
does not call anything in the parser, whichMinifier.minify
does. Is there a reason why there is a difference here?