Open idella opened 12 years ago
According to the HTML standard, <!foo> should be treated as a "bogus comment" [1,2]. That was fixed in Python 2.7 recently [3].
[1] http://www.w3.org/TR/html5/tokenization.html#markup-declaration-open-state
[2] http://www.w3.org/TR/html5/tokenization.html#bogus-comment-state
[3] http://bugs.python.org/issue13960
Testing of dev-python/mechanize-0.2.5 with CPython 2.7... .............................................................................................................................................................................................F..F.......................................................................ssssssssssssssssssssssssssssssssssss....................s...........................................................................................................................................................................................................................
FAIL: test_get_token (test.test_pullparser.PullParserTests)
Traceback (most recent call last):
  File "/mnt/gen2/TmpDir/portage/dev-python/mechanize-0.2.5/work/mechanize-0.2.5/test/test_pullparser.py", line 78, in test_get_token
    self._test_get_token(pc, tolerant)
  File "/mnt/gen2/TmpDir/portage/dev-python/mechanize-0.2.5/work/mechanize-0.2.5/test/test_pullparser.py", line 117, in _test_get_token
    self.assertEqual(p.get_token(), ("decl", "rheum", None))
AssertionError: Token('comment', 'rheum', None) != ('decl', 'rheum', None)
FAIL: test_tokens (test.test_pullparser.PullParserTests)
Traceback (most recent call last):
  File "/mnt/gen2/TmpDir/portage/dev-python/mechanize-0.2.5/work/mechanize-0.2.5/test/test_pullparser.py", line 274, in test_tokens
    self._test_tokens(pc, tolerant)
  File "/mnt/gen2/TmpDir/portage/dev-python/mechanize-0.2.5/work/mechanize-0.2.5/test/test_pullparser.py", line 290, in _test_tokens
    self.assertEquals(token.type, expected_token_types[i])
AssertionError: 'comment' != 'decl'
On the very first line the parser reads <!DOCTYPE and correctly evaluates it to a 'decl'. The failure comes when p.get_token() reads <!rheum> and evaluates it to 'comment' instead of 'decl'. The difference between the two appears to be simply that the first char after <! is lowercase, and the parser is not distinguishing it from <!--
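
For anyone who wants to reproduce this outside the mechanize test suite, here is a minimal sketch. It assumes Python 3's html.parser, which as far as I can tell carries the same fix from [3]. The TokenRecorder class and the tokenize() helper are made up for illustration and are not part of mechanize or its pullparser.

```python
from html.parser import HTMLParser


class TokenRecorder(HTMLParser):
    """Hypothetical helper: records (type, data) for declarations and comments."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_decl(self, decl):
        # Called for <!DOCTYPE ...> declarations.
        self.tokens.append(("decl", decl))

    def handle_comment(self, data):
        # Called for real comments and for "bogus comments" such as <!rheum>.
        self.tokens.append(("comment", data))


def tokenize(markup):
    """Feed markup to a fresh parser and return the recorded tokens."""
    parser = TokenRecorder()
    parser.feed(markup)
    parser.close()
    return parser.tokens


tokens = tokenize("<!DOCTYPE html><!rheum><!--rheum-->")
for token in tokens:
    print(token)
```

With the fixed parser, <!rheum> and <!--rheum--> should both come back as 'comment' tokens, while <!DOCTYPE html> is still a 'decl'. That matches the two assertion failures above, so the mechanize tests look like they encode the old behaviour.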