Closed kno10 closed 7 years ago
Here is a unit test fragment:
These are some difficult bad syntax examples from Wikipedia:
== First ==
[[User:User|User]]'''[[User talk:User|User''']]
== Second ==
<span style="font-face: bold">... [[User:User|<b>User</b>]]'''</span>
== Third ==
<b style="color:red;">¹''' - yes, this is real, people write such markup.
Note that in the bottom one, Wikipedia appears to allow <b>bold'''' non-bold, but Sweble would interpret this as <b>bold'''double bold'''</b>
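To find lines like these in a larger article, one rough heuristic is to flag every line that opens an HTML <b> tag and also contains an odd number of apostrophe bold toggles. The sketch below is only that, a heuristic in plain Python. It does not use Sweble or MediaWiki code, the names bold_toggles and suspicious_lines are made up for illustration, and it assumes that any run of three or more apostrophes toggles bold exactly once.

```python
import re

# Runs of two or more apostrophes; only runs of length >= 3 affect bold.
APOSTROPHE_RUN = re.compile(r"'{2,}")
# An opening HTML bold tag, with or without attributes (does not match </b>).
HTML_BOLD_OPEN = re.compile(r"<b\b[^>]*>", re.IGNORECASE)


def bold_toggles(line):
    """Count apostrophe runs on this line that toggle bold.

    Assumption: a run of 3 or more apostrophes toggles bold once;
    a run of exactly 2 is italic only and is ignored.
    """
    return sum(1 for run in APOSTROPHE_RUN.findall(line) if len(run) >= 3)


def suspicious_lines(wikitext):
    """Return (line_number, line) pairs that mix <b> with an unpaired '''."""
    hits = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        if HTML_BOLD_OPEN.search(line) and bold_toggles(line) % 2 == 1:
            hits.append((number, line))
    return hits


# The unit test fragment from this issue (markup only).
FRAGMENT = """== First ==
[[User:User|User]]'''[[User talk:User|User''']]
== Second ==
<span style="font-face: bold">... [[User:User|<b>User</b>]]'''</span>
== Third ==
<b style="color:red;">¹'''
"""

for number, line in suspicious_lines(FRAGMENT):
    print(number, line)
```

This flags the "Second" and "Third" examples. The "First" example has paired apostrophes and no <b> tag, so it needs a different check (bold that crosses a link boundary).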
A beautiful collection of horrible markup :)
I'll see what I can do about it.
The unit test fragment above does not cause an internal error when I test it. Are these two separate problems reported in one issue?
Yes, apparently the InternalError is caused by something else.
I expect my commit e8b3562b8159cee731efa07951f8f8b6899a75ca to solve the exception (but I can't tell yet if it helps - it did involve <b> XML nodes when non-XML bold was expected). It has been running for 50 minutes without an error so far.
The markup above causes some interesting errors (in particular a stray </#int-link>), so I shared it as-is, even though it apparently is not enough to trigger the bug.
You can try the full Wikipedia article https://en.wikipedia.org/w/index.php?title=John_Elway&action=edit to reproduce the bug. Maybe it needs to be in a table to trigger the original bug.
I stumbled over that strange </#int-link> as well. Curious what I did there...
Fixed in version 2.2.0
In John Elway:
This page uses (see table "regular season") the rather crazy syntax <b style="color:red;">foo''', maybe this is causing the problem? It seems to be fixed after "patching" this article.

Also in Wikipedia:Naming policy poll:
Here, this is probably caused by this fragment: [[User:RickK|Rick]]'''[[User talk:RickK|K''']]. This appears to be a case of "propagatable inline formatting" as discussed in your paper, but does not seem to be correctly applied (maybe because there is no italic text in front of the nested element?) or it is the unclosed bold in <span>... [[User:KuwarOnline|<b>KuwarOnline</b>]]'''</span> on that page (removing both fixed it, the first wasn't enough).
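
Both fragments from the Naming policy poll page have the same shape: a ''' that does not open and close at the same link nesting level, or never closes at all. A small checker for that shape could look like the sketch below. It is a standalone Python illustration, not Sweble code. The name bold_link_problems is hypothetical, and it assumes that only [[, ]] and runs of three or more apostrophes matter.

```python
import re

# Tokens of interest: link open, link close, or a bold-toggling apostrophe run.
TOKEN = re.compile(r"\[\[|\]\]|'{3,}")


def bold_link_problems(line):
    """Report bold spans that cross a [[...]] boundary or stay open.

    Assumption: each run of 3+ apostrophes toggles bold once, and bold
    is implicitly closed at the end of the line.
    """
    problems = []
    depth = 0              # current [[...]] nesting depth
    bold_opened_at = None  # link depth at which bold was opened, if open
    for match in TOKEN.finditer(line):
        token = match.group()
        if token == "[[":
            depth += 1
        elif token == "]]":
            depth = max(depth - 1, 0)
        elif bold_opened_at is None:
            bold_opened_at = depth
        else:
            if bold_opened_at != depth:
                problems.append(
                    f"bold opened at link depth {bold_opened_at}, "
                    f"closed at link depth {depth}"
                )
            bold_opened_at = None
    if bold_opened_at is not None:
        problems.append("bold still open at end of line")
    return problems


print(bold_link_problems("[[User:RickK|Rick]]'''[[User talk:RickK|K''']]"))
print(bold_link_problems(
    "<span>... [[User:KuwarOnline|<b>KuwarOnline</b>]]'''</span>"
))
```

The first call reports a bold span that opens outside the link and closes inside it. The second reports bold that is still open at the end of the line. A well-formed line such as '''[[User:RickK|Rick]]''' yields an empty list.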