Open desb42 opened 4 years ago
Thanks as always for the detail. If it helps any, screenshots aren't necessary as your breakdown is more than helpful (just trying to save you any possible work).
How slavish should xowa be to incorrect wikitext?
Yeah, XOWA tries to handle incorrect wikitext, but the XOWA parser is brittle, especially around templates, but also with XML nodes. I haven't looked at the code, but in this case, I'm guessing XOWA gives priority to closing XML tags (pulling the </big> tag) before trying any corrective action. The XML priority is needed to handle "extension" tags like <ref>, <poem>, etc., which are like their own "mini-DOM".
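To make that guess concrete, here is a toy sketch of what "closing XML tags take priority" could look like. It is not XOWA's code: the tokenizer, the function name `close_with_xml_priority`, and the sample strings are all hypothetical, and it only knows about <big> and [[ ]].

```python
import re

# Toy tokens: <big>, </big>, [[, ]], a run of plain text, or a single stray char.
TOKEN = re.compile(r"</?big>|\[\[|\]\]|[^<\[\]]+|.", re.S)

def close_with_xml_priority(text):
    """Return the openers that a closing </big> would abandon (hypothetical model)."""
    stack, abandoned = [], []
    for tok in TOKEN.findall(text):
        if tok in ("<big>", "[["):
            stack.append(tok)
        elif tok == "]]":
            # A link only closes if it is the innermost open construct.
            if stack and stack[-1] == "[[":
                stack.pop()
        elif tok == "</big>" and "<big>" in stack:
            # XML priority: unwind everything above the matching <big>.
            while stack[-1] != "<big>":
                abandoned.append(stack.pop())
            stack.pop()
    return abandoned
```

Under this model, `close_with_xml_priority("<big>[[Page|label</big>]]")` reports the `[[` as abandoned, so the link would never be completed and its brackets would be left as literal text. With the tags properly nested, nothing is abandoned.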
My inclination is to go edit the wikitext on the mediawiki site
If it's a one-off, then that's probably best. If you're seeing this often (like it's generated by a Template / Module), then I'll look at a longer-term fix
Also, just sharing some other background
I'm working on version 3 of the XOWA parser
I'm not sure how Version 3 will turn out, as it's ambitious in scope (one-step transpiling of MediaWiki PHP code to Java). I've manually transpiled enough PHP code in Version 2 that I think this is feasible, but it could be a very deep rabbit-hole. I'll know by the end of this month what a possible timeline is.
I'm bringing this up b/c at this point you're actually an (if not "the") authority on all the bugs in the Version 1 parser. Depending on the bug's severity, please feel free to bump / nudge, and I'll prioritize.
My prioritization order has been:
Feel free to add other guidelines above, or just let me know if there's a specific issue that needs fixing.
Thanks!
Looking at de.wikisource.org/wiki/Vorlage:Hauptseite_Box_Aktuell (which is part of the Main page in dewikisource, data from 2020-05-01) gives:
Note the double square brackets [[ .. ]]. This ought to have been processed.
In case this changes, the actual wikitext is:
This is actually, syntactically, incorrect. If this is 'corrected' by moving the '''</big> to the other side of the ]], the anticipated behaviour occurs.
How slavish should xowa be to incorrect wikitext?
My inclination is to go edit the wikitext on the mediawiki site.
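For reference, the manual correction described above (moving the '''</big> to the other side of the ]]) can be sketched as a small rewrite. The sample wikitext here is hypothetical, since the actual page source is not reproduced in this thread, and `move_closers_outside_link` is an illustrative name, not anything in XOWA or MediaWiki.

```python
import re

# Matches a link whose '''</big> closers sit just inside the closing ]].
MISNESTED = re.compile(r"(\[\[[^\[\]]*?)('''</big>)(\]\])")

def move_closers_outside_link(wikitext):
    """Move '''</big> from just before ]] to just after it."""
    return MISNESTED.sub(r"\1\3\2", wikitext)

# Hypothetical mis-nested input, shaped like the case described above.
broken = "<big>'''[[Page|label'''</big>]]"
fixed = move_closers_outside_link(broken)
```

Correctly nested wikitext is left unchanged, so a pass like this could in principle be used to spot how often the pattern occurs before deciding between editing the page and changing the parser.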