sweble / sweble-wikitext

The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaWiki.
http://sweble.org/sites/swc-devel/develop-latest/tooling/sweble/sweble-wikitext

Too aggressive parsing of `-{ }-` Language Converter tags #48

Closed: kno10 closed this issue 7 years ago

kno10 commented 8 years ago

Language Converter tags appear to be a rarely used feature on English Wikipedia (as far as I can tell, the feature is not enabled there at all). It took me a while to find any documentation on it (see https://www.mediawiki.org/wiki/Writing_systems/Syntax). It is probably mostly relevant for Chinese, although there will of course be people who would like to use `-{en-us: neighbor; en-uk: neighbour}-`.

The tag detection does, however, lead to a number of parsing errors on English Wikipedia. Here is an example:

IUPACName=<small>4-(2-{4-[(11''R'')-3,10-dibromo-8-chloro-6,11-dihydro-5H-benzo[5,6]cyclohepta[1,2-''b'']pyridin-11-yl]piperidin-1-yl}-2-oxoethyl)piperidine-1-carboxamide</small>

from https://en.wikipedia.org/wiki/Lonafarnib

Note that this contains `[ ]` that is not a link, and `-{ }-` that is not a Language Converter tag (LCT). The proper fix would likely be to add a `<nowiki>` tag in the source. But maybe the LCT detection could be made less aggressive and emit a warning instead, or there could be a simple way to disable LCT parsing?
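To illustrate the "warn instead" idea, here is a minimal Python sketch of a heuristic that runs outside the parser. It is not Sweble code: the names `looks_like_lc_tag` and `suspicious_spans` and the rule pattern are assumptions made for this example. It treats a `-{ }-` span as a likely conversion tag only if it contains a `language-code:` rule, and reports everything else as a candidate for a warning.

```python
import re

# Non-greedy match of "-{ ... }-" spans. Nested "-{" is not handled (simplification).
LC_SPAN = re.compile(r"-\{(.*?)\}-", re.DOTALL)

# A conversion rule starts with a language code and a colon, e.g. "zh-hans:" or "en-us:".
LC_RULE = re.compile(r"^\s*[a-z]{2,3}(?:-[a-z]{2,4})?\s*:", re.IGNORECASE)


def looks_like_lc_tag(inner: str) -> bool:
    """Guess whether the text between -{ and }- holds at least one conversion rule."""
    # Drop an optional flag prefix such as "A|" before inspecting the rules.
    body = inner.split("|", 1)[1] if "|" in inner else inner
    return any(LC_RULE.match(part) for part in body.split(";"))


def suspicious_spans(wikitext: str) -> list[str]:
    """Return the -{ }- spans that do not look like conversion rules."""
    return [
        m.group(0)
        for m in LC_SPAN.finditer(wikitext)
        if not looks_like_lc_tag(m.group(1))
    ]
```

With this heuristic, the span in the IUPAC name above would be reported, while `-{en-us: neighbor; en-uk: neighbour}-` would not. It would also report spans that use `-{ }-` to hold plain text without any rule, so it can only produce a warning and cannot decide how to parse.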

hannesd commented 8 years ago

I think I'll add a switch. If I remember correctly, it's possible to use the `-{ }-` syntax as a replacement for `<nowiki>`. Since almost anything can go into a nowiki element, it's probably not possible to make the `-{ }-` syntax less aggressive.
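Until a parser with such a switch is available, a caller could approximate one by pre-processing the wikitext. The following Python sketch is an assumption-laden workaround, not the switch itself and not part of Sweble's API: the function name `neutralize_lc_openers` and its `lc_enabled` parameter are hypothetical. It breaks every `-{` opener with an empty `<nowiki/>` element, assuming the parser treats `<nowiki/>` as an empty element that separates the two characters.

```python
def neutralize_lc_openers(wikitext: str, lc_enabled: bool = False) -> str:
    """Caller-side stand-in for an 'LCT on/off' switch (hypothetical helper).

    When lc_enabled is False, every "-{" is split with an empty <nowiki/>
    element so that it can no longer open a Language Converter tag.
    """
    if lc_enabled:
        # Leave the text alone on wikis that actually use conversion tags.
        return wikitext
    # Assumption: the parser sees "-<nowiki/>{" as plain "-" followed by "{".
    return wikitext.replace("-{", "-<nowiki/>{")
```

Breaking only the opener, instead of wrapping the whole span in `<nowiki>`, leaves markup inside the span (such as the `''R''` italics in the example) untouched. The sketch does not skip regions that are already inside `<nowiki>`, `<pre>`, or comments, where the inserted element would show up literally.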

hannesd commented 7 years ago

Fixed in version 2.2.0.