gnosygnu / xowa

xowa offline wiki application

{{gallery}} and <ref> #493

Open desb42 opened 5 years ago

desb42 commented 5 years ago

I am building enwiki 2019-06-01 at the moment and was looking at some of the error output when I stumbled across this page: en.wikipedia.org/wiki/Kay_Musical_Instrument_Company. The wikitext of interest is quite a long way down the page (search for k573>). A snippet (showing only the first image of the gallery):

{{Gallery |width=95 | height=144
|File:Truetone Jazz King (Kay Speed Demon K573).jpg
|<!-- 86px -->Speed Demon K573<ref name=k573>
    {{cite web
     | title     = Kay/Silvertone: Speed Demon (K573) c. 1964
     | url       = http://www.vintagesilvertones.com/forsale_gtr-kay_speeddemon.html
     | publisher = VintageSilvertone.com
    }}
    </ref> / Truetone Jazz King (1960s)<ref name=truetone/><ref name=jazzking group=media>
    {{cite AV media
     | date      = 2009-11-02
     | title     = 1963 Truetone Jazz King Vintage Electric Guitar AKA Silvertone - Kay Speed Demon model K573
     | url       = https://www.youtube.com/watch?v=6q3L69hfOwo
     | medium    = video
    }}<br/>Note: Not yet found sources other than YouTube.
    </ref>
}}

The current processing (in Gallery_parser.java) assumes that each gallery item sits on a single line, so only the line

|<!-- 86px -->Speed Demon K573<ref name=k573>

is processed.

Strictly, the inner sections should be processed first. That is, if the <ref>s were parsed first (or at least tokenised), this would appear as one line.
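To make the "tokenise first" idea concrete, here is a minimal sketch of a line splitter that does not break on newlines inside a <ref> body. It is not XOWA code: the class name RefAwareLineSplitter and its split method are hypothetical, tag matching is case-sensitive, and HTML comments and nested templates outside <ref> are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XOWA: splits gallery text into logical
// lines, keeping newlines that fall inside <ref>...</ref> on the same line.
final class RefAwareLineSplitter {

    // True if an opening <ref ...> or <ref .../> tag starts at pos
    // (and not, for example, <references/>).
    private static boolean isRefOpen(String src, int pos) {
        return src.startsWith("<ref", pos)
            && pos + 4 < src.length()
            && " \t\n>/".indexOf(src.charAt(pos + 4)) >= 0;
    }

    static List<String> split(String src) {
        List<String> lines = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inRef = false;
        int i = 0;
        while (i < src.length()) {
            if (!inRef && isRefOpen(src, i)) {
                int close = src.indexOf('>', i);
                if (close < 0) {            // unterminated tag: keep the rest as-is
                    cur.append(src.substring(i));
                    break;
                }
                // A self-closing tag such as <ref name=x/> opens no body.
                inRef = src.charAt(close - 1) != '/';
                cur.append(src, i, close + 1);
                i = close + 1;
            } else if (inRef && src.startsWith("</ref>", i)) {
                inRef = false;
                cur.append("</ref>");
                i += "</ref>".length();
            } else {
                char c = src.charAt(i++);
                if (c == '\n' && !inRef) {  // only a top-level newline ends a line
                    lines.add(cur.toString());
                    cur.setLength(0);
                } else {
                    cur.append(c);
                }
            }
        }
        if (cur.length() > 0) lines.add(cur.toString());
        return lines;
    }

    public static void main(String[] args) {
        String gallery = "|File:A.jpg\n|Caption<ref name=k573>\n{{cite web\n}}\n</ref> / More\n|File:B.jpg";
        for (String line : split(gallery)) {
            System.out.println("[" + line.replace("\n", "\\n") + "]");
        }
    }
}
```

With a splitter along these lines, the caption and its whole <ref> body reach the gallery code as one item, so the parser's single-line assumption could stay as it is.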

gnosygnu commented 5 years ago

Thanks for the detail. I think this is going to be a hard problem.

If I remember correctly, MediaWiki uses StripState to ignore <ref> (and other xml nodes) during the first pass. XOWA does not, which leads to odd cases when you have ref tags inside template expressions (anything between {{ and }}).

I'll triage this a little more later, but it could be a while. Are you seeing a lot of these errors?
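For illustration only, the strip-and-restore idea mentioned above could look roughly like the sketch below. This is neither MediaWiki's StripState nor XOWA code: the class name StripStateSketch, the marker format, and the regular expression are all assumptions made for the example, and only <ref> tags are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: replace each <ref> node with a one-line marker before
// the line-based pass, then put the original text back afterwards.
final class StripStateSketch {
    // Matches <ref .../> or <ref ...>...</ref>; DOTALL lets the body span lines.
    private static final Pattern REF = Pattern.compile(
        "<ref(?:\\s[^>]*)?/>|<ref(?:\\s[^>]*)?>.*?</ref>", Pattern.DOTALL);

    // Made-up marker delimiter (the DEL character), assumed absent from wikitext.
    private static final String MARK = "\u007f";

    private final List<String> stripped = new ArrayList<>();

    // Returns src with every <ref> node replaced by a marker such as DEL REF0 DEL.
    String strip(String src) {
        Matcher m = REF.matcher(src);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(src, last, m.start());
            out.append(MARK).append("REF").append(stripped.size()).append(MARK);
            stripped.add(m.group());
            last = m.end();
        }
        out.append(src.substring(last));
        return out.toString();
    }

    // Restores the original <ref> text for every marker produced by strip().
    String unstrip(String text) {
        String result = text;
        for (int i = 0; i < stripped.size(); i++) {
            result = result.replace(MARK + "REF" + i + MARK, stripped.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        StripStateSketch state = new StripStateSketch();
        String src = "|Caption<ref name=k573>\n{{cite web}}\n</ref> / More\n|File:B.jpg";
        String flat = state.strip(src);
        System.out.println(flat.split("\n").length + " lines after stripping");
        System.out.println(state.unstrip(flat).equals(src) ? "round trip ok" : "round trip failed");
    }
}
```

The trade-off in this sketch is that a caption containing a marker has to be unstripped before the <ref> content is parsed, so the order of the passes matters.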