Closed kristian-clausal closed 1 month ago
<includeonly>
test
test
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]</includeonly>
results in "\n\n\ntest\n\ntest", but
<includeonly>test
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]</includeonly>
results in "test", with no trailing newlines, or rather, no trailing whitespace.
Here are the rules, I think:
</includeonly>, render it. I'm not sure this is documented anywhere.
A wrinkle: we remove includeonly tags when adding the page to the database. Just rip them out.
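As a rough illustration of what "just rip them out" means, here is a minimal sketch. The helper name strip_includeonly_tags and the regex are assumptions made for this example, not the actual wikitextprocessor code. Note that the contents, including any leading newline, survive untouched:

```python
import re

# Hypothetical helper for illustration; not the real wikitextprocessor code.
INCLUDEONLY_TAG_RE = re.compile(r"</?includeonly\s*/?>", re.IGNORECASE)


def strip_includeonly_tags(wikitext: str) -> str:
    """Remove the <includeonly> and </includeonly> tags, keeping their contents."""
    return INCLUDEONLY_TAG_RE.sub("", wikitext)


body = "<includeonly>\ntest\n[[Category:Transitive verbs]]</includeonly>"
# The newline right after the opening tag is still there after stripping.
print(repr(strip_includeonly_tags(body)))
```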
I think MediaWiki doesn't change newlines in <includeonly>. I have tested your examples on a sandbox page and on the Special:ExpandTemplates page, and newlines in <includeonly> are not removed. We should get the same expanded wikitext as the "Result" section of the Special:ExpandTemplates page.
--<includeonly>
test
test
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]</includeonly>--
expands to:
--
test
test
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]--
and
--<includeonly>test
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]</includeonly>--
expands to:
--test
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]--
I guess you only looked at the preview HTML? IMO that's not what we should imitate, at least not before or during expansion; it should happen when we convert wikitext to plain text. This also happens when <includeonly> is not used.
For the Simple English edition extractor code, maybe use str.replace("\n", "") for the moment? We're supposed to process the tag template separately from the gloss text anyway.
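A minimal sketch of that stopgap, assuming the tag template has already been expanded to a plain string on its own. The names expanded_tag and flatten_tag are made up for this example and do not exist in the extractor:

```python
# Hypothetical expanded output of a tag template such as {{ti verb}}.
expanded_tag = "(transitive)\n"


def flatten_tag(text: str) -> str:
    """Drop every newline from the expanded tag template text."""
    return text.replace("\n", "")


# The tag is cleaned separately and only then joined with the gloss text.
gloss = flatten_tag(expanded_tag) + " Foo"
print(repr(gloss))
```

This only makes sense if the tag template is handled separately, because removing newlines from a whole gloss could glue unrelated lines together.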
Currently, some templates generate newlines inside things like glosses. This is not acceptable:
Simple English Wiktionary:
# {{ti verb}} Foo
-> (transitive)\n Foo
This should be (transitive) Foo, like on the webpage.
I will ignore what the Expand templates page says, because there is something fucky going on. At some point, the contents of the includeonly get rstrip()ped or whatever the PHP equivalent is, probably after the category links have been expanded. But currently we're not handling includeonly at all, just removing the tags!
If you can't come up with a better solution, I will merge this. The result is what matters, because our implementation definitely does not follow the wikitext implementation; at best we're approximating it. This IS just a hack, but it's better than nothing.
I'm not sure if you noticed that the newlines are also removed when <includeonly> is not used. I think this conversion happens when MediaWiki converts wikitext to HTML; it is not related to how <includeonly> is handled.
The same goes for our code: I think this is the same as how we remove category links from expanded wikitext. I think MediaWiki also removes newlines around these links at this step.
You are correct! Damn it. I thought it was the includeonly.
Simple English Wiktionary -> Template:ti verb -> edit and use "Preview page with this template" with "excrete"
(''[[transitive|<span style="color:green">transitive</span>]] & [[intransitive|<span style="color:green">intransitive</span>]]'')
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]t
-> 1. ([biology](https://simple.wiktionary.org/wiki/biology)) ([transitive](https://simple.wiktionary.org/wiki/transitive) & [intransitive](https://simple.wiktionary.org/wiki/intransitive))t If your body excretes waste material,
but
(''[[transitive|<span style="color:green">transitive</span>]] & [[intransitive|<span style="color:green">intransitive</span>]]'')
[[Category:Transitive verbs]]
[[Category:Intransitive verbs]]
t
-> 1. ([biology](https://simple.wiktionary.org/wiki/biology)) ([transitive](https://simple.wiktionary.org/wiki/transitive) & [intransitive](https://simple.wiktionary.org/wiki/intransitive)) t If your body excretes waste material,
The newlines and whitespace are removed before(?) the Category links; it doesn't seem to have anything to do with trimming the end of the expanded value.
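To make the observation concrete, here is a small sketch that reproduces the two previews above by deleting each category link together with the whitespace in front of it. This is only an approximation of what was observed, not MediaWiki's documented rule; the regex and the function name drop_category_links are assumptions for this example:

```python
import re

# A category link plus any whitespace (including newlines) directly before it.
CATEGORY_WITH_LEADING_WS_RE = re.compile(r"\s*\[\[Category:[^\[\]]*\]\]")


def drop_category_links(text: str) -> str:
    """Remove category links and the whitespace preceding each of them."""
    return CATEGORY_WITH_LEADING_WS_RE.sub("", text)


first = "(transitive)\n[[Category:Transitive verbs]]\n[[Category:Intransitive verbs]]t"
second = "(transitive)\n[[Category:Transitive verbs]]\n[[Category:Intransitive verbs]]\nt"

print(repr(drop_category_links(first)))   # no gap left before the "t"
print(repr(drop_category_links(second)))  # one newline left before the "t"
```

In the second case the single remaining newline would show up as a space once the text is rendered, which matches the ") t" in the preview.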
Maybe we should ask a MediaWiki developer what the rules are for converting newlines around category links to HTML, if this is not documented...
In the meantime, I think we could temporarily work around this problem by extracting the link nodes in the expanded tag template and saving its category links by calling clean_node().
Instead of fixing this here, PR 843 for wiktextract solves the problem on wiktextract's side, in clean_value.
It's a bit weird that there's no really convenient place to fix this on wikitextprocessor's side (we do have all the to_X functions in Wtp), and it might be better to have clean_value's and clean_node's functionality on the wikitextprocessor side. It's a bit weird now.
For example: Template:ti verb is used on one page, excrete. The includeonly element has a newline, which is rendered on our side as 'glosses': ['(transitive & intransitive)\n Foooo.']. The Template:biology on the same page doesn't get the newline, because there is no newline inside the includeonly.
I mixed up onlyinclude and includeonly for a while and it took me a while to understand which is which... onlyinclude is text that is the only thing you want to output when the template is expanded (anything outside of it is discarded). includeonly, which is what is at issue here, is a piece of text like a Category link that you don't want to appear on the template's own display page. That is, you don't want Template:ti verb to appear in the Transitive verbs category, so includeonly will only let the category link be rendered when the template is being expanded on some other page.
However, I don't understand why the newline disappears in this case. Either this is so common in wikitext that they just went ahead and removed all newlines in includeonly, or something else weird.
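Since the tags are easy to confuse, here is a deliberately simplified sketch of the difference, contrasting what is kept when a template is transcluded with what is shown on the template's own page. It is regex-based, ignores nesting and edge cases, and the function names transcluded_view and template_page_view are invented for this example; it is not how MediaWiki or wikitextprocessor implements it:

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL


def transcluded_view(template_text: str) -> str:
    """Approximate what is kept when the template is expanded on another page."""
    # <noinclude> content belongs to the template's own page only.
    text = re.sub(r"<noinclude>.*?</noinclude>", "", template_text, flags=FLAGS)
    # If <onlyinclude> is present, only its contents are transcluded.
    only = re.findall(r"<onlyinclude>(.*?)</onlyinclude>", text, flags=FLAGS)
    if only:
        text = "".join(only)
    # <includeonly> contents are kept; only the tags themselves go away.
    return re.sub(r"</?includeonly>", "", text, flags=re.IGNORECASE)


def template_page_view(template_text: str) -> str:
    """Approximate what is shown on the template's own page."""
    # <includeonly> contents are hidden on the template's own page.
    text = re.sub(r"<includeonly>.*?</includeonly>", "", template_text, flags=FLAGS)
    # The other two tags are transparent here.
    return re.sub(r"</?(?:noinclude|onlyinclude)>", "", text, flags=re.IGNORECASE)


tpl = (
    "(transitive)<includeonly>[[Category:Transitive verbs]]</includeonly>"
    "<noinclude> docs</noinclude>"
)
print(transcluded_view(tpl))    # category link kept, docs dropped
print(template_page_view(tpl))  # category link hidden, docs shown
```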
Has a newline after "transitive)" on Wiktionary
Does not have a newline after "transitive)" on Wiktionary