earwig / mwparserfromhell

A Python parser for MediaWiki wikicode
https://mwparserfromhell.readthedocs.io/
MIT License
757 stars, 75 forks

newline included in template name #261

Closed. matkoniecz closed this issue 3 years ago.

matkoniecz commented 3 years ago
import mwparserfromhell

text = """
{{ValueDescription
|key=building
}}
"""

wikicode = mwparserfromhell.parse(text)
templates = wikicode.filter_templates()
template = templates[0]
print("<" + str(template.name) + ">")
print("<" + str(template.name.strip()) + ">")

This outputs:

<ValueDescription
>
<ValueDescription>

There is an easy workaround, but it would be nice if this worked out of the box.

Though maybe there is a reason why that is impossible?

I found https://github.com/earwig/mwparserfromhell/issues/14, which sounds like the same issue and seems to have been fixed? So maybe this is fixable too?

I searched the issues for "template newline" (https://github.com/earwig/mwparserfromhell/issues?q=template+newline+) and "template whitespace" (https://github.com/earwig/mwparserfromhell/issues?q=template+whitespace), searched the repository (https://github.com/earwig/mwparserfromhell/search?q=template+whitespace), and looked for "whitespace", "newline" and "limitation", which turned up https://github.com/earwig/mwparserfromhell#limitations

matkoniecz commented 3 years ago

I see the same with template parameters.

lahwaacz commented 3 years ago

This is intentional: if you parse wikitext and don't change it, str(wikicode) should produce exactly the input string that was parsed.

legoktm commented 3 years ago

What @lahwaacz said. Those objects have a matches() function which takes care of whitespace and capitalization differences for you (as explained in the README).