earwig / mwparserfromhell

A Python parser for MediaWiki wikicode
https://mwparserfromhell.readthedocs.io/
MIT License

Wikicode string methods do not completely correspond to their string equivalents #153

Open ghost opened 8 years ago

ghost commented 8 years ago
>>> w
'[[link|text]]'
>>> w.text
'text'
>>> w.text.replace("t", '')
>>> w.text
'ex'
>>> w.text.replace("doesntexist", '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andwang/.local/lib/python3.4/site-packages/mwparserfromhell/wikicode.py", line 371, in replace
    for exact, context, index in self._do_weak_search(obj, recursive):
  File "/home/andwang/.local/lib/python3.4/site-packages/mwparserfromhell/wikicode.py", line 160, in _do_weak_search
    raise ValueError(obj)
ValueError: doesntexist

Please fix.

ghost commented 8 years ago

Mostly the part where trying to replace something that is not contained in the string raises an exception instead of leaving the original string unchanged:

>>> "aoeui".replace("j", '')
'aoeui'

lahwaacz commented 8 years ago

Your variable w is not a string (str) but a Wikicode object. Its replace method (documented here) works differently from str.replace: a ValueError is raised whenever the target is not found, regardless of the parameter type, which can be a Node, Wikicode, or str. A str argument is first parsed into another Wikicode object, so it raises the same exception for consistency.

If you want to do str-like replacement and get a str instead of Wikicode, do str(w).replace("doesntexist", "").
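
If you would rather keep working with the object and just tolerate a missing target, a small wrapper can swallow the ValueError. This is only a sketch: replace_if_present and InPlaceText are hypothetical names, not part of mwparserfromhell, and InPlaceText is a stand-in that mimics the behaviour shown in this thread (in-place replace, ValueError when nothing matches) so the example runs without the library installed.

```python
class InPlaceText:
    """Hypothetical stand-in for an object whose replace() mutates in place
    and raises ValueError when the target is absent, as described above."""

    def __init__(self, text):
        self._text = text

    def replace(self, old, new):
        if old not in self._text:
            raise ValueError(old)
        self._text = self._text.replace(old, new)

    def __str__(self):
        return self._text


def replace_if_present(code, old, new):
    """Call code.replace(old, new); return False instead of raising
    when the target is not found, True otherwise."""
    try:
        code.replace(old, new)
    except ValueError:
        return False
    return True


text = InPlaceText("text")
print(replace_if_present(text, "t", ""), str(text))            # True ex
print(replace_if_present(text, "doesntexist", ""), str(text))  # False ex
```

The helper only relies on a replace method that raises ValueError, so passing w.text from the report above in place of the stand-in should behave the same way, assuming the library still raises as shown in the traceback.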

earwig commented 8 years ago

You're right, though I see the point. There's a benefit in having the API match the regular string API as much as possible, though I'm uncomfortable making a breaking change here. I'll think about it, maybe along with the larger changes for v1.0...

lahwaacz commented 8 years ago

Note that str.replace returns a copy of the string instead of changing the instance. Wikicode.replace will most likely never do this, so there will always be some difference.
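
To make the copy-versus-in-place difference concrete, here is a minimal sketch of a copy-returning helper built on str(). The name replaced_copy is hypothetical; it accepts anything that can be converted with str(), which is why a plain string is enough to demonstrate it.

```python
def replaced_copy(code, old, new):
    """Return a new str with old replaced by new; the input is never modified.
    A missing target simply yields the unchanged text, as with str.replace."""
    return str(code).replace(old, new)


original = "[[link|text]]"
changed = replaced_copy(original, "text", "label")

print(changed)                                    # [[link|label]]
print(original)                                   # [[link|text]]
print(replaced_copy(original, "doesntexist", ""))  # [[link|text]]
```

The result is a plain str, not a Wikicode object; it would have to be run through mwparserfromhell.parse again to get the node-level API back.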