WolfgangFahl / py-3rdparty-mediawiki

Wrapper for pywikibot and mwclient MediaWiki API libraries with improvements for 3rd party wikis
Apache License 2.0

Escaping of the query leads to different queries #55

Closed tholzheim closed 1 year ago

tholzheim commented 3 years ago

During query escaping, a blank is replaced with an underscore; see: https://github.com/WolfgangFahl/py-3rdparty-mediawiki/blob/a7872b31fa41e65846b9092792dc292ac6a5fb7e/wikibot/smw.py#L206

For example, the query "[[isA::Event series]]" is converted to "[[isA::Event_series]]". On www.openresearch.org the two queries return distinct results, but on my own wiki they do not. A quick test showed that removing the line mentioned above seems to resolve the issue (the query is then encoded by the requests library, " " to "+").
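A minimal sketch of the effect described above, using only the standard library. `escape_with_underscore` is a hypothetical stand-in for the linked escaping line, not the actual code in `smw.py`, and `urlencode` is used here to mimic the form encoding that requests applies (blank to "+"):

```python
from urllib.parse import urlencode


def escape_with_underscore(query: str) -> str:
    """Hypothetical stand-in for the escaping step: blanks become underscores."""
    return query.replace(" ", "_")


with_blank = "[[isA::Event series]]"
with_underscore = "[[isA::Event_series]]"

# After the replacement both inputs are the same string,
# so the wiki can no longer tell the two queries apart.
same_after_escape = (
    escape_with_underscore(with_blank) == escape_with_underscore(with_underscore)
)

# Without the replacement, standard form encoding keeps the blank (as "+")
# and leaves the underscore untouched, so the two queries stay distinct.
encoded_blank = urlencode({"query": with_blank})
encoded_underscore = urlencode({"query": with_underscore})

print(same_after_escape)
print(encoded_blank)
print(encoded_underscore)
```

Whether the wiki then treats the blank and the underscore as the same value is up to the wiki, which is the open question here.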

Should this issue be fixed or should it stay dependent on the wiki configuration?

WolfgangFahl commented 3 years ago

Let's add a wikiquery functionality to prove the point first. I'll add an issue.

WolfgangFahl commented 3 years ago
wikiquery -s or -q "[[isA::Event_series]]"
json
wikiquery -s or -q "[[isA::Event series]]"
json

but

wikiquery -s or -q "[[Category:Event series]]" | wc -l
1048
wikiquery -s or -q "[[Category:Event_series]]" | wc -l
1048

so what is the problem?

tholzheim commented 3 years ago

The queries of the two commands:

wikiquery -s or -q "[[isA::Event_series]]"
wikiquery -s or -q "[[isA::Event series]]"

are converted into the same request: https://www.openresearch.org/mediawiki/api.php?query=%5B%5BisA%3A%3AEvent_series%5D%5D%7Coffset%3D0&action=ask&format=json

When the underscore is changed to an escaped blank (%20), the query is executed as expected: https://www.openresearch.org/mediawiki/api.php?query=%5B%5BisA%3A%3AEvent%20series%5D%5D%7Coffset%3D0&action=ask&format=json
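For reference, the two request URLs can be rebuilt with the standard library. This is only a sketch: `ask_url` is a hypothetical helper, not part of wikibot, and it uses `quote_via=quote` so that a blank becomes %20 rather than "+":

```python
from urllib.parse import quote, urlencode

API_URL = "https://www.openresearch.org/mediawiki/api.php"


def ask_url(query: str, offset: int = 0) -> str:
    """Hypothetical helper: build an action=ask URL without touching blanks."""
    params = {
        "query": f"{query}|offset={offset}",
        "action": "ask",
        "format": "json",
    }
    # quote_via=quote percent-encodes a blank as %20 instead of "+".
    return f"{API_URL}?{urlencode(params, quote_via=quote)}"


url_underscore = ask_url("[[isA::Event_series]]")
url_blank = ask_url("[[isA::Event series]]")

print(url_underscore)
print(url_blank)
```

The two URLs differ only in `Event_series` versus `Event%20series`, so pasting them into a browser is enough to compare the results on a given wiki.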

I have set up a test page showing this behavior: https://www.openresearch.org/wiki/Issue:Query_Underscore

On my test wiki, however, this issue does not occur. To me it looks like the problem is in the wiki and not in the code, but we could bypass this odd behavior by altering the escaping as mentioned above.

WolfgangFahl commented 3 years ago

We might want to use the official escaping code; mwclient and pywikibot should both have it somewhere. It's a long-known issue that also hit me in https://github.com/WolfgangFahl/Mediawiki-Japi/issues/50

WolfgangFahl commented 1 year ago

This is an exotic case. Do not use blanks in identifiers and the problem goes away.