goldsmith / Wikipedia

A Pythonic wrapper for the Wikipedia API
https://wikipedia.readthedocs.org/
MIT License
2.89k stars, 519 forks

"Creatity page not found" when retrieving page for "Creativity" #304

Open hobson opened 2 years ago

hobson commented 2 years ago

python 3.7.9

>>> import wikipedia
>>> wikipedia.__version__
'1.4.0'

Some pages can't be found, even when you pass the exact title that a popular page currently has on Wikipedia, or the title returned by wikipedia.suggest():

>>> wikipedia.summary('Creativity')
PageError: Page id "creatity" does not match any pages. Try another id!
>>> wikipedia.page('Creativity')
PageError: Page id "creatity" does not match any pages. Try another id!
>>> wikipedia.suggest('Creativity')
'creatity'
>>> wikipedia.search('Creativity')
['Creativity',
 'Creativity (religion)',
 'Creativity and mental health',
...
PageError: Page id "creatity" does not match any pages. Try another id!
>>> wikipedia.page('creativity')
PageError: Page id "creatity" does not match any pages. Try another id!

Lowercasing the title doesn't help. Adding the "(religion)" qualifier does return a page, but that's only useful if you actually want the religion article. Installing with conda or pip gives the same behavior.

theptrk commented 3 months ago

For "Creativity" the suggestion is "creatity" (not sure why).

The error seems to stem from the auto_suggest branch of the page lookup, where a non-empty suggestion takes priority over the first search result.

Someone should just make a PR that changes that priority.

It's right here: https://github.com/goldsmith/Wikipedia/blob/1554943e8ab463cef5e93081def48fafbdef324e/wikipedia/wikipedia.py#L272

    if auto_suggest:
      results, suggestion = search(title, results=1, suggestion=True)
      try:
        title = suggestion or results[0]
      except IndexError:
        # if there is no suggestion or search results, the page doesn't exist
        raise PageError(title)
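
To make the "change that priority" idea concrete, here is a rough sketch of what a reordered lookup could look like. This is not the library's code and not a tested patch. `resolve_title`, `SearchFn`, and `fake_search` are hypothetical names made up for illustration, and `fake_search` just hard-codes the values reported in this issue so the snippet runs without network access.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type: a search function returning (results, suggestion),
# mirroring the shape of search(title, results=1, suggestion=True) above.
SearchFn = Callable[[str], Tuple[List[str], Optional[str]]]


def resolve_title(title: str, search_fn: SearchFn) -> str:
    """Pick a page title, preferring real search results over the suggestion."""
    results, suggestion = search_fn(title)

    # 1. An exact (case-insensitive) match among the results wins outright.
    for candidate in results:
        if candidate.lower() == title.lower():
            return candidate

    # 2. Otherwise use the top result, and only then fall back to the suggestion.
    chosen = results[0] if results else suggestion
    if chosen is None:
        # No results and no suggestion: nothing to look up.
        raise LookupError(f"no page found for {title!r}")
    return chosen


def fake_search(title: str) -> Tuple[List[str], Optional[str]]:
    """Hypothetical stand-in returning the values reported in this issue."""
    return ["Creativity", "Creativity (religion)"], "creatity"


print(resolve_title("Creativity", fake_search))
```

With the values from this issue, the sketch picks "Creativity" instead of "creatity". Separately, since the quoted branch only runs when auto_suggest is true, calling wikipedia.page('Creativity', auto_suggest=False) or wikipedia.summary('Creativity', auto_suggest=False) should skip the suggestion step entirely, which may be enough as a workaround until the priority is changed.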