siznax / wptools

Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
MIT License

Check for existence of an article #128

Closed antifa-ev closed 6 years ago

antifa-ev commented 6 years ago

How can I check for existence of a page before parsing it?

siznax commented 6 years ago

Thanks for trying wptools @antifa-ev!

LookupError is raised when a page cannot be found.

>>> import wptools
>>> page = wptools.page('asdf;lkj')
>>> page.get()
en.wikipedia.org (query) asdf;lkj
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "wptools/page.py", line 629, in get_query
    self._get('query', show, proxy, timeout)
  File "wptools/core.py", line 183, in _get
    self._set_data(action)
  File "wptools/page.py", line 190, in _set_data
    self._set_query_data(action)
  File "wptools/page.py", line 285, in _set_query_data
    data = self._load_response(action)
  File "wptools/core.py", line 216, in _load_response
    raise LookupError(_query)
LookupError: https://en.wikipedia.org/w/api.php?action=query&exintro&formatversion=2&inprop=url|watchers&list=random&pithumbsize=240&pllimit=500&ppprop=disambiguation|wikibase_item&prop=extracts|info|links|pageassessments|pageimages|pageprops|pageterms|redirects&redirects&rdlimit=500&rnlimit=1&rnnamespace=0&titles=asdf%3Blkj

So, to answer your question, you could check like this:

import wptools

page = wptools.page('asdf;lkj')

try:
    page.get_parse()
except LookupError:
    print('Not found.')

kreiche commented 3 years ago

Catching that exception works great, except that an API error is still printed. Is there a way to catch or suppress that message as well? I attached a screenshot of your code from above. (My import line is in a different cell.)

(Attached screenshot: Screen Shot 2020-12-07 at 2 57 29 PM)
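
One possible approach, sketched here under assumptions rather than taken from the wptools documentation: if the "API error" line is written through Python's sys.stderr, it can be captured with contextlib.redirect_stderr from the standard library while LookupError is caught as before. The helper name fetch_quietly is hypothetical and not part of wptools. It accepts any zero-argument callable, so a bound method such as page.get_parse can be passed in.

```python
import contextlib
import io


def fetch_quietly(fetch):
    """Call fetch() and return (found, stderr_text).

    found is False when fetch() raises LookupError, True otherwise.
    Anything written to sys.stderr during the call is captured
    and returned instead of being displayed.
    """
    captured = io.StringIO()
    with contextlib.redirect_stderr(captured):
        try:
            fetch()
            found = True
        except LookupError:
            found = False
    return found, captured.getvalue()


# Hypothetical usage with wptools (needs network access; assumes the
# message is emitted via sys.stderr):
#
#     import wptools
#     page = wptools.page('asdf;lkj')
#     found, messages = fetch_quietly(page.get_parse)
#     if not found:
#         print('Not found.')
```

Returning the captured text rather than discarding it keeps the message available for logging. Output written directly to the underlying file descriptor, bypassing sys.stderr, would not be captured this way.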