mwclient / mwclient

Python client library to interface with the MediaWiki API
https://pypi.org/project/mwclient/
MIT License

Can't get random() or recentchanges() to work #187

Closed nnt0 closed 2 months ago

nnt0 commented 6 years ago

Hi,

I'm trying to use random() and recentchanges(), but every time I print the output of one of them I get this:

>>> random = site.random(namespace=0, limit=10)
>>> random
<List object 'random' for <Site object 'test.wikipedia.org/w/'>>

and the same for recentchanges()

>>> recent = site.recentchanges()
>>> recent
<List object 'recentchanges' for <Site object 'test.wikipedia.org/w/'>>

I can't figure out what I'm doing wrong. I'm using Python 3. Thanks.

danmichaelo commented 6 years ago

Hi,

These are iterators, so you must pull items from them. A quick way to convert an iterator to a list is to use list():

random = site.random(namespace=0, limit=10)
list(random)

or you can use

next(random)

to just get the first item.

In general, you will want to process an iterator using a for loop or similar:

for page in site.random(namespace=0, limit=10):
    # do something with page
    print(page)
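To make the three options above concrete without hitting a wiki, here is a minimal sketch that uses a plain generator as a stand-in for the object returned by site.random(). The name fake_pages and the titles are made up for illustration. It also shows that an iterator, once drained, yields nothing more.

```python
def fake_pages():
    """Hypothetical stand-in for site.random(); yields three page titles."""
    yield from ["Alpha", "Beta", "Gamma"]


pages = fake_pages()

first = next(pages)     # pulls exactly one item
rest = list(pages)      # drains whatever is left
leftover = list(pages)  # the iterator is exhausted by now

print(first, rest, leftover)
```

If you need the items more than once, store the result of list() in a variable rather than iterating over the same object again.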

nnt0 commented 6 years ago

Thank you for your answer. I tried it and it worked, but I get way more than my limit.

Using

for page in site.random(namespace=0, limit=10):
    print(page)

gives me way too many pages.

danmichaelo commented 6 years ago

Ouch, didn't know that. I'm not the original author of this library, I just maintain it. Guess the limit is just the number of items it retrieves in each batch then. That should at least be documented.

In that case you can either do something like this to get the first 10 elements:

pages = site.random(namespace=0)
pages = [next(pages) for x in range(10)]

or use itertools:

import itertools
pages = site.random(namespace=0)
pages = itertools.islice(pages, 10)
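One difference between the two approaches is worth spelling out. itertools.islice returns a lazy iterator, so you still need list() or a loop to get the items, and it stops quietly if fewer items are available, whereas calling next() directly raises StopIteration. The sketch below uses a made-up generator, fake_pages, in place of site.random() so that it runs offline.

```python
import itertools


def fake_pages(total):
    """Hypothetical stand-in for site.random(); yields `total` numbered titles."""
    for i in range(total):
        yield f"Page {i}"


# islice is lazy: wrap it in list() to materialize the items.
first_ten = list(itertools.islice(fake_pages(1000), 10))

# If fewer items are available, islice simply stops early ...
short = list(itertools.islice(fake_pages(3), 10))

# ... whereas pulling with next() raises StopIteration once the source runs dry.
it = fake_pages(3)
try:
    [next(it) for _ in range(10)]
    raised = False
except StopIteration:
    raised = True

print(len(first_ten), short, raised)
```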

marcfrederick commented 2 months ago

This issue was more recently discussed in #259. The limit parameter is confusing and controls the size of chunks fetched from the API, not the number of results.

To fix this, we’ve replaced limit with two new parameters, api_chunk_size and max_items, in #343. This change is included in v0.11.0. To get 10 random pages, use:

import mwclient

site = mwclient.Site('en.wikipedia.org')
for page in site.random(namespace=0, max_items=10):
    print(page)
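
For anyone wondering how the two parameters relate, here is a rough offline model of a paged listing. It is only a sketch and not mwclient's actual implementation: fake_list and its simulated "requests" are hypothetical, and only the parameter names api_chunk_size and max_items come from the change described above.

```python
import itertools


def fake_list(api_chunk_size, max_items=None):
    """Hypothetical model of a paged API listing; not mwclient's real code."""
    yielded = 0
    next_id = 0
    while True:
        # One simulated API request returns `api_chunk_size` items.
        chunk = [f"Page {next_id + i}" for i in range(api_chunk_size)]
        next_id += api_chunk_size
        for item in chunk:
            # max_items caps the total across all chunks.
            if max_items is not None and yielded >= max_items:
                return
            yielded += 1
            yield item


# With a cap, iteration ends after 25 items even though chunks hold 10 each.
capped = list(fake_list(api_chunk_size=10, max_items=25))

# Without a cap, the chunk size alone never stops the iteration,
# so this model has to be cut off externally.
uncapped_sample = list(itertools.islice(fake_list(api_chunk_size=10), 40))

print(len(capped), len(uncapped_sample))
```

In this model the chunk size only affects how many items each request brings back, which matches the behaviour of the old limit parameter reported earlier in this thread.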