crustymonkey / py-bgg

A simple Board Game Geek (boardgamegeek.com) API library in Python. This mainly just handles the API calls and converts the XML to representative dict/list format
GNU General Public License v2.0

Support for batch "thing" requests? #11

Closed commadelimited closed 2 years ago

commadelimited commented 2 years ago

Doing some research on rate limiting and it appears you can issue a single request containing multiple "thing" IDs as seen in this BGG comment: https://boardgamegeek.com/thread/2388502/article/35435652#35435652

I think I might've figured out how to correctly pull lots of data from the Thing Items endpoint!
So instead of doing a bunch of individual requests such as:
/xmlapi2/thing?stats=1&id=188920
/xmlapi2/thing?stats=1&id=174476

Thoughts on supporting this functionality? That might be a great addition.
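For illustration, here is a minimal sketch of what the batched request URL might look like. The quoted comment above is cut off before it shows the batched form, so the comma-separated `id` value is an assumption inferred from the rest of this thread. The `thing_url` helper and `BASE_URL` constant are hypothetical names, not part of libbgg.

```python
from urllib.parse import urlencode

# Assumed base URL: host and path taken from the links quoted in this thread.
BASE_URL = "https://boardgamegeek.com/xmlapi2/thing"


def thing_url(ids, stats=True):
    """Build one 'thing' URL covering several IDs (hypothetical helper)."""
    params = {}
    if stats:
        params["stats"] = 1
    # Join the IDs into a single comma-separated value.
    params["id"] = ",".join(str(i) for i in ids)
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return BASE_URL + "?" + urlencode(params, safe=",")


print(thing_url([188920, 174476]))
# https://boardgamegeek.com/xmlapi2/thing?stats=1&id=188920,174476
```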

Side note, as a long time user of LCosmin's library, thanks for picking up the torch and supporting the BGG API via Python3.

commadelimited commented 2 years ago

Actually it looks like you already support it:

results = client.boardgame('177736,239472', stats=True)

Returns information about A Feast for Odin, and Abomination: The Heir of Frankenstein.

Might be nice to have a sugar method for this, or to include it in the documentation. In the same thread I linked above, people have determined there's an upper limit of around 1200-1300 IDs per request, so doing 500 at a time might be reasonable.

crustymonkey commented 2 years ago

This is a good callout and should be pretty easy to implement. And I agree, updating the docs here would also be helpful for this use case.

commadelimited commented 2 years ago

I'm testing right now and tried ~900 and it choked with a 429 error. But it's taking 100 at a time with no sweat.
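One way to cope with an occasional 429 is to retry with an increasing delay. The sketch below is generic and makes no claim about how libbgg reports the error: `RateLimited` is a hypothetical stand-in exception, and `call_with_backoff` is an illustrative helper, so the `except` clause would need adapting to whatever the client actually raises.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def call_with_backoff(fn, retries=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on RateLimited, wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries):
        try:
            return fn()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the delay can be swapped out in tests. In real use the default `time.sleep` is what you want.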

commadelimited commented 2 years ago

FWIW here's some complete sample code. I used it to output game name and weight for a quick project I'm working on:

#!/usr/bin/env python3

from libbgg.apiv2 import BGG as BGG2

client = BGG2()

game_ids = [
    [
        '271447',
        '173346',
        '68448',
        '177736',
    ],[
        '334187',
        '244909',
        '295295',
        '302809',
    ]
]

for game_set in game_ids:
    results = client.boardgame(','.join(game_set), stats=True)

    for game in results['items']['item']:
        # 'name' may be a list (several name entries) or a single dict
        names = game['name']
        name = names[0]['value'] if isinstance(names, list) else names['value']
        weight = game['statistics']['ratings']['averageweight']['value']
        print(name, '===', weight)

And here's the output:

Wild: Serengeti === 2.5
Wilson & Shep === 1
Wingspan === 2.443
Winner Winner Chicken Dinner === 1.5

crustymonkey commented 2 years ago

So, one thing worth noting is that you don't have to convert your lists of IDs to strings. That is handled automatically by the library such that:

client.boardgame(','.join(str(i) for i in [1, 2, 3, 4]))

produces the same thing as:

client.boardgame([1, 2, 3, 4])

The only thing that's missing in the library for batching is automatic batch limits of 100 per call or similar. That's fairly trivial to implement for this though.

crustymonkey commented 2 years ago

If you want to do batching, something like this should work:

from libbgg.apiv2 import BGG as BGG2

BATCH_SIZE = 100  # 100 IDs per call worked in the testing above; ~900 hit a 429

client = BGG2()
ids = [1, 2, 3, 4, ...]

results = []
for i in range(0, len(ids), BATCH_SIZE):
    end = i + BATCH_SIZE
    chunk = client.boardgame(ids[i:end])
    results.extend(chunk['items']['item'])

And if you want to batch things concurrently, you could use concurrent.futures to run all your batches at the same time.
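A minimal sketch of that idea, using a thread pool from `concurrent.futures`. The `chunked` and `fetch_all` helpers are hypothetical names, and `fetch_batch` stands for any callable that takes a list of IDs and returns a list of items. Given the 429 responses mentioned earlier in this thread, a small `max_workers` is probably safer than launching every batch at once.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(ids, size):
    """Split ids into consecutive lists of at most `size` elements."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def fetch_all(fetch_batch, ids, batch_size=100, max_workers=4):
    """Run fetch_batch on each chunk in a thread pool and flatten the results.

    fetch_batch is a hypothetical callable: list of IDs in, list of items out.
    """
    batches = chunked(ids, batch_size)
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `batches`
        for items in pool.map(fetch_batch, batches):
            results.extend(items)
    return results
```

With libbgg, `fetch_batch` could be something like `lambda batch: client.boardgame(batch)['items']['item']`, assuming that value comes back as a list for multi-ID requests.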

crustymonkey commented 2 years ago

I did look at integrating sized batching into the library automatically, but it's not as simple as I thought it was going to be. I think it's actually easier to just do it outside the library (example above) than to build it in.

commadelimited commented 2 years ago

Sounds good. Given the rate limitations, I'd suggest just adding this to the docs as an example. I'd wager most people are going to try and do it one at a time, without knowing there's a better way.