miri64 / pycomicvine

A wrapper for comicvine.com
MIT License
20 stars 10 forks source link

Paging update to follow new API behaviour and work around an API bug #1

Closed xchewtoyx closed 11 years ago

xchewtoyx commented 11 years ago

The comicvine API now ignores the offset in a request, and instead uses the page parameter for large result sets. The current pycomicvine behaviour results in large numbers of duplicate results, as well as missing results.

This proposed code does two things - it adds the page parameter to the request, in order to get the additional results as expected, it also includes code to work around a bug in the API. Sometimes you will ask for 100 results, but only get 99. I tried a couple of approaches to this, but in the end just settled on returning None for missing entries. I played with a couple of other ideas for handling this, but this seemed easiest.

As examples of this in practice, when I use the code from HEAD to get a list of volumes containing "Batman" I get 742 results. When I de-duplicate these I am only left with 100. With the new code I get 742 unique results:

rgh@eduardo ~/git/pycomicvine $ python volumelist.py batman | sort -u | wc -l 100

rgh@eduardo ~/git/mycomicvine $ python volumelist.py batman | sort -u | wc -l 742

With a similar query for "Spider-man" (which shows the buggy behaviour described above) there is a similar discrepancy with the results. The search reports 632 issues, yet it can be seen that there are 100 unique results from the code at HEAD, and 630 from the revised code.

rgh@eduardo ~/git/pycomicvine $ python volumelist.py spider-man | sort -u | wc -l 100

rgh@eduardo ~/git/mycomicvine $ python volumelist.py spider-man | sort -u | wc -l 630

This patch works well for my calibre plugin, hopefully this will be useful for the base library too...

miri64 commented 11 years ago

Sorry I took so long I somehow missed this PR, but as you can see: tests are failing at pycomicvine/__init__.py, line 392 (https://travis-ci.org/authmillenon/pycomicvine/jobs/8128468). Seems like the line l commented on is to blame since offset always is 0.

miri64 commented 11 years ago

But on /search it seems to work…

miri64 commented 11 years ago

Cherry-picked it over on my then current branch and made it default behaviour just for Search.

xchewtoyx commented 11 years ago

Yeah, this API is more broken than I thought. Multi-page search requests use &page= , other multi-page results use &offset=, and the main API doc doesn't mention page at all... Not to mention the random missing results...

miri64 commented 11 years ago

Full ACK ^^