Pagination breaks when returned @next value contains a '+'

ElsevierDev / elsapy

A Python module for use with Elsevier's APIs: Scopus, ScienceDirect, others.

BSD 3-Clause "New" or "Revised" License

Debugging credit: Dave Santucci from Scopus.

This problem is caused by URL encoding. If you use the cursor parameter to navigate pages, you need to URL-encode the cursor string before passing it to the next query. Otherwise a literal '+' in the cursor is decoded as a space on the receiving end.

from urllib.parse import quote

This assumes the returned cursor value is stored in a variable named cursor and the page response is stored in a variable named page.

# safe="" also percent-encodes '/', which quote() leaves unescaped by default
next_cursor = quote(cursor, safe="")

This should solve the problem.
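To illustrate why the '+' matters, here is a minimal sketch that round-trips a cursor through a query string with and without encoding. The cursor value is made up for the example, and parse_qs only stands in for whatever decoding the server does, which is an assumption on my part. The safe="" argument is my choice so that '/' is encoded too.

```python
from urllib.parse import quote, parse_qs

# Hypothetical cursor value containing a '+'; real cursors come from the API response.
cursor = "AoJ+abc/123="

# Unencoded: standard query-string decoding turns '+' into a space.
raw_query = "cursor=" + cursor
decoded_raw = parse_qs(raw_query)["cursor"][0]

# Encoded: '+' becomes %2B and survives the round trip.
encoded_query = "cursor=" + quote(cursor, safe="")
decoded_encoded = parse_qs(encoded_query)["cursor"][0]

print(decoded_raw == cursor)      # False
print(decoded_encoded == cursor)  # True
```

If your HTTP client already encodes query parameters for you, encoding the cursor by hand as well would encode it twice, so check which layer does the encoding.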

Another option is to extract the next-page URI from the response and pass it directly in the next call. page['search-results']['link'] is a list of link entries, and the next-page URI is the '@href' value of the entry whose '@ref' is "next".

The code for this would be something like:

next_URI = [item['@href'] for item in page['search-results']['link'] if item['@ref'] == "next"]

This yields a list, so the URI itself is next_URI[0] when a next page exists, and the list is empty on the last page.

FYI - I have not yet tested the code above.
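For anyone who wants a version that handles the last page too, here is a small sketch of the link-extraction approach. The get_next_uri helper and the sample_page data are hypothetical and only mirror the response shape described above. They are not part of elsapy.

```python
from typing import Optional


def get_next_uri(page: dict) -> Optional[str]:
    """Return the '@href' of the link whose '@ref' is 'next', or None."""
    links = page.get("search-results", {}).get("link", [])
    return next(
        (item.get("@href") for item in links if item.get("@ref") == "next"),
        None,
    )


# Hypothetical response fragment; the URLs are placeholders, not real endpoints.
sample_page = {
    "search-results": {
        "link": [
            {"@ref": "self", "@href": "https://example.invalid/search?cursor=abc"},
            {"@ref": "next", "@href": "https://example.invalid/search?cursor=a%2Bb"},
        ]
    }
}

print(get_next_uri(sample_page))
print(get_next_uri({"search-results": {"link": []}}))
```

Returning None when there is no "next" entry gives the calling loop a simple stop condition instead of an IndexError.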

Pagination breaks when returned @next value contains a '+' #47