Open Scarvy opened 1 week ago
The issue was that the API wrapper (pyreadwise) was not requesting the next page when paginating. According to the API documentation, the /export/ endpoint uses the parameter pageCursor, while other endpoints like /highlights/ use page:
pageCursor – (Optional) A string returned by a previous request to this endpoint. Use it to get the next page of books/highlights if there are too many for one request.
page – specify the pagination counter.
I made a quick fix like this that seems to work. I need to ensure it does not break the other pagination endpoint requests.
# Module-level imports this method relies on
import logging
from time import sleep
from typing import Generator, Literal

from requests.exceptions import ChunkedEncodingError


def _get_pagination(
    self,
    get_method: Literal['get', 'get_with_limit_20'],
    endpoint: str,
    params: dict = {},
    page_size: int = 1000,
) -> Generator[dict, None, None]:
    '''
    Get a response from the Readwise API with pagination.

    Args:
        get_method: Method to use for making requests
        endpoint: API endpoint
        params: Query parameters
        page_size: Number of items per page

    Yields:
        dict: Response data
    '''
    # Copy so neither the shared default dict nor the caller's dict is mutated
    # (otherwise a stale pageCursor would leak into later calls).
    params = dict(params)
    if endpoint == "/export/":
        # /export/ is cursor-based: pass back the cursor from the previous page.
        pageCursor = None
        while True:
            if pageCursor:
                params.update({"pageCursor": pageCursor})
            logging.debug(f'Getting page with cursor "{pageCursor}"')
            try:
                response = getattr(self, get_method)(endpoint, params=params)
            except ChunkedEncodingError:
                # Retry the same cursor after a short pause.
                logging.error(f'Error getting page with cursor "{pageCursor}"')
                sleep(5)
                continue
            data = response.json()
            yield data
            # Stop when there is no next cursor or the API repeats the same one.
            if (
                isinstance(data, list)
                or not data.get("nextPageCursor")
                or data.get("nextPageCursor") == pageCursor
            ):
                break
            pageCursor = data.get("nextPageCursor")
    else:
        # All other endpoints use a simple page counter.
        page = 1
        while True:
            response = getattr(self, get_method)(
                endpoint, params={"page": page, "page_size": page_size, **params}
            )
            data = response.json()
            yield data
            if isinstance(data, list) or not data.get("next"):
                break
            page += 1
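To check that the cursor branch does not disturb the page-counter endpoints, here is a rough sketch that runs both paths against a stub client. FakeClient, FakeResponse, and paginate are hypothetical names for illustration only, not part of pyreadwise. paginate is a condensed copy of the logic above with the ChunkedEncodingError retry left out, and the "results" key and canned cursors are made-up test data.

```python
class FakeResponse:
    """Stand-in for a requests response; only .json() is needed here."""

    def __init__(self, payload: dict):
        self._payload = payload

    def json(self) -> dict:
        return self._payload


class FakeClient:
    """Hypothetical stub that records the params of every request."""

    def __init__(self):
        self.calls = []

    def get(self, endpoint: str, params: dict) -> FakeResponse:
        # Store a copy, because the caller may reuse and mutate its dict.
        self.calls.append((endpoint, dict(params)))
        if endpoint == "/export/":
            cursor = params.get("pageCursor")
            next_cursor = {None: "c1", "c1": "c2", "c2": None}[cursor]
            return FakeResponse({"results": [cursor], "nextPageCursor": next_cursor})
        page = params["page"]
        return FakeResponse({"results": [page], "next": "more" if page < 3 else None})


def paginate(client, endpoint, params=None, page_size=1000):
    """Condensed sketch: cursor for /export/, page counter for everything else."""
    params = dict(params or {})
    if endpoint == "/export/":
        cursor = None
        while True:
            if cursor:
                params["pageCursor"] = cursor
            data = client.get(endpoint, params=params).json()
            yield data
            next_cursor = data.get("nextPageCursor")
            if not next_cursor or next_cursor == cursor:
                break
            cursor = next_cursor
    else:
        page = 1
        while True:
            data = client.get(
                endpoint, params={"page": page, "page_size": page_size, **params}
            ).json()
            yield data
            if not data.get("next"):
                break
            page += 1


client = FakeClient()
export_pages = list(paginate(client, "/export/"))
highlight_pages = list(paginate(client, "/highlights/"))

# What each branch actually sent to the (fake) API.
export_cursors = [p.get("pageCursor") for e, p in client.calls if e == "/export/"]
highlight_page_numbers = [p["page"] for e, p in client.calls if e == "/highlights/"]
print(export_cursors, highlight_page_numbers)
```

If the /export/ requests ever carry a page parameter, or the /highlights/ requests carry a pageCursor, the two branches have leaked into each other.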
I am deciding whether to create a pull request in the original API wrapper repo or write my own. I'm leaning toward making a pull request.
I'm suspicious that the export_highlight function is not giving me all my highlights. I ran it recently and only received 82 total notes in Apple Notes. I have way more than that...
I think it has something to do with the generator or the API wrapper I'm using.
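A truncated count like this is what you would see if only the first page ever reaches the caller. Here is a small illustration with a made-up generator. fake_export_pages and its data are hypothetical and only stand in for a paginated API call.

```python
from typing import Iterator


def fake_export_pages() -> Iterator[dict]:
    """Hypothetical stand-in for a paginated generator: one dict per page."""
    yield {"results": ["h1", "h2"]}
    yield {"results": ["h3"]}
    yield {"results": ["h4", "h5"]}


# Reading a single page silently drops everything after it.
first_page_only = next(fake_export_pages())["results"]

# Iterating the generator to exhaustion collects every page.
all_highlights = [h for page in fake_export_pages() for h in page["results"]]

print(len(first_page_only), len(all_highlights))
```

Comparing the number of items collected against the total the API reports, if it exposes one, is a quick way to tell whether pages are being dropped.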