dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0
2.55k stars 170 forks source link

Empty String Cursor Value from Zoom API Leading to Pagination Issue in `JSONResponseCursorPaginator` #2012

Open kang8 opened 1 day ago

kang8 commented 1 day ago

dlt version

dlt 1.3.0

Describe the problem

I encountered an issue when using the JSONResponseCursorPaginator class in rest_client for pagination requests while interacting with Zoom's API. The update_state() method is implemented as follows:

https://github.com/dlt-hub/dlt/blob/75ae23f174e5357a01612e48fe8a475c2db9db35/dlt/sources/helpers/rest_client/paginators.py#L651-L654

The problem occurs when values is equal to [""], resulting in self._next_reference being set to "" instead of None. This leads to pagination requests continuing to the next page, whereas I expect it to be None to stop further requests.

Expected behavior

The expected behavior is for self._next_reference to be set to None when values is [""], thus preventing further pagination requests. This matches the anticipated termination condition logic.

Steps to reproduce

  1. Use the JSONResponseCursorPaginator to implement pagination requests.
  2. When the value extracted from the JSON response using the specified path is [""], observe the pagination behavior.
  3. Note whether self._next_reference is set to "" instead of None.

Operating system

Linux

Runtime environment

Kubernetes

Python version

3.11

dlt data source

ZOOM api, e.g. GET /users

dlt destination

Snowflake

Other deployment details

No response

Additional information

I plan to submit a PR to address this issue.

burnash commented 1 day ago

Thanks for reporting this @kang8. Let me know if you'd be interested in submitting a PR with a fix.

kang8 commented 1 day ago

Yes, I'll submit a PR around tomorrow night(my timezone is UTC +08:00)

burnash commented 1 day ago

Perfect, thank you!