influxdata / influxdb-client-python

InfluxDB 2.0 python client
https://influxdb-client.readthedocs.io/en/stable/
MIT License

Restore paging on find Tasks api #614

Closed snape81 closed 9 months ago

snape81 commented 10 months ago

Proposal: Why is the links attribute of the Tasks API response not returned? Could you restore the ability to use the next link, or introduce a method such as

    def find_tasks_paged(self, **kwargs):
        """List all tasks.

        :key str name: only returns tasks with the specified name
        :key str after: returns tasks after specified ID
        :key str user: filter tasks to a specific user ID
        :key str org: filter tasks to a specific organization name
        :key str org_id: filter tasks to a specific organization ID
        :key int limit: the number of tasks to return
        :return: Tasks
        """
        return self._service.get_tasks(**kwargs)

that also returns the links section of the response?

Current behavior: The link containing the next task ID is not returned, because find_tasks returns only the tasks list:

    def find_tasks(self, **kwargs):
        """List all tasks.

        :key str name: only returns tasks with the specified name
        :key str after: returns tasks after specified ID
        :key str user: filter tasks to a specific user ID
        :key str org: filter tasks to a specific organization name
        :key str org_id: filter tasks to a specific organization ID
        :key int limit: the number of tasks to return
        :return: Tasks
        """
        return self._service.get_tasks(**kwargs).tasks

Desired behavior: Include the next link in the response so that pagination can be implemented, either to retrieve all tasks (when there are more than 500) or to query in smaller chunks.

Use case: Without the links section, pagination is not possible through the client API.

powersj commented 10 months ago

Hi,

The after parameter can be used to grab tasks after the 500th item. For example, if your last task ID is 500, pass 500 as the after value, and you should get the next 500 tasks. Here is the API reference: https://docs.influxdata.com/influxdb/v2/api/#operation/GetTasks

Can you try that and let us know if that works?

Thanks!

snape81 commented 10 months ago

Hi, yes, I confirm that with after you can grab the next N tasks (500 max, but I can also specify a lower limit to test pagination, for example pages of 3 tasks out of 10 configured).

The problem is in the Python (and Java) client library. The value you have to pass as the 'after' parameter (the ID of the last task retrieved) is in the links section of self._service.get_tasks(**kwargs), which is cut off because only the content of the retrieved tasks list is returned, as you can see in the body of the current find_tasks implementation. This is the link I need:

"next": "/api/v2/tasks?after=0bf5171e0431a000&limit=3"

    "links": {
        "self": "/api/v2/tasks?after=0bd9f6fc3ff1a000&limit=3",
        "next": "/api/v2/tasks?after=0bf5171e0431a000&limit=3"
    },
    "tasks": [
        {
            "links": {
                "labels": "/api/v2/tasks/0bd9f7f4c0f1a000/labels",
                "logs": "/api/v2/tasks/0bd9f7f4c0f1a000/logs",
                "members": "/api/v2/tasks/0bd9f7f4c0f1a000/members",
                "owners": "/api/v2/tasks/0bd9f7f4c0f1a000/owners",
                "runs": "/api/v2/tasks/0bd9f7f4c0f1a000/runs",
                "self": "/api/v2/tasks/0bd9f7f4c0f1a000"
            },
            "labels": [],
            "id": "0bd9f7f4c0f1a000",
            "orgID": "2412f2e686c82331",

So 'after' works well in API calls, but with this implementation I have no way to get the right 'after' value to grab the next pages.
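
A small sketch of how the after value could be pulled out of a next link like the one shown above, using only the standard library. The helper name after_from_next_link is hypothetical and not part of the client library.

```python
from urllib.parse import parse_qs, urlparse


def after_from_next_link(next_link):
    """Return the `after` value embedded in a `links.next` URL, or None."""
    if not next_link:
        return None
    values = parse_qs(urlparse(next_link).query).get("after")
    return values[0] if values else None


token = after_from_next_link("/api/v2/tasks?after=0bf5171e0431a000&limit=3")
```

Parsing the query string this way avoids depending on the order of the parameters in the link.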

After some reverse engineering of the Python library, I had to implement my own ugly "find all tasks" function that retrieves all tasks in smaller chunks and uses the service layer directly instead of the API client layer.

Here is my code, which I hope makes this clearer (the same problem also exists in the Java implementation):

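from urllib.parse import parse_qs, urlparse

from influxdb_client import InfluxDBClient, TasksService

# INFLUXDB_URL, INFLUXDB_TOKEN, INFLUXDB_ORG and TASK_MAX_PAGE are defined elsewhere.


def _get_all_tasks():
    """Collect every task by following the `next` link of each page."""
    tasks = []
    with InfluxDBClient(url=INFLUXDB_URL, token=INFLUXDB_TOKEN, org=INFLUXDB_ORG, debug=True) as client:
        # Use the service layer directly: find_tasks() drops the links section.
        task_service = TasksService(client.api_client)
        next_token = None
        while True:
            kwargs = {"org": INFLUXDB_ORG, "limit": TASK_MAX_PAGE}
            if next_token:
                kwargs["after"] = next_token
            task_response = task_service.get_tasks(**kwargs)
            # Check the response before touching its attributes.
            if not task_response or not task_response.tasks:
                break
            tasks.extend(task_response.tasks)
            if task_response.links is None or task_response.links.next is None:
                break
            # The `after` value is only exposed inside the `next` link's query string.
            after_values = parse_qs(urlparse(str(task_response.links.next)).query).get("after")
            if not after_values:
                break  # no usable token, stop instead of re-requesting the same page
            next_token = after_values[0]
    return tasks

powersj commented 10 months ago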

@snape81,

Thanks for the response. We can take a look at adding an iterator function to make retrieving the paged content a bit easier.