Asana / python-asana

Official Python client library for the Asana API v1
MIT License

sync token is only updated on the first EventIterator#next call #200

Open trunneml opened 3 months ago

trunneml commented 3 months ago

When calling the EventIterator the sync token is only updated the first time.

It looks like the else block on line 28 should be part of the if block on line 31.

https://github.com/Asana/python-asana/blob/bab9fe81d808ced6788ae5464ae075db8105f184/asana/pagination/event_iterator.py#L26

jv-asana commented 3 months ago

Hi @trunneml,

Thank you for the suggestion. I do think there are some issues with us trying to implement auto pagination with our events endpoint though. We might have to reconsider our implementation or remove it for the events endpoint.

There are still issues even with your suggested fix. For example, say there are 101 events. The iterator would call the "Get events on a resource" (GET /events) endpoint twice with the same sync token, and only afterwards with the new sync token. Ideally we don't want to make the same call twice. Here's a scenario that illustrates this:

Let's say this is the suggested change:

def __next__(self):
    if not self.has_more:
        raise StopIteration

    result = {}

    try:
        result = self.call_api()
    except ApiException as e:
        if (e.status == 412):
            errors = json.loads(e.body.decode("utf-8"))
            self.sync = errors["sync"]
        else:
            raise e

    if (self.sync):
        self.api_request_data["query_params"]["sync"] = self.sync

    if not result:
        try:
            result = self.call_api()
        except ApiException as e:
            raise e
    else:
        self.sync = result.get('sync', None)

    self.has_more = result.get('has_more', False)
    return result["data"]

Since "Asana limits a single sync token to 100 events", if we have 101 events, has_more in the response will be true. The first request runs into a 412 error and the iterator stores the sync token from that error. It then makes an API call with that sync token and returns 100 events:

    if not result:
        try:
            result = self.call_api()
        except ApiException as e:
            raise e
    else:
        self.sync = result.get('sync', None)

The second time it runs, it skips the StopIteration block since has_more is true. It then makes the API call again, but with the same sync token as the first call:

    try:
        result = self.call_api()
    except ApiException as e:
        if (e.status == 412):
            errors = json.loads(e.body.decode("utf-8"))
            self.sync = errors["sync"]
        else:
            raise e

Then at some point it'll hit your change and store the new sync token for the next API call:

    if not result:
        try:
            result = self.call_api()
        except ApiException as e:
            raise e
    else:
        self.sync = result.get('sync', None)

After that it'll return the same events as the first call, since it reaches the following lines:

    self.has_more = result.get('has_more', False)
    return result["data"]

Since has_more is still true, it'll call the API a third time, this time with the new sync token, returning the 1 remaining event.

The issue in this scenario is that GET /events is called twice with the same token, so the caller receives the same events twice. This is why I think we should not implement auto pagination for this endpoint and should instead let users control how they make get events calls.
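The scenario above can be reproduced offline with a toy model. This is only a sketch: `FakeEventsApi` and `FakeApiException` are hypothetical stand-ins (not part of python-asana), the token names are made up, and `SuggestedEventIterator` just mirrors the control flow of the `__next__` quoted earlier. In this model the stale sync token is sent on consecutive requests, and the first batch of 100 events is returned more than once before the last event arrives.

```python
import json


class FakeApiException(Exception):
    """Hypothetical stand-in for asana.rest.ApiException (sketch only)."""

    def __init__(self, status, body):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.body = body


class FakeEventsApi:
    """Hypothetical in-memory GET /events: 101 events, at most 100 per token."""

    def __init__(self):
        self.calls = []  # sync token sent with each request, in order
        self.pages = {
            "token-1": {"data": list(range(100)), "sync": "token-2", "has_more": True},
            "token-2": {"data": [100], "sync": "token-3", "has_more": False},
        }

    def get_events(self, sync=None):
        self.calls.append(sync)
        if sync is None:
            # No token yet: answer 412 and hand out a fresh token in the body.
            raise FakeApiException(412, json.dumps({"sync": "token-1"}).encode("utf-8"))
        return self.pages[sync]


class SuggestedEventIterator:
    """Mirrors the control flow of the suggested __next__ quoted above."""

    def __init__(self, api):
        self.api = api
        self.sync = None
        self.query_params = {}
        self.has_more = True

    def call_api(self):
        return self.api.get_events(self.query_params.get("sync"))

    def __iter__(self):
        return self

    def __next__(self):
        if not self.has_more:
            raise StopIteration
        result = {}
        try:
            result = self.call_api()
        except FakeApiException as e:
            if e.status == 412:
                self.sync = json.loads(e.body.decode("utf-8"))["sync"]
            else:
                raise
        if self.sync:
            self.query_params["sync"] = self.sync
        if not result:
            result = self.call_api()
        else:
            self.sync = result.get("sync", None)
        self.has_more = result.get("has_more", False)
        return result["data"]


api = FakeEventsApi()
batches = list(SuggestedEventIterator(api))
print("tokens sent:", api.calls)
print("batch sizes:", [len(b) for b in batches])
```

The printed token list shows where the request parameters lag behind `self.sync`: the new token is stored only after the request that should have used it has already been sent.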

jv-asana commented 3 months ago

Another reason it might be better for us not to auto paginate events is that users might have a preference on how they consume events.

OPTION 1: Event stream -> keep calling GET /events and don't stop

OPTION 2: Get events one time:


As a workaround for your issue, I recommend using our "Disable pagination behavior for a single request" functionality when calling this particular endpoint. This lets you control how you fetch events.

Here's some sample code on how to do this:

import json
import time
import asana
from asana.rest import ApiException
from pprint import pprint

configuration = asana.Configuration()
configuration.access_token = '<YOUR_ACCESS_TOKEN>'
api_client = asana.ApiClient(configuration)

# create an instance of the API class
events_api_instance = asana.EventsApi(api_client)
resource = "<TASK_GID>" # EX: task_gid
sync = None
opts = {}

# First API call to get_events: this is expected to fail with a 412 and return a sync token
try:
    events = events_api_instance.get_events(resource, opts, full_payload=True)
    pprint(events)
except ApiException as e:
    if e.status == 412:
        print("Saving sync token")
        errors = json.loads(e.body.decode("utf-8"))
        sync = errors["sync"]
    else:
        print("Exception when calling EventsApi->get_events: %s\n" % e)

# Set a 10 second delay for you to trigger an event in Asana (EX: add comment to task)
time.sleep(10)

# Make a follow-up API call to get_events with the sync token
try:
    opts = {
        'sync': sync
    }
    events = events_api_instance.get_events(resource, opts, full_payload=True)
    pprint(events)
except ApiException as e:
    print("Exception when calling EventsApi->get_events: %s\n" % e)

Sample terminal output:

Saving sync token
{'data': [{'action': 'added',
           'created_at': '2024-07-29T20:46:09.314Z',
           'parent': {'gid': '123',
                      'name': 'Task 1',
                      'resource_subtype': 'default_task',
                      'resource_type': 'task'},
           'resource': {'created_at': '2024-07-29T20:46:09.097Z',
                        'created_by': {'gid': '456',
                                       'name': 'user@example.com',
                                       'resource_type': 'user'},
                        'gid': '789',
                        'resource_type': 'story',
                        'text': 'hello',
                        'type': 'comment'},
           'type': 'story',
           'user': {'gid': '456',
                    'name': 'user@example.com',
                    'resource_type': 'user'}}],
 'has_more': False,
 'sync': '9ldeb44b7c90d6sb6da91m8b7e6fad0c:0'}
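
For OPTION 1 (event stream), one way to keep control of the sync token is to advance it after every successful response and keep requesting while has_more is true. The following is a minimal sketch, not the library's implementation: `poll_once`, `SyncTokenExpired` and `fake_fetch` are hypothetical names, and `fetch` is assumed to be any callable that takes a sync token and either returns the full payload or raises an error carrying the 412 body. It also assumes that a response with has_more set to true carries the token to use for the next request.

```python
import json


class SyncTokenExpired(Exception):
    """Hypothetical stand-in for an ApiException with status 412."""

    def __init__(self, body):
        super().__init__("412 Precondition Failed")
        self.status = 412
        self.body = body


def poll_once(fetch, sync):
    """Collect all currently available events; return (events, next_sync)."""
    events = []
    while True:
        try:
            page = fetch(sync)
        except SyncTokenExpired as e:
            # Missing or expired token: keep the fresh token for the next poll.
            return events, json.loads(e.body.decode("utf-8"))["sync"]
        events.extend(page["data"])
        sync = page["sync"]  # advance the token after every successful response
        if not page.get("has_more", False):
            return events, sync


def fake_fetch(sync):
    """Hypothetical in-memory endpoint: 101 events, 100 per sync token."""
    pages = {
        "t1": {"data": list(range(100)), "sync": "t2", "has_more": True},
        "t2": {"data": [100], "sync": "t3", "has_more": False},
    }
    if sync not in pages:
        raise SyncTokenExpired(json.dumps({"sync": "t1"}).encode("utf-8"))
    return pages[sync]


first_events, sync = poll_once(fake_fetch, None)   # initial 412 handshake
second_events, sync = poll_once(fake_fetch, sync)  # drains both pages
print(len(first_events), len(second_events), sync)
```

With the real client, `fetch` could wrap the `events_api_instance.get_events(resource, opts, full_payload=True)` call from the sample above and catch `ApiException` with status 412 in place of `SyncTokenExpired`. An event stream would then call `poll_once` in a loop with a delay between polls, while the one-time case calls it once after the initial handshake.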