associations.batch_api.read does not support pagination?

fromfarland commented 1 year ago

Hi! My first issue here - after being able to figure out a lot of things by myself, I finally ran into something that I can't seem to get my head around.

Is there a way to get more than 100 results from associations.batch_api.read? After checking the repo, it seems to me that the function does not support pagination; or rather, the API response does not include a paging key like others do.

Code snippet explanation: I wrote a function get_batch_associations that takes a list of IDs for a certain object type and gets all the associations to another object type. (This calls another function chunk_list that takes a list of IDs, then splits them into chunks of 100.)

All ideas are much appreciated!

import logging

import pandas as pd
from hubspot.crm.associations import BatchInputPublicObjectId, PublicObjectId

logger = logging.getLogger(__name__)


def get_batch_associations(api_client,
                           from_type: str,
                           from_ids: list,
                           to_type: str) -> pd.DataFrame:

    logger.info(f"Getting associations of {len(from_ids)} '{from_type}' to '{to_type}' ...")

    # Split the list of input IDs into chunks of size 100 to comply with the API limit,
    # and create an empty dict to store the parsed results
    chunked_ids = chunk_list(input=from_ids, n=100)
    res_dict = {'updated_at': [], 'from_id': [], 'from_type': [], 'to_id': [], 'to_type': [], 'relationship': []}

    # Request and parse data for one chunk at a time
    for current_chunk in chunked_ids:
        # Create a list of PublicObjectId objects for the IDs in the current chunk
        ids_for_api = [PublicObjectId(id=object_id) for object_id in current_chunk]
        # Call the API and convert the result to a dict
        api_response = api_client.crm.associations.batch_api.read(
            from_type,
            to_type,
            batch_input_public_object_id=BatchInputPublicObjectId(inputs=ids_for_api)
        ).to_dict()
        # Parse the dict structure, keeping only the relevant columns, and append to the overall results dict
        for current_object in api_response.get('results'):  # one result per object ID that was passed in
            for current_relation in current_object.get('to'):
                res_dict['updated_at'].append(api_response.get('completed_at'))
                res_dict['from_id'].append(current_object.get('_from').get('id'))
                res_dict['from_type'].append(from_type)
                res_dict['to_id'].append(current_relation.get('id'))
                res_dict['to_type'].append(to_type)
                res_dict['relationship'].append(current_relation.get('type'))

    df = pd.DataFrame.from_dict(res_dict)
    logger.info(f"Found {len(df)} associations.")
    return df
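
The snippet calls a chunk_list helper that is not shown in the post. A minimal sketch of what such a helper might look like is below; the body is an assumption, and only the name and the keyword arguments input and n come from the call in the snippet.

```python
from typing import List, Sequence


def chunk_list(input: Sequence, n: int) -> List[list]:
    """Split `input` into consecutive chunks of at most `n` items.

    Hypothetical stand-in for the helper used in the post. The parameter
    name `input` shadows the builtin, but it is kept so that the call
    chunk_list(input=from_ids, n=100) works unchanged.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Step through the sequence in strides of n; the last chunk may be shorter
    return [list(input[i:i + n]) for i in range(0, len(input), n)]


if __name__ == "__main__":
    chunks = chunk_list(input=list(range(250)), n=100)
    print([len(chunk) for chunk in chunks])
```

With 250 IDs and n = 100 this gives two full chunks and one chunk of 50, so each request stays within the 100-input limit mentioned above.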

fromfarland commented 1 year ago

Just saw that there is another issue open about this, sorry! https://github.com/HubSpot/hubspot-api-python/issues/135 Would still be interested in ideas though – I would really like to keep using associations.batch_api.read because it works for all object types.

alzheltkovskiy-hubspot commented 1 year ago

Hi @fromfarland. We are working on it. It will be added as soon as we can.

fromfarland commented 1 year ago

@alzheltkovskiy-hubspot Good to know, thank you!

fromfarland commented 1 year ago

Hi, is there an update or an ETA here already? :) (The restriction to 100 association results is really hindering how we use the API and therefore renders our implementation somewhat pointless currently.)

alzheltkovskiy-hubspot commented 1 year ago

Hi everyone, could you check the latest version?

fromfarland commented 1 year ago

Hi @alzheltkovskiy-hubspot, thanks for the update! I checked out the latest version, and the response now contains the missing paging key, which is great!

However, I did not figure out how to use this as a parameter / paginate through results using associations.batch_api.read. Looking at the code briefly, it seems like the parameter is just not available in the function? So currently I know that the results are paginated, but can't do anything about it 😅 Am I missing something?
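
Until a cursor can be passed to batch_api.read, the paging key can at least be used to see which input IDs came back with truncated results. The sketch below works on the dict produced by .to_dict(). Note the assumptions: that each entry in results carries its own paging key, and that the cursor sits under paging -> next -> after as on other HubSpot endpoints. The thread only confirms that a paging key exists, and find_truncated_results is a hypothetical name.

```python
from typing import Dict


def find_truncated_results(api_response: dict) -> Dict[str, str]:
    """Map each from-ID whose result has a paging cursor to that cursor.

    Assumes the (unconfirmed) layout
    results[i]['paging']['next']['after'] and the '_from' key used in
    the snippet earlier in this thread.
    """
    truncated = {}
    for result in api_response.get('results') or []:
        paging = result.get('paging') or {}
        after = (paging.get('next') or {}).get('after')
        if after is not None:
            from_id = (result.get('_from') or {}).get('id')
            truncated[from_id] = after
    return truncated


if __name__ == "__main__":
    # Hand-written sample data, not a real API response
    sample = {
        'results': [
            {'_from': {'id': '1'}, 'to': [{'id': '10'}], 'paging': {'next': {'after': 'abc'}}},
            {'_from': {'id': '2'}, 'to': [{'id': '20'}], 'paging': None},
        ]
    }
    print(find_truncated_results(sample))
```

This does not fetch the remaining associations; it only flags the IDs for which the first page was incomplete.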

fromfarland commented 1 year ago

Hi @alzheltkovskiy-hubspot, it seems to me like this still is not really usable because there is no way to include the after (paging-related) key in the batch_api.read request.

Can you help out here? Thanks!

HubSpot / hubspot-api-python

associations.batch_api.read does not support pagination? #187