divinorum-webb / tableau-api-lib

An API library that allows developers to call on the methods listed in Tableau's REST API documentation.
MIT License

get_groups_for_a_user_dataframe() fails (sometimes) when run in a loop #107

Closed jeff-skoldberg-ct closed 5 months ago

jeff-skoldberg-ct commented 1 year ago

We have a program that uses tableau-api-lib that has been stable for about 2 years. The job runs every hour. Recently the job started failing about 95% of the time and it finishes successfully about 5% of the time.

The line of code that is failing is df = get_groups_for_a_user_dataframe(conn,user)

The error is:

tableau_api_lib.exceptions.tableau_server_exceptions.PaginationError:
        The Tableau Server REST API method decorator did not return paginated results.
        Please verify that your connection is logged in and has a valid auth token.
        If using personal access tokens, note that only one session can be active at a time using a single token.
        Also note that the extract_pages() method wrapping this call is intended for paginated results only.
        Not all Tableau Server REST API methods support pagination.

Please note we make several other calls leading up to this which are not failing.

When I run it in debug mode in vs-code it does not fail. When I just hit the run button in vs-code, it usually fails.

It seems that adding time.sleep(.5) immediately before each call of df = get_groups_for_a_user_dataframe(conn,user) solves the issue, but this is not a good fix. The job ran every hour for the last two years with about 95% success, and now it succeeds only about 5% of the time, so something has changed.

Please let me know what other info you need.

jeff-skoldberg-ct commented 1 year ago

Hmm, it seems that adding a sleep does not solve it as consistently as I thought. I increased the sleep to 1 second and I'm still getting the same errors, though slightly less often.

jeff-skoldberg-ct commented 1 year ago

After observing the effect of time.sleep(1) for the last 24 hours: the job has failed 4 times in that period. That is a huge improvement, but still not as stable as it was last week (and for the last ~2 years).
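
For anyone hitting the same intermittent error, one alternative to a fixed sleep is to retry only the calls that fail, waiting a little longer each time. The following is a minimal sketch under stated assumptions, not something the library provides: call_with_retry and its parameters are hypothetical names introduced here for illustration.

```python
import time


def call_with_retry(func, *args, retries=4, base_delay=0.5,
                    retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with exponential backoff.

    Hypothetical helper, not part of tableau-api-lib.
    """
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

In the job described above, the call would presumably look like call_with_retry(get_groups_for_a_user_dataframe, conn, user, retry_on=(PaginationError,)), with PaginationError imported from tableau_api_lib.exceptions.tableau_server_exceptions as shown in the traceback. Successful calls then pay no delay at all, and only the failing ones slow the run down.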

divinorum-webb commented 1 year ago

Hey @jeff-skoldberg-ct, thanks for taking the time to help improve the library!

That querying util function has not been updated recently (since about 2 years ago) so I'm wondering if one of these could be the issue:

  1. Is it possible some of the users your workflow attempts to query are not associated with any groups? I will do some testing around this as well to rule out the possibility that querying groups for a user who is not associated with any groups causes issues.
  2. Is it possible that your infrastructure team has implemented some sort of requests/second constraint which would lead to some API request failures if you issue a higher volume of API requests in a short amount of time?
  3. Did these issues begin after upgrading Tableau Server? Or are you on Tableau Online, which automatically updates? If so then it's possible Tableau's default API settings or REST API backend are related to this issue.

Number (3) above would be unfortunate but not too surprising. We have seen in the past year that changes to the backend of the REST API have impacted various endpoints in undesirable ways. For example, the _all_ fields parameter for fetching all available fields about users or projects changed and introduced issues to API clients such as this library which do not control how Tableau's code works on their end.

Hope this helps, and I'll follow up on (1) above to let you know if I see any issues with querying users who do not belong to any groups.

divinorum-webb commented 1 year ago

@jeff-skoldberg-ct following up on the previous message, I created a sample user on Tableau Server. By default they are a member of "All Users" and no other group. There were no errors when calling the get_groups_for_a_user_dataframe querying function on that new user.

It seems users are always bound to the "All Users" group and cannot be removed from it, so a user can never have zero groups. My conclusion is that the errors you are seeing are not caused by an edge case in the response payload, which eliminates possibility (1) from the list above.

jeff-skoldberg-ct commented 1 year ago

@divinorum-webb thank you for the replies. We are using Tableau Online (now Tableau Cloud), and a change on Tableau's side was my first guess when we suddenly started facing this issue. I have updated the "API Version" parameter in my code, but that didn't help. Tableau must have changed something on their end. If that is the case, there may be nothing we can do?

Can you reproduce the issue?

Here's a larger excerpt from my code if you'd like to try:

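import time

import pandas as pd
from tableau_api_lib.utils.querying import get_groups_for_a_user_dataframe, get_users_dataframe

# conn is an already signed-in TableauServerConnection
users_df = get_users_dataframe(conn)

def user_group_df(user_df):
    # Build one row per (user, group) pair, with the user's attributes on each row.
    final_df = pd.DataFrame()
    for key, value in user_df.iterrows():
        user = value['id']
        authSetting = value['authSetting']
        email = value['email']
        fullName = value['fullName']
        lastLogin = value['lastLogin']
        name = value['name']
        siteRole = value['siteRole']
        time.sleep(1)  # workaround: pause before each call to reduce the PaginationError failures
        df = get_groups_for_a_user_dataframe(conn, user)
        df['userid'] = user
        df['authSetting'] = authSetting
        df['email'] = email
        df['lastLogin'] = lastLogin
        df['user_name'] = name
        df['siteRole'] = siteRole
        df['fullName'] = fullName
        final_df = pd.concat([final_df, df])
    # The group's domain column is not needed in the output.
    final_df.drop(columns=['domain'], inplace=True)
    return final_df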

users_groups_df = user_group_df(users_df)
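
To help narrow down the three possibilities listed earlier in the thread, it may be useful to record which users fail instead of letting the first error abort the run. The following is a rough sketch only: fetch_per_user is a hypothetical helper, not part of tableau-api-lib, and it takes the per-user call as a plain callable so it makes no assumptions about the library's API.

```python
def fetch_per_user(user_ids, fetch):
    """Run fetch(user_id) for each ID and keep going when a call fails.

    Hypothetical helper. Returns (results, failures): results maps each
    user ID to the value fetch returned, failures maps each user ID to
    the exception that was raised for it.
    """
    results, failures = {}, {}
    for user_id in user_ids:
        try:
            results[user_id] = fetch(user_id)
        except Exception as exc:  # broad on purpose: this is a diagnostic pass
            failures[user_id] = exc
    return results, failures
```

With the excerpt above, this could be called as fetch_per_user(users_df['id'], lambda uid: get_groups_for_a_user_dataframe(conn, uid)). If the same user IDs fail on every run, that points toward something specific to those users; if the failing IDs change from run to run, an intermittent server-side or rate-related cause seems more likely.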

jeff-skoldberg-ct commented 5 months ago

This seems to be working now without sleeping between api calls. I suppose Tableau resolved the issue which was causing this.