turbot / steampipe-plugin-okta

Use SQL to instantly query users, groups, applications and more from Okta. Open source CLI. No DB required.
https://hub.steampipe.io/plugins/turbot/okta
Apache License 2.0

Bug: `okta_user` and `okta_factor` tables fail with `context canceled` error. #77

Closed LalitLab closed 1 year ago

LalitLab commented 2 years ago

Describe the bug The okta_user and okta_factor tables fail with context canceled errors. The failure appears to be caused by a timeout when the account has a large number of users.

There should be a better mechanism to retry for such errors.

 WITH OKTA_MFA as (
  SELECT 
    user_id
  FROM
    okta_factor
  WHERE
    status = 'ACTIVE'
  AND
    factor_type IN ('push','token:software:totp')
  ORDER BY
    user_id, factor_type
), 
OKTA_USERS as (
  SELECT
    id,
    email,
    status,
    last_login
  FROM
    okta_user
  ORDER BY
    id, email
)
SELECT
    U.email as resource,
    CASE
        WHEN U.status <> 'ACTIVE' THEN 'skip'
        WHEN COUNT(F.user_id) = 0 THEN 'alarm'
        ELSE 'ok'
    END AS status,
    CASE
        WHEN U.status <> 'ACTIVE' THEN 'User ' || u.email || ' is not active.'
        WHEN COUNT(F.user_id) = 0 THEN 'User ' || u.email || ' does not have MFA configured.'
        ELSE 'User ' || u.email || ' is ok'
    END AS reason,
    U.email,
    U.last_login
FROM
    OKTA_USERS U
LEFT JOIN OKTA_MFA F on F.user_id = U.id
GROUP BY
    U.email,
    U.status,
    U.last_login
2022-09-09 07:50:11.972 UTC [WARN] PluginManager setPluginCacheSizeMap: 6 connections.
2022-09-09 07:50:11.972 UTC [WARN] Total cache size 0Mb
2022-09-09 07:50:14.481 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709814946: listOktaUsers: list_users_error="Get "https://xxxxx.okta.com/api/v1/users?limit=200": context canceled"
2022-09-09 07:50:14.481 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709814927: listOktaUsers: list_users_error="Get "https://xxxxx.okta.com/api/v1/users?limit=200": context canceled"
2022-09-09 07:50:14.481 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709814927: streamRows error chan select: Get "https://xxxxx.okta.com/api/v1/users?limit=200": context canceled
2022-09-09 07:50:14.481 UTC [WARN] steampipe-plugin-okta.plugin: [WARN] 1662709814927: Execute call failed - cancelling pending item in cache
2022-09-09 07:50:14.481 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709814927: streamRows error chan select: Get "https://xxxxx.okta.com/api/v1/users?limit=200": context canceled
2022-09-09 07:50:14.481 UTC [WARN] steampipe-plugin-okta.plugin: [WARN] 1662709814927: Execute call failed - cancelling pending item in cache
2022-09-09 07:50:24.004 UTC [WARN] PluginManager setPluginCacheSizeMap: 6 connections.
2022-09-09 07:50:24.004 UTC [WARN] Total cache size 0Mb
2022-09-09 07:50:58.971 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context deadline exceeded"
2022-09-09 07:50:58.972 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context deadline exceeded"
2022-09-09 07:50:58.973 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: streamRows error chan select: context deadline exceeded
2022-09-09 07:50:58.973 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: error chan select: context deadline exceeded
2022-09-09 07:50:58.973 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context deadline exceeded"
2022-09-09 07:50:58.973 UTC [WARN] steampipe-plugin-okta.plugin: [WARN] 1662709826246: Execute call failed - cancelling pending item in cache
2022-09-09 07:50:58.973 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context deadline exceeded"
2022-09-09 07:50:58.973 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context canceled"
2022-09-09 07:50:58.974 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context canceled"
2022-09-09 07:50:58.974 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaUsers: list_users_paging_error="context canceled"
2022-09-09 07:50:58.974 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context canceled"
2022-09-09 07:50:58.975 UTC [ERROR] steampipe-plugin-okta.plugin: [ERROR] 1662709826246: listOktaFactors: list_factors_error="context canceled"

Steampipe version (steampipe -v) Example: v0.3.0

Plugin version (steampipe plugin list) Example: v0.5.0

To reproduce Steps to reproduce the behavior (please include relevant code and/or commands).

Expected behavior A clear and concise description of what you expected to happen.

Additional context Slack thread

github-actions[bot] commented 2 years ago

'This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 30 days.'

github-actions[bot] commented 1 year ago

'This issue was closed because it has been stalled for 90 days with no activity.'

misraved commented 1 year ago

Relevant slack thread - https://steampipe.slack.com/archives/C01UECB59A7/p1658406892068329

Okta documentation on rate limit errors - https://github.com/okta/okta-sdk-golang#connection-retry--rate-limiting

ciaran-finnegan commented 1 year ago

Caching the Okta tables before running the problematic query appears to work around this issue, e.g.

steampipe query "SELECT COUNT(*) FROM okta_user"
steampipe query "SELECT COUNT(*) FROM okta_factor"

massyn commented 1 year ago

After a lot of research, we will now be abandoning this plugin, and opt for a manual script to extract the data out of Okta.

From what I can tell, the plugin appears to be querying the Okta API aggressively. Our current environment has around 800 users. Even assuming the plugin makes a single call per user, we should (theoretically) stay under 1000 API calls and thus not hit the Okta API rate limit, yet the plugin seems to be making more than one API call per user.

Please add a mechanism to throttle the API queries, or identify the bug that is causing the plugin to make excessive calls to the Okta backend.
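For anyone writing the manual extraction script mentioned above, client-side throttling of this kind can be sketched in a few lines. This is only an illustration in Python, not code from the plugin or from the Okta SDK: the class name `MinIntervalThrottle` and its parameters are hypothetical, and the per-minute limit must be replaced with whatever applies to your Okta org.

```python
import time


class MinIntervalThrottle:
    """Spaces calls at least 60 / max_per_minute seconds apart.

    Hypothetical helper for illustration; not part of any Okta or
    Steampipe library.
    """

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self.clock = clock      # injectable so the class can be tested
        self.sleep = sleep      # without really sleeping
        self.next_allowed = None

    def wait(self):
        """Block until the next call may start; return the seconds slept."""
        now = self.clock()
        # The call may start now, or at the reserved slot if that is later.
        start = now if self.next_allowed is None else max(now, self.next_allowed)
        delay = start - now
        if delay > 0:
            self.sleep(delay)
        # Reserve the following slot one interval after this call starts.
        self.next_allowed = start + self.interval
        return delay
```

A script would call `throttle.wait()` immediately before each API request. Fixed spacing is the simplest design; it gives up burst capacity in exchange for never exceeding the configured rate.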

misraved commented 1 year ago

Hello @massyn, we are extremely apologetic for the radio silence on this issue. We could have definitely taken a better approach to fix this issue.

We have raised a PR - https://github.com/turbot/steampipe-plugin-okta/pull/80 wherein we are going to use the API retry mechanism provided by the Steampipe plugin SDK. We will test it out aggressively to make sure that we are able to handle the throttling of API queries.

Once again, sorry for the delay, we will look to get the fix out as early as possible 👍.
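As background for readers unfamiliar with the idea, a retry mechanism for throttled API calls usually means re-running a failed call after a growing delay. The sketch below is a generic Python illustration of retry with exponential backoff and jitter. It is not the Steampipe plugin SDK's implementation and does not reflect the contents of the PR; `retry_with_backoff`, `RateLimitedError` and all parameter names are hypothetical.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical error standing in for an HTTP 429 style response."""


def retry_with_backoff(call, is_retryable, max_attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run call(); on a retryable error, wait and try again.

    Re-raises the last error once max_attempts is reached, and re-raises
    immediately when is_retryable(err) is false.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as err:
            if attempt == max_attempts or not is_retryable(err):
                raise
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

Usage would look like `retry_with_backoff(fetch_page, lambda e: isinstance(e, RateLimitedError))`, where `fetch_page` is whatever function performs one API request. Retrying only helps when the overall query deadline leaves room for the added waits.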

misraved commented 1 year ago

@massyn @ciaran-finnegan apologies for the delay on this issue. I have raised a PR that fixes the rate limit issue - https://github.com/turbot/steampipe-plugin-okta/pull/80

Could you please give it a try by building it locally and let us know if it fixes your issue?

massyn commented 1 year ago

Hi @misraved - I am busy looking into this. Can you please clarify what needs to be changed? The https://github.com/okta/okta-sdk-golang#connection-retry--rate-limiting page is specific to the Okta plugin, but there is no documentation yet on what to update in the okta.spc file.

With a limit of 600 API calls per minute, what parameters need to be set?

Warning: failed to start plugin 'hub.steampipe.io/plugins/turbot/okta@latest': failed to decode connection config for connection 'okta':
Unsupported argument: An argument named "DefaultRetryConfig" is not expected here.

misraved commented 1 year ago

Thanks for the quick response @massyn 👍. My apologies for not listing the steps to test out https://github.com/turbot/steampipe-plugin-okta/pull/80.

  1. You need not update the okta.spc file to test out the changes. After installing the plugin, please configure your credentials per https://hub.steampipe.io/plugins/turbot/okta#configuration
  2. Please clone this repository by executing the following commands:
    git clone https://github.com/turbot/steampipe-plugin-okta.git
    cd steampipe-plugin-okta
  3. Switch to the PR branch: git checkout fix-rate-limit-error
  4. Build the plugin with the changes by running the make command.
  5. Please run the affected queries to check if the rate limit issues still persist.

massyn commented 1 year ago

I'm afraid the issue still persists. See my log below. I uninstalled the plugin to be sure it was gone, then manually installed the new plugin and ran the query. The error still occurred.

➜  ~ steampipe plugin uninstall okta

Uninstalled plugin:
* turbot/okta

Please remove this connection to continue using steampipe:

  * /Users/massyn/.steampipe/config/okta.spc
         'okta' (line  1)

➜  ~ cd tmp
➜  tmp cd steampipe-plugin-okta 
➜  steampipe-plugin-okta git:(fix-rate-limit-error) make
go build -o ~/.steampipe/plugins/hub.steampipe.io/plugins/turbot/okta@latest/steampipe-plugin-okta.plugin *.go
➜  steampipe-plugin-okta git:(fix-rate-limit-error) steampipe query                
Welcome to Steampipe v0.18.6
For more information, type .help
> SELECT DISTINCT
        F.user_id
    FROM
        okta_factor F
    WHERE
        F.factor_type IN ('push','token:software:totp') 
        AND F.status = 'ACTIVE' ;

Error: rpc error: code = DeadlineExceeded desc = context deadline exceeded (SQLSTATE HV000)

+---------+
| user_id |
+---------+
+---------+
> 

misraved commented 1 year ago

Thanks for the feedback @massyn 👍. Let me take another dive into it and try to come up with a better solution 👍.

misraved commented 1 year ago

Hello @massyn, I have pushed some more changes to the PR wherein I have reduced the max concurrency and added RetryConfig to hydrate configs. Could you please pull the latest code from the branch and retest your queries?

1. git pull origin fix-rate-limit-error
2. make
3. steampipe query

massyn commented 1 year ago

Hi @misraved - I am afraid it is still failing

➜  steampipe-plugin-okta git:(fix-rate-limit-error) git pull
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Total 5 (delta 4), reused 5 (delta 4), pack-reused 0
Unpacking objects: 100% (5/5), 609 bytes | 121.00 KiB/s, done.
From https://github.com/turbot/steampipe-plugin-okta
   d758c49..9624389  fix-rate-limit-error -> origin/fix-rate-limit-error
Updating d758c49..9624389
Fast-forward
 okta/table_okta_group.go |  5 ++++-
 okta/table_okta_user.go  | 10 ++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)
➜  steampipe-plugin-okta git:(fix-rate-limit-error) make                    
go build -o ~/.steampipe/plugins/hub.steampipe.io/plugins/turbot/okta@latest/steampipe-plugin-okta.plugin *.go
➜  steampipe-plugin-okta git:(fix-rate-limit-error) steampipe query
Welcome to Steampipe v0.18.6
For more information, type .help
> SELECT DISTINCT
        F.user_id
    FROM
        okta_factor F
    WHERE
        F.factor_type IN ('push','token:software:totp') 
        AND F.status = 'ACTIVE' limit 10;

Error: rpc error: code = DeadlineExceeded desc = context deadline exceeded (SQLSTATE HV000)

+---------+
| user_id |
+---------+
+---------+
> SELECT DISTINCT
        F.user_id
    FROM
        okta_factor F
    WHERE
        F.factor_type IN ('push','token:software:totp') 
        AND F.status = 'ACTIVE';

Error: rpc error: code = DeadlineExceeded desc = context deadline exceeded (SQLSTATE HV000)

+---------+
| user_id |
+---------+
+---------+
> 

misraved commented 1 year ago

Thanks @massyn for the quick feedback, could you please share the plugin level logs for the above query?

Also, do you see similar errors when you try other tables like okta_user, or is it just this table that returns the error?

massyn commented 1 year ago

Hi @misraved - Sorry for the delay, I had a business trip to attend to. Find attached the log as requested. It only happens with the okta_factor table. From what I can tell, the plugin makes an individual connection to the API for every user it finds in the okta_user table, and these excessive API calls then cause the Okta API to enforce its rate limit.

plugin-2023-03-20.log
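
The numbers quoted in this thread allow a rough back-of-the-envelope check of that hypothesis. The Python sketch below assumes one factors call per user (the reporter's reading, not confirmed by the maintainers), a user listing paged at 200 (the `limit=200` seen in the logs), and the 600 calls per minute limit mentioned earlier. The function name is hypothetical.

```python
import math


def estimate_factor_query(users, page_size=200, calls_per_minute=600):
    """Return (total_calls, minimum_seconds) for a one-call-per-user pattern.

    All inputs are assumptions taken from the discussion, not measured values.
    """
    list_calls = math.ceil(users / page_size)  # paged listing of users
    factor_calls = users                       # one factors call per user (assumed)
    total = list_calls + factor_calls
    # Shortest possible duration if calls are spread evenly at the limit.
    min_seconds = total / (calls_per_minute / 60)
    return total, min_seconds


total, seconds = estimate_factor_query(800)
print(f"{total} calls, at least {seconds:.1f} s at the assumed limit")
```

Under these assumptions, 800 users need 804 calls, which cannot fit inside one minute at 600 calls per minute. Whether this is what actually triggers the deadline errors is not established here.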

misraved commented 1 year ago

Apologies for the delay @massyn, let me take a quick look at the logs and see if I can find a pattern for these errors.

Thanks a lot for your help 👍.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

bigdatasourav commented 1 year ago

Hello, @massyn; we have successfully identified the underlying cause of this issue. To resolve it, we need to implement a fix in the Steampipe SDK. We have already raised an issue for this and will provide updates on the progress.

misraved commented 1 year ago

This issue is waiting on https://github.com/turbot/steampipe-plugin-sdk/issues/572. Once that is resolved, we will reopen this issue 👍.