pomerium / datasource

external data source contrib
Apache License 2.0

azure directory timeouts #151

Open calebdoxsey opened 1 year ago

calebdoxsey commented 1 year ago

What happened?

Occasionally, when making large sync calls with the Azure Microsoft Graph Delta API, we see timeouts:

{
  "level":"debug",
  "method":"GET",
  "authority":"graph.microsoft.com",
  "path":"/v1.0/groups/delta",
  "duration":686.488539,
  "response-code":401,
  "idp":"azure",
  "response-body":"{\"error\":{\"code\":\"InvalidAuthenticationToken\",\"message\":\"Access token has expired or is not yet valid.\",\"innerError\":{\""
}

What did you expect to happen?

For large directory syncs to succeed.

Additional context

We should investigate a few different ideas here:

  1. Why are the access tokens not refreshing as we expect? Each API call (even for thousands of delta API calls) should check if the token is valid before reusing it. Are we hitting an issue where the token is valid for only a few more seconds when we check it, so we reuse it, but it expires before the request completes and the call fails?
  2. Perhaps we should implement a simple retry mechanism that forces a refresh of the access token when this error occurs and then retries the API call.
  3. Is there something in the delta API we can use to resume where we left off? Maybe $skiptoken?
  4. Can we increase the access token expiration timestamp somehow?
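One way to test the first idea is to treat a token as expired slightly before its real expiry, so a token with only seconds left is never reused. The following is a minimal sketch of that check, not the project's actual code: `cachedToken`, `usable`, `usableInSeconds`, and the one-minute leeway are all hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// cachedToken is a hypothetical in-memory record of an access token.
type cachedToken struct {
	Value     string
	ExpiresAt time.Time
}

// usable reports whether the token will still be valid once leeway has
// elapsed. A token that expires within the leeway window is treated as
// already expired, so the caller fetches a new one instead of reusing it.
func (t cachedToken) usable(now time.Time, leeway time.Duration) bool {
	return t.Value != "" && now.Add(leeway).Before(t.ExpiresAt)
}

// usableInSeconds is a small helper for experimenting: it builds a token
// that expires in remaining seconds and checks it against leeway seconds.
func usableInSeconds(remaining, leeway int) bool {
	now := time.Unix(0, 0)
	tok := cachedToken{
		Value:     "example",
		ExpiresAt: now.Add(time.Duration(remaining) * time.Second),
	}
	return tok.usable(now, time.Duration(leeway)*time.Second)
}

func main() {
	// Five seconds left with a one-minute leeway: refresh instead of reuse.
	fmt.Println(usableInSeconds(5, 60))
	// Ten minutes left: safe to reuse.
	fmt.Println(usableInSeconds(600, 60))
}
```

The leeway only needs to cover the longest expected request duration; it does not help if a single sync outlasts the token's whole lifetime.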

Locally we should be able to reproduce the behavior by adding an artificial lag to all requests. The default access token expiration is between 60 and 90 minutes.
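A rough sketch of how the artificial lag could be added in Go is to wrap the HTTP transport. `laggyTransport` and `demoLag` are hypothetical names, and the local `httptest` server only stands in for graph.microsoft.com; none of this is taken from the project's code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// laggyTransport is a hypothetical http.RoundTripper that waits for lag
// before forwarding each request to the wrapped transport.
type laggyTransport struct {
	base http.RoundTripper
	lag  time.Duration
}

// RoundTrip delays the request, giving up early if its context is cancelled.
func (t laggyTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	select {
	case <-time.After(t.lag):
	case <-req.Context().Done():
		return nil, req.Context().Err()
	}
	return t.base.RoundTrip(req)
}

// demoLag sends one request through the laggy transport to a local test
// server and reports the status code and whether the call took at least
// the configured lag.
func demoLag(lagMillis int) (int, bool) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	lag := time.Duration(lagMillis) * time.Millisecond
	client := &http.Client{Transport: laggyTransport{base: http.DefaultTransport, lag: lag}}

	start := time.Now()
	res, err := client.Get(srv.URL)
	if err != nil {
		return 0, false
	}
	res.Body.Close()
	return res.StatusCode, time.Since(start) >= lag
}

func main() {
	status, delayed := demoLag(50)
	fmt.Println(status, delayed)
}
```

With a large enough lag per request, a sync of many delta pages should outlast the token lifetime without needing a large directory.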

calebdoxsey commented 1 year ago

We already implemented retries when access tokens expire.

calebdoxsey commented 1 year ago

Confirmed:

<nil> DBG http-request authority=graph.microsoft.com duration=129.826292 idp=azure method=GET path=/v1.0/users/delta response-body="{\"error\":{\"code\":\"InvalidAuthenticationToken\",\"message\":\"Access token has expired or is not yet valid.\",\"innerError\":{\"date\":\"2023-03-29T03:35:13\",\"request-id\":\"e5192c2d-960b-48ea-bc21-979ae759f671\",\"client-request-id\":\"e5192c2d-960b-48ea-bc21-979ae759f671\"}}}" response-code=401
<nil> DBG http-request authority=login.microsoftonline.com duration=430.282791 idp=azure method=POST path=/a0860b16-674e-426d-9165-f87469a8282d/oauth2/v2.0/token response-code=200

The token expires, we clear it and retry the next request, which succeeds.
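For readers following along, the clear-and-retry behavior seen in the logs can be sketched roughly as below. This is an illustration under assumptions, not the code in this repository: `tokenCache`, `getWithRetry`, and `demoRetry` are hypothetical, and a local `httptest` server that accepts only the token `fresh` stands in for the Graph API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// tokenCache is a hypothetical cache around a token endpoint call.
type tokenCache struct {
	mu      sync.Mutex
	token   string
	fetch   func() (string, error) // stands in for the POST to the token endpoint
	fetches int
}

// get returns the cached token, fetching a new one if the cache is empty.
func (c *tokenCache) get() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.token == "" {
		t, err := c.fetch()
		if err != nil {
			return "", err
		}
		c.token = t
		c.fetches++
	}
	return c.token, nil
}

// clear drops the cached token, but only if it is still the rejected one,
// so a concurrent refresh is not thrown away.
func (c *tokenCache) clear(rejected string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.token == rejected {
		c.token = ""
	}
}

// getWithRetry issues a GET and, on a 401, clears the token and retries once.
func getWithRetry(client *http.Client, cache *tokenCache, url string) (int, error) {
	status := 0
	for attempt := 0; attempt < 2; attempt++ {
		token, err := cache.get()
		if err != nil {
			return 0, err
		}
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return 0, err
		}
		req.Header.Set("Authorization", "Bearer "+token)
		res, err := client.Do(req)
		if err != nil {
			return 0, err
		}
		res.Body.Close()
		status = res.StatusCode
		if status != http.StatusUnauthorized {
			return status, nil
		}
		cache.clear(token)
	}
	return status, nil
}

// demoRetry starts with a stale token and returns the final status code
// and the number of token fetches performed.
func demoRetry() (int, int) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer fresh" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	cache := &tokenCache{token: "stale", fetch: func() (string, error) { return "fresh", nil }}
	status, err := getWithRetry(srv.Client(), cache, srv.URL)
	if err != nil {
		return 0, cache.fetches
	}
	return status, cache.fetches
}

func main() {
	status, fetches := demoRetry()
	fmt.Println(status, fetches) // prints "200 1"
}
```

Limiting the loop to a single retry keeps a persistently rejected token from causing an endless refresh cycle.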