aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.42k stars 2.12k forks source link

Error Graphql Error running query [query]. Authmode Cognito. Error: Unauthorized #13710

Open mattiLeBlanc opened 1 month ago

mattiLeBlanc commented 1 month ago

Before opening, please confirm:

JavaScript Framework

Angular

Amplify APIs

GraphQL API

Amplify Version

v6

Amplify Categories

auth, api

Backend

None

Environment information

```sh # Put output below this line System: OS: macOS 14.4.1 CPU: (10) arm64 Apple M1 Max Memory: 2.21 GB / 64.00 GB Shell: 5.9 - /bin/zsh Binaries: Node: 20.14.0 - /usr/local/bin/node Yarn: 1.22.18 - ~/.npm-global/bin/yarn npm: 8.19.1 - ~/.npm-global/bin/npm pnpm: 9.0.0 - ~/.npm-global/bin/pnpm Browsers: Chrome: 127.0.6533.100 Safari: 17.4.1 npmGlobalPackages: @angular/cli: 17.0.3 angular-http-server: 1.10.0 aws-cdk: 2.146.0 aws: 0.0.3-2 envinfo: 7.13.0 firebase-tools: 11.16.1 nativescript: 8.2.3 node-gyp: 8.4.1 npm: 8.19.1 pnpm: 9.0.0 yarn: 1.22.18 ```

Describe the bug

I am seeing the clientside JS error Error: Graphql Error running query getMusicStream. Authmode Cognito. Error: Unauthorized occur many times in our client logging system. It is caught by the custom Angular Error handler that will redirect to an Oops page which is okay for a real error, but not this one.

The cause is an expired accessToken, I am sure of it. The error was reported at 8:25 AM

image

When I look at the accessToken for this request (I logged that too), I see the expiry was at 8:25:

image

There is a valid Refresh token present.

The user did an action which resulted in a graphql Api call which caused the above issue and then resulted in an Oops page, instead of being issues a new accessToken by the graphql Api client code.

When does it issue not happen: I was logged into our production system, last time used yesterday, so my accesstoken is expired. I got to the same URL with my dev console open. The application works as expected, new token is retrieved which is valid for 15min and the page displays. No error. So the mechanism works fine this way.

The issue seems to be a live session where the accessToken expires and needs to be renewed; a graphql error slips through. Question: is that by design or a bug? Question: if I tell my custom error handler to ignore this, will the graphql Api attempt another call with a new accessToken, or will the JS Error mean the end of the execution and the application is dead as of that moment until a customer does a reload?

Expected behavior

I expect when an accessToken is expiring at the same moment a API call is made, I do not receive an UNAUTHORISED error. Maybe the grahpql client code should consider each token that is about to expire within 15 seconds as expired and retrieve a new accessToken so that the API call that follows will not receive the UNAUTHORISED error.

Maybe that is already how the system works, not sure.

Reproduction steps

-Login to cognito via Amplify client in an Angular app -Have an Graphql API call ready that requires you to be logged in to Cognito -wait for the accessToken to expire and run a couple of API calls to try to trigger this error. Sometimes it happens, sometimes it doesnt.

Code Snippet

// THIS IS MY API IMPLEMENTATION USING RXJS
export class ApiService {

  private client = generateClient();

  graphql<V>({ statement: query, variables, type, iam = false, familyApi = false }: queryInput) {
    if (familyApi) {
      Amplify.configure(awsConfig.familyApp);
    } else {
      Amplify.configure(awsConfig.adminApp);
    }

    const payload: any = {
      query,
      authMode: iam ? 'iam' : 'userPool',
    }
    if (variables) {
      payload.variables = variables;
    }
    return from(this.client.graphql(payload) as Promise<GraphQLResult<object>>)
      .pipe(
        retry({ count: 3, delay: 1000, resetOnSuccess: true}),
        map(res => res.data[type as keyof object]),
        catchError((error: GraphQLResult<object> | Error) => {
          // graphql error
          if ('errors' in error) {
            const gqError = error.errors![0];
            throw new Error(`Graphql Error running query ${type}. Authmode ${iam ? 'IAM' : 'Cognito'}. Error: ${gqError?.message} ${gqError?.path ? gqError.path.toString() : ''}`);
            // normal error
          } else if ('stack' in error) {
            throw new Error(`Runtime error running query ${type}. Authmode ${iam ? 'IAM' : 'Cognito'}. Error: ${error.stack}}`);
            // fallback for something else
          } else {
            throw new Error(`Unknown Error running query ${type}. Authmode ${iam ? 'IAM' : 'Cognito'}. Error: ${error}}`);
          }
        })
      ) as Observable<V>
  }
}

Log output

``` // Put your logs below this line ```

aws-exports.js

No response

Manual configuration

No response

Additional configuration

No response

Mobile Device

No response

Mobile Operating System

No response

Mobile Browser

No response

Mobile Browser Version

No response

Additional information and screenshots

No response

mattiLeBlanc commented 1 month ago

I was able to reproduce it locally by looking at my token expiry and creating an API call just at the expiry second:

error

I am going to try to NOT throw an new error in my CatchError clause and see if the Retry will try the request again with a now refreshed accessToken and maybe not bother the user with this.

But i think this PROOFS there is an issue on accessToken expiry, it may be not robust enough since it throws the JS error.

Update: I removed the throw New Error in the catchError block and I just now get a

{
  "errors" : [ {
    "errorType" : "UnauthorizedException",
    "message" : "Token has expired."
  } ]
}

The retry did not seem to work. Maybe an rxjs Retry doesn;t work together with a catchError statement. Not sure. But I would hope this accesstoken expiry issue can be resolved in the amplify library.

The error I got when using a retry is:

> UnauthorizedException: Unknown error
    at buildRestApiError (http://apollo-app:4201/@fs/Users/[user]/www/resparke/apollo/apollo/frontend/[app]-app/.angular/cache/18.1.4/vite/deps/aws-amplify_api.js?v=3b3418f5:7217:24)
    at http://apollo-app:4201/@fs/Users/[user]/www/[app]/apollo/apollo/frontend/[app]-app/.angular/cache/18.1.4/vite/deps/aws-amplify_api.js?v=3b3418f5:7171:12
    at Generator.next (<anonymous>)
    at fulfilled (http://apollo-app:4201/@fs/Users/[user]/www/resparke/apollo/apollo/frontend/[app]-app/.angular/cache/18.1.4/vite/deps/chunk-7QDQRZLY.js?v=a4c58b45:73:24)
    at _ZoneDelegate.invoke (http://apollo-app:4201/polyfills.js:300:158)
    at Object.onInvoke (http://apollo-app:4201/@fs/Users/[user]/www/[app]/apollo/apollo/frontend/[app]-app/.angular/cache/18.1.4/vite/deps/chunk-5BBVEAXN.js?v=a4c58b45:5812:25)
    at _ZoneDelegate.invoke (http://apollo-app:4201/polyfills.js:300:46)
    at _ZoneImpl.run (http://apollo-app:4201/polyfills.js:94:35)
    at http://apollo-app:4201/polyfills.js:2006:30
    at _ZoneDelegate.invokeTask (http://apollo-app:4201/polyfills.js:326:171)
image
mattiLeBlanc commented 4 weeks ago

Finally, is my issue happening because in the GraphQl headers I am swapping IdToken for AccessToken?


Amplify.configure(awsConfig.clientApp, {
  API: {
    GraphQL: {
      headers: async () => {
        try {
          const currentSession = await fetchAuthSession();
          if (currentSession.tokens) {
            const idToken = currentSession.tokens.idToken?.toString();
            return { Authorization: idToken };
          } else {
            signOut();
            return undefined
          }
        } catch (error) {
          signOut();
          return undefined;
        }
      }
    }
  }
});

Or does that have nothing to do with it?

mattiLeBlanc commented 4 weeks ago

TEMP SOLUTION!!!!

Amplify.configure(awsConfig.clientApp, {
  API: {
    GraphQL: {
      headers: async () => {
        try {
          let currentSession = await fetchAuthSession();
          const tokenExpiry = currentSession.tokens.accessToken.payload.exp * 1000;
          const HALF_MINUTE = 30000;
          const now = new Date().getTime() + HALF_MINUTE;
          if (tokenExpiry <= now) {
            currentSession = await fetchAuthSession({forceRefresh: true});
            console.log('============================= fetched a new token===================');
          }
          if (currentSession.tokens) {
            const idToken = currentSession.tokens.idToken?.toString();
            return { Authorization: idToken };
          } else {
            signOut();
            return undefined
          }
        } catch (error) {
          signOut();
          return undefined;
        }
      }
    }
  }
});

I am testing if a token is about to expire within a 1/2 minute and then force a refresh. This worked for me because I triggered my API just as the accessToken was about to expire and it fetched a new token and loaded the music. I need to further test this workaround but at least I got something. I would still like to see if there is a better solution. And rxjs retry didnt work for me, even not with a defer function around it.

cwomack commented 4 weeks ago

Hey, @mattiLeBlanc 👋. It looks like there this issue has many similar details captured within the existing #13541 that you opened. Is there a need to keep this issue open rather than follow up within that one? Let me know if I'm missing any specific differences here, but wanting to make sure we help you efficiently within a single issue so the responses/context isn't lost.

cwomack commented 4 weeks ago

@mattiLeBlanc, wanted to follow up and double check on what the use case is here for calling the API's directly within your Amplify.configure() call. Is the goal to proactively refresh the accessToken via the GraphQL headers? Any additional context would be helpful!

mattiLeBlanc commented 4 weeks ago

Hey, @mattiLeBlanc 👋. It looks like there this issue has many similar details captured within the existing #13541 that you opened. Is there a need to keep this issue open rather than follow up within that one? Let me know if I'm missing any specific differences here, but wanting to make sure we help you efficiently within a single issue so the responses/context isn't lost.

Hi @cwomack , I think this issue is more isolated now and accurate. Before I also had Federated JWT issue which I have resolved. Yesterday I decided to spend about half a day trying to get to the bottom of this unauthorised issue and report it in a clean story. I thought the other story may have snowed under and didn't get any attention. So I though a clean clear issue description with proof and a way to reproduce might be better. I think the other story can be closed or maybe we can merge them?

mattiLeBlanc commented 4 weeks ago

@mattiLeBlanc, wanted to follow up and double check on what the use case is here for calling the API's directly within your Amplify.configure() call. Is the goal to proactively refresh the accessToken via the GraphQL headers? Any additional context would be helpful!

Hi Chris, what I am doing in the headers is purely swap the accessToken for the idToken on each Graphql Request because I store custom data. This all works well.

Then yesterday I found out my accessToken doesn't get refreshed if a graphql query is executed right at the expiry moment of the token (it will use the expired token and result in unauthorised). I first tried to use defer and retry in my rxjs api call which converts the promise to a observable, to attempt to retry the query with hopefully a refreshed authorisation after the first error.

But that didn't seem to work, I just got multiple errors in my console and not multiple POSTs to my grahpl endpoint (on unauthorised and then the retries with a valid auth token).

SO, i thought how about force a refresh of the token if is about to expire. I chose the header a the location because that is executed for each graphql query or mutation and I already have the token information there. It looked like a nice place to do a force refresh.

Does that answer your question?

mattiLeBlanc commented 3 weeks ago

358970068-77dc0760-fabe-476c-b348-2aeac682efa1

I just got this error in my local development version (running on my mac). I checked the tokens and they have refreshed

image

but still the graphql endpoint is getting this unauthorised error, which breaks my flow.

chrisbonifacio commented 3 weeks ago

Hi @mattiLeBlanc 👋 thanks for opening this issue. At the moment your workaround of checking and force refreshing the tokens makes sense but there might be a race condition because even though you provide a token the graphql client internally performs its own fetchAuthSession call so we are discussing this internally as something that could use improvement.

That being said, I've marked this as a bug for us to track

mattiLeBlanc commented 3 weeks ago

Thank you.

mattiLeBlanc commented 3 weeks ago

Hi @mattiLeBlanc 👋 thanks for opening this issue. At the moment your workaround of checking and force refreshing the tokens makes sense but there might be a race condition because even though you provide a token the graphql client internally performs its own fetchAuthSession call so we are discussing this internally as something that could use improvement.

That being said, I've marked this as a bug for us to track

I think you are right. I saw another error of Unauthorised, which happened at 9:35 and the token expired at 9:34:

image

So the code I have that swaps accessToken for idToken in the graphql headers is doing that fetchAuthSession which probably then triggers the Unauthorised error.

If the solution could be that when fetchAuthSession is called and it has not token, it may check if it can refresh the token based on a valid refresh token before throwing the error?

thedevopsguyblog commented 1 week ago

Hi all, thanks for all the debugging so far! I seem to be having the same issue that was described in #13541 which is now closed in favor of this issue. I'm seeing the error Error Name: NoValidAuthTokens Error Msg: No federated jwt

It's not super clear to me if #13541 is tackling the error No federated jwt that is appearing exclusively on localhost (test and prod work fine).

I can see why the GQL mutations on localhost are failing - it's missing certain fields like "Authorization" in the request headers. I followed the docs and created a amplifyServerUtils file containing the cookieBasedClient. My server actions then leverage that for API calls.

export const cookieBasedClient = generateServerClientUsingCookies({
  config:{...awsExports},
  cookies,
});

The odd thing is if i don't make any queries from localhost for about ~12 hrs then attempt it again it'll work for ~5mins! then start throwing the jwt error. It seems like generateServerClientUsingCookies isn't working 100% on localhost, DEV/TEST and PROD work fine.

Here are the headers, localhost vs develop. Request Header Localhost

POST /playground HTTP/1.1
Host: localhost:3000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:129.0) Gecko/20100101 Firefox/129.0
Accept: text/x-component
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br, zstd
Referer: http://localhost:3000/playground
Next-Action: 66296251718b88f81bc71510fbdbc3278914bc82
Next-Router-State-Tree: %5B%22%22%2C%7B%22children%22%3A%5B%22playground%22%2C%7B%22children%22%3A%5B%22__PAGE__%22%2C%7B%7D%2C%22%2Fplayground%22%2C%22refresh%22%5D%7D%5D%7D%2Cnull%2Cnull%2Ctrue%5D
Content-Type: multipart/form-data; boundary=---------------------------168850550536958586124075283889
Content-Length: 819
Origin: http://localhost:3000
Connection: keep-alive
Cookie: CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.LastAuthUser=username@gmail.com; 
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.accessToken=xxx
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.idToken=xxx
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.refreshToken=xxx
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.signInDetails={%22loginId%22:%22username@gmail.com%22%2C%22authFlowType%22:%22USER_SRP_AUTH%22}; 
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.clockDrift=-772
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
Priority: u=0

Request Header Develop

POST /playground HTTP/2
Host: develop.d1wp9ljz13k7ch.amplifyapp.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:129.0) Gecko/20100101 Firefox/129.0
Accept: text/x-component
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br, zstd
Referer: https://develop.d1wp9ljz13k7ch.amplifyapp.com/playground
Next-Action: a1defeb1166b15ac8dd3085b8e9f4e5e88423a01
Next-Router-State-Tree: %5B%22%22%2C%7B%22children%22%3A%5B%22playground%22%2C%7B%22children%22%3A%5B%22__PAGE__%22%2C%7B%7D%2C%22%2Fplayground%22%2C%22refresh%22%5D%7D%5D%7D%2Cnull%2Cnull%2Ctrue%5D
Content-Type: multipart/form-data; boundary=---------------------------108270092293532131151870777
Content-Length: 605
Origin: https://develop.d1wp9ljz13k7ch.amplifyapp.com
Authorization: Basic YWRtaW46RzdaQ25wOGg=
Connection: keep-alive
Cookie: CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.LastAuthUser=username@gmail.com; 
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.accessToken=xxx
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.idToken=xxx
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.refreshToken=xxx
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.signInDetails={%22loginId%22:%22username@gmail.com%22%2C%22authFlowType%22:%22USER_SRP_AUTH%22}; 
        CognitoIdentityServiceProvider.3asq47ba2k87lvd00etkfg3kj8.username%40gmail.com.clockDrift=-237
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
Sec-GPC: 1
Priority: u=0

I hope all that makes sense, happy to open a new issues if need be. Of note, localhost and develop share the same infrastructure via the aws-exports.ts file (appsync and cognito) but i don't think that should make a difference.

HuiSF commented 1 week ago

Hi @thedevopsguyblog at a glance I don't think the issue you encountered is unrelated to this one.

Were the request headers you pasted were from the requests sent to your server action? (I presume so as it showed context type as form-data). If so, the Authorization header was not controlled Amplify.

In addition, when you get the error such as Error Name: NoValidAuthTokens or Error Msg: No federated jwt could you use debug tool to inspect the error object you caught? It contains an underlyingError field which gives what else went wrong during this process. That may help determine how to resolve your issue.

thedevopsguyblog commented 2 days ago

Hi @HuiSF, sorry for the delay, went to Germany!

Here are some basic troubleshooting steps I took...

  1. deleted node_modules and .next
  2. deleted all cookies and session storage
  3. ran npm i and lsof -i tcp:3000 | grep 'node' | awk '{print $2}' | xargs kill -9 && npm run dev
  4. signed in again
  5. submitted my form to my serverAction with the following response....
    {
    "name": "NoSignedUser",
    "underlyingError": {
    "name": "NotAuthorizedException",
    "$metadata": {
      "attempts": 1,
      "httpStatusCode": 400,
      "requestId": "abce83d8-35bd-4d69-9e0b-3c8b38e32fa7"
    }
    },
    "recoverySuggestion": "Review the underlying exception field for more details. If you intended to make an authenticated API request, review if the current user is signed in."
    }

Interesting point, The client side S3 storage API uploadData with the ID in the path field works...which is odd right - cause how is the API determining my users ID?

const uploadFiles = files.map(async (f) => {
try {
await uploadData({ data: f, path:(identityId) => `protected/${identityId.identityId}/${formattedDate}/${f.name}`}).result.then((uploadedFile) => {
     fileUploadResults.push({ name: uploadedFile.path, etag: uploadedFile.eTag! })
     }
)} catch (error) {logger('REQ', 'UP', `Error:\n ${error}\nUploading file: ${f.name}`, 'error')}
})

I also seem to be experincing this issue in my amplify hosted app which is concerning as before this it was isolated to http://localhost:3000/ so I'm unsure how to proceed.

I'll try digging into cloudwatch logs to see if the error message is the same as local development.

HuiSF commented 2 days ago

Hi @thedevopsguyblog when you call Amplify.configure() on the client-side of your app, have you specified { ssr: true }? From your description it sounded like the Amplify client-side functionalities are working after signing in but not the server-side functionalities, it may be due to that your end user auth tokens are bring written in localStorage. Can you verify?

thedevopsguyblog commented 2 days ago

Hi @thedevopsguyblog when you call Amplify.configure() on the client-side of your app, have you specified { ssr: true }? From your description it sounded like the Amplify client-side functionalities are working after signing in but not the server-side functionalities, it may be due to that your end user auth tokens are bring written in localStorage. Can you verify?

Yep, I've got this...

Amplify.configure({...awsExports}, { ssr: true });

And it's defined in my root layout file.

mattiLeBlanc commented 2 days ago

Hi @mattiLeBlanc 👋 thanks for opening this issue. At the moment your workaround of checking and force refreshing the tokens makes sense but there might be a race condition because even though you provide a token the graphql client internally performs its own fetchAuthSession call so we are discussing this internally as something that could use improvement. That being said, I've marked this as a bug for us to track

I think you are right. I saw another error of Unauthorised, which happened at 9:35 and the token expired at 9:34: image So the code I have that swaps accessToken for idToken in the graphql headers is doing that fetchAuthSession which probably then triggers the Unauthorised error.

If the solution could be that when fetchAuthSession is called and it has not token, it may check if it can refresh the token based on a valid refresh token before throwing the error?

@HuiSF Any update on this?