empiricaly / empirica

Open source project to tackle the problem of long development cycles required to produce software to conduct multi-participant and real-time human experiments online.
https://empirica.ly/
Apache License 2.0
44 stars 8 forks source link

transition step: invalid state #454

Closed npaton closed 9 months ago

npaton commented 9 months ago

Is there an existing issue for this?

What happened?

Occasionally, failed load: [GraphQL] transition step: invalid state is thrown as an unhandled promise rejection.

Steps To Reproduce

Unclear when this happens.

Empirica Version

1.8.9

What OS are you seeing the problem on?

No response

What browser are you seeing the problem on?

No response

Relevant log output

19:40:00.129 WRN failed load: [GraphQL] transition step: invalid state
Unhandled Promise Rejection. Reason:  [t [CombinedError]: [GraphQL] transition step: invalid state] {
  graphQLErrors: [
    Wi [GraphQLError]: transition step: invalid state
        at fV (/tmp/43a21baa34952e15/node_modules/@urql/core/src/utils/error.ts:24:10)
        at Array.map (<anonymous>)
        at t (/tmp/43a21baa34952e15/node_modules/@urql/core/src/utils/error.ts:61:5)
        at Zc (/tmp/43a21baa34952e15/node_modules/@urql/core/src/utils/result.ts:22:15)
        at Object.next (/tmp/43a21baa34952e15/node_modules/@urql/core/src/exchanges/subscription.ts:82:15)
        at Object.1e121d09-9817-4569-91ef-0c66097149c8 (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:361:42)
        at emit (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:77:94)
        at Object.emit (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:104:21)
        at Ve.ue.onmessage (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:200:29)
        at Kh (/tmp/43a21baa34952e15/node_modules/ws/lib/event-target.js:290:14) {
      path: [Array],
      locations: undefined,
      extensions: {}
    }
  ],
  networkError: undefined,
  response: undefined
}
19:40:00.370 WRN failed load: [GraphQL] transition step: invalid state
Unhandled Promise Rejection. Reason:  [t [CombinedError]: [GraphQL] transition step: invalid state] {
  graphQLErrors: [
    Wi [GraphQLError]: transition step: invalid state
        at fV (/tmp/43a21baa34952e15/node_modules/@urql/core/src/utils/error.ts:24:10)
        at Array.map (<anonymous>)
        at t (/tmp/43a21baa34952e15/node_modules/@urql/core/src/utils/error.ts:61:5)
        at Zc (/tmp/43a21baa34952e15/node_modules/@urql/core/src/utils/result.ts:22:15)
        at Object.next (/tmp/43a21baa34952e15/node_modules/@urql/core/src/exchanges/subscription.ts:82:15)
        at Object.adf644de-6534-4951-a73d-af0bd8a197f1 (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:361:42)
        at emit (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:77:94)
        at Object.emit (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:104:21)
        at Ve.ue.onmessage (/tmp/43a21baa34952e15/node_modules/graphql-ws/lib/client.mjs:200:29)
        at Kh (/tmp/43a21baa34952e15/node_modules/ws/lib/event-target.js:290:14) {
      path: [Array],
      locations: undefined,
      extensions: {}
    }
  ],
  networkError: undefined,
  response: undefined
}


### Anything else?

_No response_

### Code of Conduct

- [X] I agree to follow this project's Code of Conduct
npaton commented 9 months ago

@aliciamchen I've created this new ticket to track your issue. I've found where it's not being handled properly. The rejection itself might be harmless, but not handling might be breaking other state. So I'm adding code to handle the error gracefully, and extra logs to see if the error is actually a problem or not.

When you see this error, does everything break, or does the experiment keep on going?

aliciamchen commented 9 months ago

Thank you! Sometimes it breaks but most of the time it keeps on going. Also, sometimes I see the error very frequently (like every minute) and sometimes I don't see it frequently at all.

npaton commented 9 months ago

Thank you for the details!

What I think it might be is a race condition between the natural end of the stage (the stage times out) and all players submitting the stage. Would that make sense in your case? Meaning:

  1. do you allow your players to submit on stages
  2. do you sometimes submit toward the end of the stage timer (while testing, when it's players, you can't really see that of course)

There are a couple of other potential cases where this might happen. In any case, this type of scenario was not well handled. I am fixing that now, and I am adding copious logs to be able to see the issue if it appears again.

npaton commented 9 months ago

Actually, I can see that you do have submit on your stages. As for the other condition, I can imagine it happening. So this generally does make sense. Either way, I'm locking all this down, so it shouldn't be a problem again.

npaton commented 9 months ago

I released a new version 1.8.10 with the fix. I couldn't easily reproduce it, but I think it should work. Please reopen this ticket if you're still seeing this transition step: invalid state issue.

aliciamchen commented 9 months ago

Thank you! I updated to 1.8.10 and haven't been getting the transition step: invalid state warning anymore. One thing that it did give me a few times, however, that I haven't seen before, is: 21:47:38.630 WRN failed load: [GraphQL] transition step: invalid transition: from state mismatch with previous state

I've also been getting conn write timeout and write buffer full, abandoning connection, but it doesn't seem to crash the experiment.

aliciamchen commented 9 months ago

With the new version, I ended up getting a different error that crashed the experiment: #460