empiricaly / empirica

Open source project to tackle the problem of long development cycles required to produce software to conduct multi-participant and real-time human experiments online.
https://empirica.ly/
Apache License 2.0
44 stars 8 forks source link

potential deadlock at high load #497

Closed hawkrobe closed 6 months ago

hawkrobe commented 6 months ago

Is there an existing issue for this?

What happened?

at high load (approximately 100 participants simultaneously), i got a POTENTIAL DEADLOCK message in the server log followed by a Error: connect ECONNREFUSED ::1:3000 error.

Steps To Reproduce

  1. Run this experiment
  2. Recruit 100 participants
  3. Wait a few minutes after the games start to fill up

Empirica Version

Version: v1.9.3
SHA:     0721d5b
Build:   176
Branch:  main
Time:    2024-01-21T08:00:13Z

Client:  1.9.3 -> 1.9.3 ->
Server:  1.9.3 -> 1.9.3 ->

What OS are you seeing the problem on?

Linux

What browser are you seeing the problem on?

Does not apply

Relevant log output

POTENTIAL DEADLOCK:
Previous place where the lock was grabbed
goroutine 157612 lock 0xc000523ae0
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/runtime/step.go:56 runtime.(*Runtime).Transition ??? <<<<<
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/runtime/step.go:55 runtime.(*Runtime).Transition ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/transition.resolvers.go:19 graph.(*mutationResolver).Transition ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6083 graph.(*executionContext)._Mutation_transition.func2.1 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/resolver.go:57 tajriba.Setup.NewSchema.func1 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6093 graph.(*executionContext)._Mutation_transition.func2.2 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6097 graph.(*executionContext)._Mutation_transition.func2 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:71 executor.processExtensions.func4 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6080 graph.(*executionContext)._Mutation_transition ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:14652 graph.(*executionContext)._Mutation.func8 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:68 executor.processExtensions.func3 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:14651 graph.(*executionContext)._Mutation ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:1727 graph.(*executableSchema).Exec.func2 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/executor.go:119 executor.(*Executor).DispatchOperation.func1.1.1 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:65 executor.processExtensions.func2 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/executor.go:126 executor.(*Executor).DispatchOperation.func1.1 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/handler/transport/http_post.go:90 transport.POST.Do ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/handler/server.go:121 handler.(*Server).ServeHTTP ???
/go/pkg/mod/github.com/rs/zerolog@v1.27.0/log/log.go:129 log.Ctx ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/auth/authhttp/middleware.go:37 server.Enable.Middleware.func1 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/server/server.go:147 server.Enable.requestMetadata.func2 ???
/go/pkg/mod/github.com/julienschmidt/httprouter@v1.3.0/router.go:387 httprouter.(*Router).ServeHTTP ???
/go/pkg/mod/github.com/rs/cors@v1.8.2/cors.go:232 server.(*Server).Start.(*Cors).Handler.func4 ???
/usr/local/go/src/net/http/server.go:2136 http.HandlerFunc.ServeHTTP ???
/usr/local/go/src/net/http/server.go:2938 http.serverHandler.ServeHTTP ???
/usr/local/go/src/net/http/server.go:2009 http.(*conn).serve ???

Have been trying to lock it again for more than 1m0s
goroutine 272276 lock 0xc000523ae0
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/runtime/attribute.go:26 runtime.(*Runtime).SetAttributes ??? <<<<<
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/runtime/attribute.go:25 runtime.(*Runtime).SetAttributes ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/attribute.resolvers.go:67 graph.(*mutationResolver).SetAttributes ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:5567 graph.(*executionContext)._Mutation_setAttributes.func2 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:71 executor.processExtensions.func4 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:5569 graph.(*executionContext)._Mutation_setAttributes ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:14603 graph.(*executionContext)._Mutation.func1 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:68 executor.processExtensions.func3 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:14602 graph.(*executionContext)._Mutation ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:1727 graph.(*executableSchema).Exec.func2 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/executor.go:119 executor.(*Executor).DispatchOperation.func1.1.1 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:65 executor.processExtensions.func2 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/executor.go:126 executor.(*Executor).DispatchOperation.func1.1 ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/handler/transport/http_post.go:90 transport.POST.Do ???
/go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/handler/server.go:121 handler.(*Server).ServeHTTP ???
/go/pkg/mod/github.com/rs/zerolog@v1.27.0/log/log.go:129 log.Ctx ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/auth/authhttp/middleware.go:37 server.Enable.Middleware.func1 ???
/go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/server/server.go:147 server.Enable.requestMetadata.func2 ???
/go/pkg/mod/github.com/julienschmidt/httprouter@v1.3.0/router.go:387 httprouter.(*Router).ServeHTTP ???
/go/pkg/mod/github.com/rs/cors@v1.8.2/cors.go:232 server.(*Server).Start.(*Cors).Handler.func4 ???
/usr/local/go/src/net/http/server.go:2136 http.HandlerFunc.ServeHTTP ???
/usr/local/go/src/net/http/server.go:2938 http.serverHandler.ServeHTTP ???
/usr/local/go/src/net/http/server.go:2009 http.(*conn).serve ???

Here is what goroutine 157612 doing now
goroutine 157612 [chan receive (nil chan), 1 minutes]:
github.com/empiricaly/tajriba/internal/runtime.(*Runtime).stopStep(0xc8ff40?, {0xc000402c90?, 0xc0084c9aa0?}, {0xc00402eb60?, 0xcb81c0?})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/runtime/step.go:280 +0x5b
github.com/empiricaly/tajriba/internal/runtime.(*Runtime).Transition(0xc000523aa0, {0xea4590, 0xc005bb8090}, {0xc008313e60, 0x1a}, {0xc008640760, 0x7}, {0xc00864076a, 0x5}, 0xc006320a50)
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/runtime/step.go:167 +0x1045
github.com/empiricaly/tajriba/internal/graph.(*mutationResolver).Transition(0xc00183eea8?, {0xea4590, 0xc005bb8090}, {{0xc008313e60, 0x1a}, {0xc008640760, 0x7}, {0xc00864076a, 0x5}, 0xc006320a50})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/transition.resolvers.go:19 +0x93
github.com/empiricaly/tajriba/internal/graph.(*executionContext)._Mutation_transition.func2.1({0xea4590?, 0xc005bb8090})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6084 +0x14f
github.com/empiricaly/tajriba.Setup.NewSchema.func1({0xea4590, 0xc005bb8090}, {0xc005bb8090?, 0xc49fa0?}, 0xc001f40e00, {0xd68095, 0x5})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/resolver.go:58 +0x109
github.com/empiricaly/tajriba/internal/graph.(*executionContext)._Mutation_transition.func2.2({0xea4590, 0xc005bb8090})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6094 +0x84
github.com/empiricaly/tajriba/internal/graph.(*executionContext)._Mutation_transition.func2({0xea4590, 0xc005bb8090})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6097 +0xd3
github.com/99designs/gqlgen/graphql/executor.processExtensions.func4({0xea4590?, 0xc005bb8090?}, 0xc0087580e0?)
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:72 +0x22
github.com/empiricaly/tajriba/internal/graph.(*executionContext)._Mutation_transition(0xc001f40c80, {0xea4590, 0xc005b53ec0}, {0xc006a6da70, {0xc0063209b0, 0x1, 0x1}, 0x0})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:6081 +0x1ff
github.com/empiricaly/tajriba/internal/graph.(*executionContext)._Mutation.func8({0xea4590?, 0xc005b53ec0?})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:14653 +0x45
github.com/99designs/gqlgen/graphql/executor.processExtensions.func3({0xea4590?, 0xc005b53ec0?}, 0xc4afe0?)
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:69 +0x22
github.com/empiricaly/tajriba/internal/graph.(*executionContext)._Mutation(0xc001f40c80, {0xea4590, 0xc005b53e00}, {0xc006320280, 0x1, 0x1})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:14652 +0xd4e
github.com/empiricaly/tajriba/internal/graph.(*executableSchema).Exec.func2({0xea4590?, 0xc005b53dd0?})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/graph/generated.go:1726 +0x8a
github.com/99designs/gqlgen/graphql/executor.(*Executor).DispatchOperation.func1.1.1({0xea4590, 0xc005b53dd0})
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/executor.go:119 +0x25
github.com/99designs/gqlgen/graphql/executor.processExtensions.func2({0xea4590?, 0xc005b53dd0?}, 0xc4afe0?)
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/extensions.go:66 +0x22
github.com/99designs/gqlgen/graphql/executor.(*Executor).DispatchOperation.func1.1({0xea4590, 0xc005b53d10})
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/executor/executor.go:118 +0x104
github.com/99designs/gqlgen/graphql/handler/transport.POST.Do({0xea4590?}, {0xea2610?, 0xc005fbf420}, 0xc00870c400, {0xea42b0, 0xc0000b19a0})
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/handler/transport/http_post.go:91 +0x556
github.com/99designs/gqlgen/graphql/handler.(*Server).ServeHTTP(0xc00048f840, {0xea2610?, 0xc005fbf420}, 0xc00870c400)
        /go/pkg/mod/github.com/99designs/gqlgen@v0.17.36/graphql/handler/server.go:121 +0x278
github.com/empiricaly/tajriba/internal/server.graphqlHandler.func7({0xea2610, 0xc005fbf420}, 0xc005b52b70?, {0xc0005319a0?, 0xd4c900?, 0xa1f901?})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/server/graphql.go:230 +0x70
github.com/empiricaly/tajriba/internal/server.Enable.Middleware.func1({0xea2610, 0xc005fbf420}, 0xc00870c200, {0x0, 0x0, 0x0})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/auth/authhttp/middleware.go:37 +0x18e
github.com/empiricaly/tajriba/internal/server.Enable.requestMetadata.func2({0xea2610, 0xc005fbf420}, 0xc00870c100, {0x0, 0x0, 0x0})
        /go/pkg/mod/github.com/empiricaly/tajriba@v1.6.0/internal/server/server.go:147 +0xda
github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc0005238c0, {0xea2610, 0xc005fbf420}, 0xc00870c100)
        /go/pkg/mod/github.com/julienschmidt/httprouter@v1.3.0/router.go:387 +0x7eb
github.com/empiricaly/empirica/internal/server.(*Server).Start.(*Cors).Handler.func4({0xea2610, 0xc005fbf420}, 0xc00870c100)
        /go/pkg/mod/github.com/rs/cors@v1.8.2/cors.go:231 +0x184
net/http.HandlerFunc.ServeHTTP(0x4123a5?, {0xea2610?, 0xc005fbf420?}, 0xc005fbf401?)
        /usr/local/go/src/net/http/server.go:2136 +0x29
net/http.serverHandler.ServeHTTP({0xea01a0?}, {0xea2610?, 0xc005fbf420?}, 0x6?)
        /usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc006f6e990, {0xea4590, 0xc000480ba0})
        /usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 70
        /usr/local/go/src/net/http/server.go:3086 +0x5cb

14:21:51.023 WRN failed load: [Network] fetch failed
hawkrobe@ubuntu-s-2vcpu-4gb-amd-sfo3-01-1705529506558-s-4vcpu-16gb-amd-sf:~/social-media-dynamics/experiments$ 14:21:51.035 WRN failed load: [Network] fetch failed
14:21:51.040 ERR failed to create steps: [Network] fetch failed
14:21:51.044 WRN failed load: [Network] fetch failed
14:21:51.046 ERR failed to create steps: [Network] fetch failed
14:21:51.050 WRN failed load: [Network] fetch failed
14:21:51.053 ERR failed to create steps: [Network] fetch failed
14:21:51.059 ERR failed to create steps: [Network] fetch failed
14:21:51.060 WRN failed load: [Network] fetch failed
14:21:51.061 WRN failed load: [Network] fetch failed
14:21:51.062 WRN failed load: [Network] fetch failed
14:21:51.063 WRN failed load: [Network] fetch failed
14:21:51.063 WRN failed load: [Network] fetch failed
14:21:51.064 WRN failed load: [Network] fetch failed
14:21:51.067 WRN failed load: [Network] fetch failed
14:21:51.067 WRN failed load: [Network] fetch failed
14:21:51.067 WRN failed load: [Network] fetch failed
14:21:51.071 ERR failed to create steps: [Network] fetch failed
14:21:51.071 WRN failed load: [Network] fetch failed
14:21:51.078 WRN failed load: [Network] fetch failed
14:21:51.078 WRN failed load: [Network] fetch failed
14:21:51.080 ERR failed to create steps: [Network] fetch failed
14:21:51.081 WRN failed load: [Network] fetch failed
14:21:51.082 WRN failed load: [Network] fetch failed
14:21:51.082 WRN failed load: [Network] fetch failed
14:21:51.083 WRN failed load: [Network] fetch failed
14:21:51.083 WRN failed load: [Network] fetch failed
14:21:51.084 WRN failed load: [Network] fetch failed
14:21:51.085 WRN failed load: [Network] fetch failed
14:21:51.086 ERR failed to create steps: [Network] fetch failed
14:21:51.086 WRN failed load: [Network] fetch failed
14:21:51.086 WRN failed load: [Network] fetch failed
14:21:51.087 WRN failed load: [Network] fetch failed
14:21:51.087 WRN failed load: [Network] fetch failed
14:21:51.088 ERR failed to create steps: [Network] fetch failed
14:21:51.092 WRN failed load: [Network] fetch failed
14:21:51.093 WRN failed load: [Network] fetch failed
14:21:51.093 WRN failed load: [Network] fetch failed
14:21:51.094 WRN failed load: [Network] fetch failed
14:21:51.094 ERR failed to create steps: [Network] fetch failed
14:21:51.095 WRN failed load: [Network] fetch failed
14:21:51.095 WRN failed load: [Network] fetch failed
14:21:51.096 WRN failed load: [Network] fetch failed
14:21:51.096 WRN failed load: [Network] fetch failed
14:21:51.096 WRN failed load: [Network] fetch failed
14:21:51.116 WRN failed load: [Network] fetch failed
14:21:51.118 WRN failed load: [Network] fetch failed
14:21:51.119 WRN failed load: [Network] fetch failed
14:21:51.120 WRN failed load: [Network] fetch failed
14:21:51.121 WRN failed load: [Network] fetch failed
14:21:51.122 WRN failed load: [Network] fetch failed
14:21:51.123 WRN failed load: [Network] fetch failed
14:21:51.124 WRN failed load: [Network] fetch failed
14:21:51.124 WRN failed load: [Network] fetch failed
14:21:51.125 WRN failed load: [Network] fetch failed

      _eventsCount: 4,
      _maxListeners: undefined,
      _binaryType: 'nodebuffer',
      _closeCode: 1006,
      _closeFrameReceived: false,
      _closeFrameSent: false,
      _closeMessage: Buffer(0) [Uint8Array] [],
      _closeTimer: null,
      _extensions: {},
      _paused: false,
      _protocol: '',
      _readyState: 2,
      _receiver: null,
      _sender: null,
      _socket: null,
      _bufferedAmount: 0,
      _isServer: false,
      _redirects: 0,
      _autoPong: true,
      _url: 'ws://localhost:3000/query',
      _req: null,
      [Symbol(kCapture)]: false
    },
    [Symbol(kType)]: 'error',
    [Symbol(kError)]: AggregateError
        at __node_internal_ (node:internal/errors:174:15)
        at internalConnectMultiple (node:net:1114:18)
        at afterConnectMultiple (node:net:1667:5) {
      code: 'ECONNREFUSED',
      [errors]: [
        Error: connect ECONNREFUSED ::1:3000
            at __node_internal_captureLargerStackTrace (node:internal/errors:563:5)
            at __node_internal_exceptionWithHostPort (node:internal/errors:738:12)
            at createConnectionError (node:net:1634:14)
            at afterConnectMultiple (node:net:1664:40) {
          errno: -111,
          code: 'ECONNREFUSED',
          syscall: 'connect',
          address: '::1',
          port: 3000
        },
        Error: connect ECONNREFUSED 127.0.0.1:3000
            at __node_internal_captureLargerStackTrace (node:internal/errors:563:5)
            at __node_internal_exceptionWithHostPort (node:internal/errors:738:12)
            at createConnectionError (node:net:1634:14)
            at afterConnectMultiple (node:net:1664:40) {
          errno: -111,
          code: 'ECONNREFUSED',
          syscall: 'connect',
          address: '127.0.0.1',
          port: 3000
        }
      ]
    },
    [Symbol(kMessage)]: ''
  }
}

Node.js v20.10.0
14:21:53.682 FTL empirica: failed to start error="npm run --silent serve -- --token LLMuqqxhpkLDnnau --url http://localhost:3000/query --sessionTokenPath /home/hawkrobe/social-media-dynamics/experiments/.empirica/local/callBackSessionToken: exit status 1"

Anything else?

No response

Code of Conduct

npaton commented 6 months ago

This should be fixed in the latest release (1.9.4).