R2Northstar / NorthstarMasterServer

Master server for Northstar
MIT License

Investigate frequent MasterServer authentication outages #76

Closed GeckoEidechse closed 1 year ago

GeckoEidechse commented 2 years ago

For some reason, the master server has recently started requiring manual restarts every 2-3 days because it starts throwing auth errors at clients. The cause is currently unknown and needs investigating.

ASpoonPlaysGames commented 2 years ago

The error clients get is "Invalid or expired masterserver token" (INVALID_MASTERSERVER_TOKEN).

ASpoonPlaysGames commented 2 years ago

More details on the errors that we get:

[error] Northstar origin authentication failed
[error] {"success":false,"error":{"enum":"STRYDER_RESPONSE","msg":"Couldn't parse stryder response"}}
[error] Failed reading masterserver response: got fastify error response
[error] {"success":false,"error":{"enum":"INVALID_MASTERSERVER_TOKEN","msg":"Invalid or expired masterserver token"}}

ASpoonPlaysGames commented 2 years ago

The error seems to be happening in one of two places: https://github.com/R2Northstar/NorthstarMasterServer/blob/main/client/clientauth.js#L64 or https://github.com/R2Northstar/NorthstarMasterServer/blob/main/client/clientauth.js#L74

So we are either failing to parse the JSON that we get back from stryder (unlikely imo) or we are getting an error back from stryder. Potentially a ratelimit of some kind?
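To tell those two cases apart in the logs, something like the sketch below might help. It is only an illustration: `classifyAuthText` and the `kind` labels are made-up names, not anything in clientauth.js, and it assumes the stryder response body has already been turned into a string.

```javascript
// Hypothetical helper: sorts a response body into "empty", "not_json" or "json"
// so the log can show which failure mode actually occurred.
function classifyAuthText(text) {
  if (text.length === 0) {
    return { kind: "empty" };
  }
  try {
    return { kind: "json", value: JSON.parse(text) };
  } catch (err) {
    // Keep a short preview so a rate-limit or error page would be visible in the log.
    return { kind: "not_json", preview: text.slice(0, 200) };
  }
}
```

Logging the `preview` for the `not_json` case would show whether stryder is sending back something other than JSON, which is what a rate limit page would probably look like.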

uniboi commented 2 years ago

Just don't confirm ownership with stryder :trollface:

wolf109909 commented 2 years ago

Could be. I also suspect a stryder rate limit. It is most likely to happen right after I push an update to northstarCN, when around 100 players try to authenticate with the master server within a short period of time.

GeckoEidechse commented 2 years ago

This would fit with it happening less often at the moment, while our playerbase is on the smaller end.

(Side note: it's shrinking at roughly the same rate as the vanilla Steam playerbase after the recent summer sale, so not really a reason to worry.)

ASpoonPlaysGames commented 2 years ago

Since my PR has been merged, the error I am getting has changed from

[error] Northstar origin authentication failed
[error] {"success":false,"error":{"enum":"STRYDER_RESPONSE","msg":"Couldn't parse stryder response"}}

to

[error] Failed reading origin auth info response: malformed response object {"statusCode":500,"error":"Internal Server Error","message":"Cannot read properties of undefined (reading 'toString')"}

The only uses of toString that were added mean that the error is one of the two options:

  1. no response from stryder, making authResponse undefined, which then makes the returning of the authResponse throw an exception
  2. an exception being thrown in the asyncHttp request, also causing authResponse to be undefined, with the same result
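A rough sketch of the kind of check that would separate these two options is below. It is an assumption about how the guard could look, not the actual code in the repo: `fetchAuthBody`, `requestFn` and the `reason` strings are hypothetical names, and `requestFn` stands in for whatever performs the stryder request.

```javascript
// Hypothetical wrapper: reports *why* there is no usable authResponse
// instead of letting authResponse.toString() throw on undefined.
async function fetchAuthBody(requestFn) {
  let authResponse;
  try {
    authResponse = await requestFn();
  } catch (err) {
    // Option 2: the request itself threw.
    return { ok: false, reason: "request_threw", detail: String(err) };
  }
  if (authResponse === undefined || authResponse === null) {
    // Option 1: the request resolved, but with nothing.
    return { ok: false, reason: "no_response" };
  }
  return { ok: true, body: authResponse.toString() };
}
```

With this shape the caller can log `reason` and return a normal error object rather than a 500.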

I'm going to make a PR to add some more checks, so that I can narrow this issue down further.

Erlite commented 2 years ago

If it's any help, I also got a 500 by passing in a + through the Origin token, while trying to see if I could inject other parameters to trick Stryder into validating my token & UID while passing a different UID to the master server.
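One possible explanation, and this is only a guess, is that the token ends up in a query string without being encoded. A literal + in a query string is decoded as a space by form-style parsers. The sketch below shows that behaviour with the standard `URLSearchParams` API; `buildAuthQuery` and the `token` / `uid` parameter names are hypothetical and not taken from the master server or stryder.

```javascript
// Hypothetical helper: builds a query string with every value percent-encoded.
function buildAuthQuery(token, uid) {
  return new URLSearchParams({ token, uid }).toString();
}

// An unencoded "+" is read back as a space when the query string is parsed.
const decodedRaw = new URLSearchParams("token=a+b").get("token");

// An encoded value round-trips unchanged.
const encoded = buildAuthQuery("a+b", "123");
const decodedEncoded = new URLSearchParams(encoded).get("token");
```

If something like this is going on, encoding the token before forwarding it would also make it harder to smuggle extra parameters through it.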

pg9182 commented 1 year ago

No issues for the last month after switching to Atlas.