tzhf / chatguessr

A Twitch chatbot for GeoGuessr.
https://chatguessr.com
MIT License

Security issue: OAuth token sent to server #28

Closed (pengi closed this issue 2 years ago)

pengi commented 2 years ago

When a map is complete, the result is uploaded to the server to provide a list of results. The request contains:

But also:

That means the bot's OAuth token, which for security reasons should be handled like a password, is sent to the server, so the server has access to several functions as the bot user. To protect the streamer, the token should never be sent to a third party (such as the chatguessr servers); it should be kept within the client.

This was added in commit f20facc3057da2604e5b83dea6ed7c19ef4a1f7c

From what I can tell, the reason for this change is to authenticate the data for the result listing. I see two problems with that:

  1. The result listing is a single request whose response contains the game code, which the bot then posts to chat. To forge a bad listing for the chat, the bot account would therefore also need to be compromised in order to post the code in the chat itself. Otherwise, it is just a random link.
  2. The bot name and OAuth token are not shown in the status listing itself, so faking all information except the bot account still produces a status listing that looks valid, even with fake data.

A proof-of-concept of a fake status listing (from a valid bot OAuth token) can be found here:

https://chatguessr.com/game/ovapudocep

Therefore, I suggest that the OAuth token simply not be sent to the server, since it gives no protection against fake listings, and because of point 1 above, omitting it is fine.

If the goal is to validate results for high score lists or competitions, require the VOD from the stream with chat replay, or let the people managing the competition join the chat in advance. That way, the game code would be authenticated through chat.

ReAnnannanna commented 2 years ago

Thanks for writing this up! The reason this was done for the games list endpoint was not to ensure the data is real, but because the server was being spammed and running into database limits. We needed a quick fix that would make this harder, or at least allow us to ban bad actors easily. A longer-term fix is to implement proper rate limiting, after which there will be no reason to send the token anymore.

There is a second place where we send the token, when authenticating the streamer to the socket server. When we address this issue, the games list will be addressed "for free", so this is where the focus is.

I think this comes down to overreaching scopes. So far we have relied on the twitchapps.com service, which is actually intended for IRC users. The tokens generated by twitchapps can read chat and whispers, send chat and whispers, but also moderate the chat and edit the stream title. Those latter things are unnecessary for CG. This is also the extent of the impact: there is no possibility of a full account takeover. Obviously it is still a lot of access and a serious issue if it does fall into the wrong hands (especially for people who use their main account).

Our intended approach ATM is to build Twitch login into CG so that we can ask for fewer permissions, and then also downgrade tokens on the client side before sending them to the server. The tokens will then have no scopes, and it will be impossible for the server to use them for anything other than verifying identity. The guess server in particular must have some Twitch authentication, which inherently requires some OAuth token. However, the servers need no permissions, so the token we send should have no permissions, making it impossible for either us or a hacker to use it in unintended ways.
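As a rough illustration of the "identity only" idea, here is a minimal sketch of a server-side check. It assumes the server has already called Twitch's token validation endpoint (`GET https://id.twitch.tv/oauth2/validate`) and parsed the JSON body. The function name `checkIdentityOnlyToken`, the `IdentityCheck` type, and the policy of rejecting any token that carries scopes are hypothetical and not taken from the chatguessr code.

```typescript
// Fields of the Twitch validate response used by this sketch.
interface TwitchValidateResponse {
  client_id: string;
  login: string;
  user_id: string;
  scopes: string[] | null;
  expires_in: number;
}

// Hypothetical result type for this example.
type IdentityCheck =
  | { ok: true; userId: string; login: string }
  | { ok: false; reason: string };

// Hypothetical policy: accept a token only as proof of identity.
// Tokens issued for another client, or tokens that still carry
// scopes, are rejected so the server never holds usable permissions.
function checkIdentityOnlyToken(
  validated: TwitchValidateResponse,
  expectedClientId: string
): IdentityCheck {
  if (validated.client_id !== expectedClientId) {
    return { ok: false, reason: "token was issued for a different client" };
  }
  // Treat a missing scope list the same as an empty one (assumption).
  const scopes = validated.scopes ?? [];
  if (scopes.length > 0) {
    return { ok: false, reason: "token carries scopes: " + scopes.join(", ") };
  }
  return { ok: true, userId: validated.user_id, login: validated.login };
}
```

With a check like this, a token that still has chat scopes such as `chat:read` or `chat:edit` would be refused, so a streamer who accidentally sends an over-scoped token finds out before the server stores or uses it.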

I'd love to know if you would still have concerns with that :)

pengi commented 2 years ago

Thanks for the good explanation. :)

The database load issue is understandable, unfortunately. And I agree, a rate limit would make sense, for example per source IP. There is no chance that the same IP finishes more than one game every 10 seconds. So a proper rate limit seems better than sending authentication data.
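To make the per-IP idea concrete, here is a minimal in-memory sketch. The class name `PerKeyRateLimiter` and its methods are hypothetical, and it assumes a single server process; a real deployment behind a proxy or with several instances would need a shared store and a trustworthy source for the client IP.

```typescript
// Hypothetical sketch: allows at most one accepted request per key
// (for example a source IP) within a fixed interval.
class PerKeyRateLimiter {
  private lastAccepted = new Map<string, number>();
  private minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true and records the time if the request is allowed,
  // false if this key was accepted less than minIntervalMs ago.
  tryAccept(key: string, nowMs: number): boolean {
    const last = this.lastAccepted.get(key);
    if (last !== undefined && nowMs - last < this.minIntervalMs) {
      return false;
    }
    this.lastAccepted.set(key, nowMs);
    return true;
  }

  // Drops entries old enough that they can no longer block a request,
  // so the map does not grow without bound.
  prune(nowMs: number): void {
    this.lastAccepted.forEach((last, key) => {
      if (nowMs - last >= this.minIntervalMs) {
        this.lastAccepted.delete(key);
      }
    });
  }
}

// One game result per IP every 10 seconds, as suggested above.
const resultUploadLimiter = new PerKeyRateLimiter(10000);
```

The upload handler would call `resultUploadLimiter.tryAccept(ip, Date.now())` and reject the request when it returns false. Passing the timestamp in explicitly keeps the class easy to test.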

I missed the socket server when looking through the source too, and thought you were going to the IRC interface directly. But providing the OAuth token along the authentication path, where authentication is needed, is not that bad. It is no worse than sending it to a web page hosting a cloud-based bot.

After that explanation, my biggest surprise is that chatguessr actually has a server-side application running per game. I thought it was a standalone application running on the user's computer, authenticating only to Twitch (and GeoGuessr), and that it just used the server instead of pastebin for displaying results.

I'm happy with your explanation, and it gives me a better understanding of what happens. I also like the long-term Twitch authentication plan, which also means it would be possible to manage chatguessr separately under Twitch connections.

I think the only thing I'm missing then is something like a tooltip under the "Twitch Connect" tab explaining that the authentication data is sent encrypted via chatguessr's servers. For some (me, for example) that matters when we want to know who has a chance to intercept authentication data, and thus whom to trust.