UQComputingSociety / uqcsbot-discord

:mortar_board: UQCSbot: Our friendly little Discord bot
https://discord.uqcs.org
MIT License

Generic input sanitisation functions #120

Open jenseni-git opened 1 year ago

jenseni-git commented 1 year ago

As commented in PR #117, it would be nice to have some generic input sanitisation functions that are centrally managed and then used bot-wide. This would keep all sanitisation consistent and, if bugs are found, allow them to be patched all at once.

Notes on ideas to include:

I guess a note on all of these: the user obviously put these characters in their message for a reason. Maybe it's to mess with the bot on purpose, or maybe it's to include some text formatting in the output. Either way, each of these functions should, where possible, try to preserve the user's formatting if it doesn't break things. If the characters do break things, attempting to replace them at the end would be good.

For a start though, generic, bot-wide sanitisation would be good.

After these are created, they would obviously also need to be actually implemented bot-wide.

49Indium commented 1 year ago

I'm just going to link #100 here, as it seems to be a subset of this issue. We might need to remove #100.

andrewj-brown commented 11 months ago

This is the sort of trick you could handle with a decorator (and it would be pretty cool to do so). Take a function, find every str argument, then call the original function with sanitise() on all of them.

This wouldn't handle in-depth sanitisation (e.g. when you're pulling message.text from an interaction) but it would make base-level sanitisation very easy. For in-depth, you'd still just have to sanitise() the string.
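A minimal sketch of that decorator idea, for illustration only. The names `sanitise_str_args` and `repeat` are hypothetical, and the `sanitise()` body here is just a placeholder assumption (it breaks up mentions with a zero-width space); the real rules would be whatever this issue settles on. It wraps both plain and `async` functions, since bot command callbacks are usually coroutines.

```python
import functools
import inspect
from typing import Any, Callable


def sanitise(text: str) -> str:
    """Placeholder sanitiser (assumption): insert a zero-width space after
    every '@' so mentions such as @everyone no longer resolve."""
    return text.replace("@", "@\u200b")


def _clean(args: tuple, kwargs: dict) -> tuple[list, dict]:
    """Run sanitise() over every str argument, leaving other types alone."""
    clean_args = [sanitise(a) if isinstance(a, str) else a for a in args]
    clean_kwargs = {
        k: sanitise(v) if isinstance(v, str) else v for k, v in kwargs.items()
    }
    return clean_args, clean_kwargs


def sanitise_str_args(func: Callable[..., Any]) -> Callable[..., Any]:
    """Decorator: call func with every str argument passed through sanitise()."""
    if inspect.iscoroutinefunction(func):

        @functools.wraps(func)
        async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
            clean_args, clean_kwargs = _clean(args, kwargs)
            return await func(*clean_args, **clean_kwargs)

        return async_wrapper

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        clean_args, clean_kwargs = _clean(args, kwargs)
        return func(*clean_args, **clean_kwargs)

    return wrapper


# Hypothetical usage: only the str argument is touched, the int is passed through.
@sanitise_str_args
def repeat(text: str, times: int = 1) -> str:
    return " ".join([text] * times)
```

As noted above, this only covers strings that arrive as direct arguments. Anything pulled out of an object inside the function body would still need an explicit `sanitise()` call.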

However, that's an implementation detail that I'll leave to whoever picks up this issue. For now, I'm agreeing with Isaac that #100 is a sufficiently narrow subset and closing it.

Quoth Isaac on that PR:

Wow, if I had a nickel for every time I [found a regex for replacing discord emotes] ... I'd have two[three] nickels - which isn't a lot, but it's weird that it happened twice[thrice].

(and also noting that the 3 he identified there are in haiku.py, yelling.py, and cowsay.py)
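For reference, a rough sketch of what a single shared emote helper could look like, so the three cogs could call one function instead of each carrying its own regex. This is an assumption, not a copy of what haiku.py, yelling.py, or cowsay.py actually do: the name `replace_custom_emotes` is hypothetical, and the pattern relies on custom emotes appearing in message content as `<:name:id>` (or `<a:name:id>` when animated).

```python
import re

# Custom emotes show up in raw message content as <:name:id>,
# and animated ones as <a:name:id>.
CUSTOM_EMOTE = re.compile(r"<a?:(\w+):\d+>")


def replace_custom_emotes(text: str, replacement: str = r":\1:") -> str:
    """Replace each custom emote with `replacement`.

    The default keeps the readable :name: form; pass "" to drop emotes entirely.
    """
    return CUSTOM_EMOTE.sub(replacement, text)
```

Keeping the `:name:` form by default is one way to preserve what the user typed while still removing the raw ID markup.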