UQComputingSociety / uqcsbot-discord

:mortar_board: UQCSbot: Our friendly little Discord bot
https://discord.uqcs.org
MIT License
20 stars 20 forks

Add a heuristic to the working-on ping #179

Open TheOnlyMrCat opened 1 year ago

TheOnlyMrCat commented 1 year ago

What is the status quo?

Currently, the bot pings two random server members at 5PM asking them what they're working on in an attempt to foster discussion and increase the activity of members.

In practice, this has seen mixed results: most of the time, the pings go completely unnoticed and uncared for. Sometimes they land on an active server member who excitedly takes the opportunity to share one of their current projects, and at other times they land on a non-active member who brings a novel and interesting project to the table.

Unfortunately, there's no algorithm we can write to tell if any non-active members have interesting projects they're likely to share, but we might be able to increase the response rate somewhat with heuristics.

What do we want out of a heuristic?

Some heuristics that have come up in discussion on Discord:

Prior discussion:

JamesDearlove commented 1 year ago

I had a brief discussion last year in the Discord about this as well, link to UQCS Discord message

bradleysigma commented 1 year ago

So I've been thinking about how to go about this, specifically for how to keep track of who's been recently active, without putting a whole lot of extra stress on the bot. Here's what I've come up with:

andrewj-brown commented 1 year ago

A three-set system seems vastly more complicated than just storing (member, last_seen) and filtering for members whose last_seen falls between three months and one month ago. If you are attached to the heuristic requiring multiple days, you could store (member, last_seen, days_posted), and maintain it by only incrementing days_posted if the last_seen day is earlier than the current day. Are there other advantages to the three-set system that I haven't thought of?
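A minimal sketch of the (member, last_seen, days_posted) idea, to make the update rule concrete. The names `Activity` and `record_message` and the use of a plain dict in place of the bot's database are assumptions for illustration, not anything from the uqcsbot codebase.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Activity:
    """Hypothetical per-member record: last_seen plus a count of distinct days."""
    last_seen: date
    days_posted: int = 1


def record_message(table: dict[int, Activity], member_id: int, today: date) -> bool:
    """Update the record for one message; return True only if something changed."""
    entry = table.get(member_id)
    if entry is None:
        # First message ever seen from this member.
        table[member_id] = Activity(last_seen=today)
        return True
    if entry.last_seen < today:
        # First message on a new day: move last_seen and bump the counter once.
        entry.last_seen = today
        entry.days_posted += 1
        return True
    # Same day as the stored last_seen: nothing to update.
    return False
```

Because the function reports whether anything changed, a caller could skip the write entirely for every message after a member's first of the day.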

Additionally, I'm unsure if 3-to-1-month-activity is the best idea for a heuristic anyway. Someone who's been inactive for a month is likely to either 1. already not see the ping or 2. be specifically busy with Life:tm: and therefore not want to be pinged.

I think the 2nd discussed heuristic (pick someone active in the last week who hasn't been pinged for at least a month) would work better, but I'm open to arguments for and against.
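As a rough sketch of that second heuristic (active in the last week, not pinged for at least a month), assuming two hypothetical mappings from member ID to date. The function name, the 7-day and 30-day cut-offs, and the default of two picks (matching the current two pings) are illustrative choices only.

```python
import random
from datetime import date, timedelta
from typing import Optional


def pick_candidates(
    last_seen: dict[int, date],
    last_pinged: dict[int, date],
    today: date,
    count: int = 2,
    rng: Optional[random.Random] = None,
) -> list[int]:
    """Pick up to `count` members seen within a week and not pinged for 30 days."""
    if rng is None:
        rng = random.Random()
    active_since = today - timedelta(days=7)
    pinged_before = today - timedelta(days=30)
    eligible = [
        member
        for member, seen in last_seen.items()
        if seen >= active_since
        # Members who were never pinged count as pinged infinitely long ago.
        and last_pinged.get(member, date.min) <= pinged_before
    ]
    # Fewer eligible members than requested: return all of them.
    return rng.sample(eligible, min(count, len(eligible)))
```

If nobody is eligible on a given day the result is an empty list, so the caller would need to decide whether to skip the ping or fall back to a fully random pick.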

bradleysigma commented 1 year ago

The main reason I went with this method was to avoid the bot writing to the database for every single message sent, even if the same person sent a message mere minutes ago. I assume that would be computationally expensive. How expensive would it actually be?

JamesDearlove commented 1 year ago

Probably more expensive than our db can handle long-term, scaling-wise. The potential crashes could also be horrific if the db connection locks up.

So realistically these would have to be implemented with in-memory sets that are periodically flushed to the db (probably every hour), and also on shutdown for when the bot needs to restart.
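One way the in-memory approach could look, as a sketch only: `ActivityBuffer` and its `write_batch` callback are hypothetical names, and the callback stands in for whatever the bot's real database layer would do with a batch of (member, day) pairs.

```python
from datetime import date
from typing import Callable


class ActivityBuffer:
    """Collects (member_id, day) pairs in memory and writes them out in batches."""

    def __init__(self, write_batch: Callable[[list[tuple[int, date]]], None]):
        self._write_batch = write_batch  # hypothetical db writer, one call per flush
        self._pending: set[tuple[int, date]] = set()
        self._flushed: set[tuple[int, date]] = set()

    def record(self, member_id: int, day: date) -> None:
        """Called per message; touches memory only, never the database."""
        key = (member_id, day)
        if key not in self._flushed:
            self._pending.add(key)

    def flush(self) -> int:
        """Write pending pairs in one batch; return how many were written."""
        if not self._pending:
            return 0
        batch = sorted(self._pending)
        self._write_batch(batch)  # if this raises, pending entries are kept
        latest = max(day for _, day in batch)
        # Remember only the newest day's pairs so the set does not grow forever.
        self._flushed = {k for k in self._flushed | set(batch) if k[1] == latest}
        self._pending.clear()
        return len(batch)
```

In a discord.py bot, `flush` could plausibly be driven by an hourly `discord.ext.tasks.loop` task and called once more from the cog's unload hook, though how that fits the existing cog structure here is an open question.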

andrewj-brown commented 1 year ago

I was thinking you'd only store last_seen with a granularity of one day, because that's the only relevant timescale for the rest of the code. You'd only be reading per-message, not writing, although I don't have access to server insights to know how many reads that would actually end up being.