rexagod / newman

A discord bot, dedicated to Seinfeld's Wayne Knight (AKA "Newman").
GNU Affero General Public License v3.0

More features to logging deleted messages #2

Open PrimalPimmy opened 2 years ago

PrimalPimmy commented 2 years ago

Currently the bot shows every last delete message event when nsdel is used. However this can get cluttered really quickly as more deleted things pile up. I'd like to propose two features:

  1. Take an additional input for the command like nsdel 5 , which then reveals the last 5 deleted messages in the server.
  2. Along with the first point, a feature to show a specific user's deleted messages would be great as well, like nsdel @Pimmy. This would also take an optional argument to cap the number of deleted messages shown, as in the first feature, e.g. nsdel @Pimmy 10.

exitflynn commented 2 years ago

I'd love to take this issue up! (̿▀̿ ̿Ĺ̯̿̿▀̿ ̿)̄

rexagod commented 2 years ago

Awesome, thank you!

exitflynn commented 2 years ago

hey i might not be able to work on the issue in the foreseeable future so unassigning myself in case anyone else would like to pick it up

rexagod commented 1 year ago

> However this can get cluttered really quickly as more deleted things pile up.

The bot only stores the last deleted message and outputs that, not sure what you mean by "cluttered" here?

rexagod commented 1 year ago

Adding user or last N filters will require us to keep at least len(users)*N messages in the database at all times, which can landslide into higher bills in the event of a raid, etc., and is just an unbounded storage idea in general.

PrimalPimmy commented 1 year ago

> Adding user or last N filters will require us to keep at least len(users)*N messages in the database at all times, which can landslide into higher bills in the event of a raid, etc., and is just an unbounded storage idea in general.

Hence we keep two options now. One is nsnipe, which snipes one message, like normal. The other would be something like nsdel (number). This number ranges from 1 to 10 and no more; it does not make sense to keep more than 10, unless you truly want to be a menace in the server. 🗿
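To make the proposed syntax concrete, here is a minimal sketch of how the arguments of nsdel could be parsed. It is not the bot's actual code: `SnipeArgs`, `parseSnipeArgs` and `maxSnipes` are hypothetical names, the default count of 1 is an assumption, and the sketch assumes the bot receives user mentions in Discord's raw `<@ID>` / `<@!ID>` form.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxSnipes is the upper bound suggested in the thread (1 to 10).
const maxSnipes = 10

// SnipeArgs holds the parsed arguments of a hypothetical "nsdel" invocation.
type SnipeArgs struct {
	UserID string // empty when no user mention was given
	Count  int    // always within [1, maxSnipes]
}

// parseSnipeArgs parses everything after the command name, e.g. "<@123> 5".
// Both arguments are optional; the count defaults to 1 (an assumption).
func parseSnipeArgs(input string) (SnipeArgs, error) {
	args := SnipeArgs{Count: 1}
	for _, field := range strings.Fields(input) {
		if strings.HasPrefix(field, "<@") && strings.HasSuffix(field, ">") {
			// Raw mentions look like <@ID> or <@!ID>; strip the wrapper.
			id := strings.TrimPrefix(field[2:len(field)-1], "!")
			if _, err := strconv.ParseUint(id, 10, 64); err != nil {
				return SnipeArgs{}, fmt.Errorf("invalid user mention %q", field)
			}
			args.UserID = id
			continue
		}
		n, err := strconv.Atoi(field)
		if err != nil {
			return SnipeArgs{}, fmt.Errorf("unexpected argument %q", field)
		}
		// Clamp the requested count into the allowed range.
		if n < 1 {
			n = 1
		}
		if n > maxSnipes {
			n = maxSnipes
		}
		args.Count = n
	}
	return args, nil
}

func main() {
	for _, in := range []string{"", "5", "<@123>", "<@!123> 50"} {
		args, err := parseSnipeArgs(in)
		fmt.Printf("%q -> %+v, err=%v\n", in, args, err)
	}
}
```

Clamping instead of rejecting an out-of-range count is a design choice; returning an error for values above 10 would be just as reasonable.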

Basically, we watch the delete event. When a delete occurs, read that message and store it in an array. Keep at most 10 and discard the oldest delete once the array is full.

Now, if this bot runs in one server, we can keep this without touching the database (store it in slices). However not sure if keeping the last 10 deleted messages in EACH channel of a server will have memory implications, so using sql might be better? Would also help if the bot is in multiple servers.

There is a reason why sniping one message is more of a common feature than sniping 10 deleted messages.

rexagod commented 1 year ago

> There is a reason why sniping one message is more of a common feature than sniping 10 deleted messages.

Exactly. 10 messages * 1000+ members * 1000 bytes (emojis take several bytes each) is a storage concern. We watch the delete events currently, but slices and arrays are primitive Go types that are best kept as short as possible. This is rooted in the fact that a slice (whose underlying data structure is an array) that grows past its capacity requires allocating a new, larger backing array, e.g. going from cap 2^N to cap 2^(N+1), and copying the existing elements over. This essentially means a database is the go-to choice for unbounded storage cases such as this, no matter the quantity.
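The reallocation behaviour mentioned here can be observed with a small experiment. This is only an illustrative sketch: the helper names are made up, and the exact growth factor is an implementation detail of the Go runtime that differs between versions, so the code counts capacity changes rather than assuming strict doubling.

```go
package main

import "fmt"

// countReallocations appends n elements to an empty slice and counts how
// often the capacity changes, i.e. how often a new backing array is allocated.
func countReallocations(n int) int {
	var s []int
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

// countReallocationsPrealloc does the same, but reserves capacity up front,
// so append never has to grow the backing array.
func countReallocationsPrealloc(n int) int {
	s := make([]int, 0, n)
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	fmt.Println("growing slice, reallocations:", countReallocations(1000))
	fmt.Println("preallocated slice, reallocations:", countReallocationsPrealloc(1000))
}
```

With a known, small cap (such as 10 messages) the preallocated case applies, so the reallocation cost is mainly a concern when the number of stored messages is unbounded.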

rexagod commented 1 year ago

(which is why a map-based linked list would undoubtedly be the better in-memory solution here, but that's just a start. Databases are essentially structures that use the most efficient such method for the respective use case, which is why we delegate these tasks to them instead of doing that ourselves, while still being cautious about storage limits, since these are unbounded, after all.)

PrimalPimmy commented 1 year ago

Yeah, let's go with the database on this one then.

rexagod commented 1 year ago

Correction: s/1000+ members/10+ active channels at any given time/, as pointed out by @PrimalPimmy and @deadaf, which makes sense. Reopening.

deadaf commented 1 year ago

in the rewrite, I am not using arrays; I am using a database to store all this, so I don't think 1,000 or even 10,000+ members will have any significant effect.

CREATE TABLE snipes (
    message_id BIGINT PRIMARY KEY,
    channel_id BIGINT NOT NULL,
    author_id BIGINT NOT NULL,
    content JSONB DEFAULT '{}'::jsonb,
    deleted_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

and as @PrimalPimmy suggested, an option to retrieve deleted messages of a particular user makes sense too and is possible.
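As a rough illustration of how the per-channel and per-user lookups could be expressed against the snipes table above, the sketch below builds a parameterised query. The function name `buildSnipeQuery`, the convention that an author ID of 0 means "any author", and the cap of 10 are assumptions for this example; the `$n` placeholders assume PostgreSQL, which the JSONB column suggests.

```go
package main

import (
	"fmt"
	"strings"
)

// buildSnipeQuery returns a parameterised query against the snipes table.
// authorID == 0 means "any author" (a convention chosen for this sketch).
// The limit is clamped to the 1..10 range discussed in the thread.
func buildSnipeQuery(channelID, authorID int64, limit int) (string, []any) {
	if limit < 1 {
		limit = 1
	}
	if limit > 10 {
		limit = 10
	}
	conds := []string{"channel_id = $1"}
	args := []any{channelID}
	if authorID != 0 {
		args = append(args, authorID)
		conds = append(conds, fmt.Sprintf("author_id = $%d", len(args)))
	}
	args = append(args, limit)
	query := fmt.Sprintf(
		"SELECT message_id, author_id, content, deleted_at FROM snipes WHERE %s ORDER BY deleted_at DESC LIMIT $%d",
		strings.Join(conds, " AND "), len(args))
	return query, args
}

func main() {
	q, args := buildSnipeQuery(111, 222, 5)
	fmt.Println(q)
	fmt.Println(args...)
}
```

The returned pair can be passed to `db.Query(query, args...)` from `database/sql` with a PostgreSQL driver. Filtering by author_id on a large table would likely benefit from an index on (channel_id, deleted_at), though that is not part of the schema shown.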

rexagod commented 1 year ago

10,000+ members * 10 messages * 50 forums (threads and channels) * 1000 bytes is 5 GB. I saw someone in the server asking if we can go for 100 messages, i.e., 50 GB.

I'm fine with a million messages as long as I'm not the one paying.

deadaf commented 1 year ago

we don't have to store them forever; we can drop rows older than a set number of days, or maybe limit the number of messages per channel. let me share some stats: as you already know, Quotient has a similar feature, the difference being that we don't keep rows older than 7 days and there's only one row per channel (we don't store multiple messages).

so here, last 7 days: [image]

avg row size: [image]

at OSDC we have 62 text channels & 1220 members as of now & not all of them are active. even if we consider future growth GBs of data still doesn't seem possible.

also, snipe isn't supposed to be a history / log of all deleted messages, for that we can just log them in a channel if we want to. snipe in most cases is just used instantly after a msg is deleted.

rexagod commented 1 year ago

> [...] future growth GBs of data still doesn't seem possible.

I'm talking about the upper bound, which is not probable, but absolutely possible with the initially considered figures as mentioned in https://github.com/rexagod/newman/issues/2#issuecomment-1696211611. With the newer figures, it'd come down to ~7.5 GB at any given point in time (assuming we rotate the SQL-based database at that storage cap) if there are 62 forums * 1.2k members * 1k bytes * 100 messages.
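The upper-bound figures quoted in this thread can be reproduced with a few multiplications. The sketch below simply restates that arithmetic; `upperBoundBytes` is a hypothetical helper, and 1 GB is taken as 10^9 bytes.

```go
package main

import "fmt"

// upperBoundBytes multiplies the factors used in the thread's worst-case
// estimates: every member deletes the maximum number of messages in every
// forum, and every message has the assumed size.
func upperBoundBytes(members, messagesPerMember, forums, bytesPerMessage int64) int64 {
	return members * messagesPerMember * forums * bytesPerMessage
}

func main() {
	const gb = 1e9 // decimal gigabytes

	initial := upperBoundBytes(10_000, 10, 50, 1000)  // initial figures
	hundred := upperBoundBytes(10_000, 100, 50, 1000) // same, with 100 messages
	newer := upperBoundBytes(1_200, 100, 62, 1000)    // 62 forums, 1.2k members

	fmt.Printf("initial figures:   %.2f GB\n", float64(initial)/gb) // 5.00 GB
	fmt.Printf("with 100 messages: %.2f GB\n", float64(hundred)/gb) // 50.00 GB
	fmt.Printf("newer figures:     %.2f GB\n", float64(newer)/gb)   // 7.44 GB
}
```

The last figure, 7.44 GB, is the "~7.5 GB" mentioned above. These are worst-case bounds under the stated assumptions, not measurements.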

rexagod commented 1 year ago

The week-long rotation (in addition to the size cap) seems the direction we'd want to go in.

> [...] snipe in most cases is just used instantly after a msg is deleted.

Right, perhaps we could get away with storing at most 5 messages per user, or 10-ish messages per forum, because anything over that sounds kinda verbose.