Floogen / jmod-bloodhound

Hunts down posts with JMOD replies on r/2007scape
MIT License

A few comments #3

Open gurchik opened 5 years ago

gurchik commented 5 years ago

Hello, I have a few suggestions. Rather than create an Issue for each one I've decided to lump them all together in this one.

I'm writing these suggestions because I think the bot is cool and I'd like to help in any way I can. I can even implement some of these suggestions in code and submit a pull request, but I wanted to run them by you first so we can discuss them, and to give you a chance to write the code yourself if you'd prefer.

  1. It would be cool if this bot could be generalized, since I can see it being useful on other subreddits. For example, the bot could be rewritten to load a config file that tells it which subreddits to crawl, which users or flair CSS classes to look for, and some other settings. Then people can adapt this bot for other subreddits without needing to understand or change the code, just the configuration files.
  2. The current design of the bot puts unnecessary strain on the Reddit servers. Don't get me wrong, this bot isn't going to crash the website, but if you've seen as many "You broke Reddit!" error messages as I have over the years, you'd agree that we should minimize unnecessary strain on the servers wherever we can and be responsible developers on this free website. I have a lot of suggestions, but a few of them are:
    1. The bot currently saves all its stateful data to the "archive" subreddit. The bot needs this subreddit to function, and as such it loads all the posts and comments on this subreddit every time it's run. This is pretty inefficient. In a way you're using the archive subreddit as a database for your application, which is a gray area of the terms of service. Some subreddits have been banned for doing this (like /r/A858DE45F56D9BC9, although to be fair you are not blatantly and deliberately using Reddit's servers for personal gain like they were doing). In my opinion a better way to run this bot would be to store the database in a local file and get rid of the archive subreddit. The archive subreddit would be a good idea if we were worried about jmods editing or deleting their posts, but I believe this rarely happens, and even if it were common, your bot currently only archives a few words of a jmod's comment, so it's not effective at creating a true archive. Rest assured this database file should not take up too much space; with some quick back-of-the-envelope math I estimate it will grow at a rate of only a few dozen megabytes per year (you could run it for a century and only use up a gigabyte or two!). And if you're worried about losing the file, or don't want to bother with it when moving the bot to another computer, we can program in a safe mode that searches the bot's previous comments to rebuild the database. Better to do this once in a blue moon than every 5 minutes like the bot currently does.
    2. Every five minutes the bot gets the Hottest 100 posts in the subreddit and searches the comments in those posts for jmod comments. I have a few ideas where we can ignore posts in certain situations to save some effort from the servers.
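
To make the local-file idea in 2.i a bit more concrete, here is a minimal sketch, assuming Python and the standard-library sqlite3 module. The table layout, the file name bloodhound.db, and the helper names are all hypothetical; none of them come from the bot's existing code.

```python
import sqlite3

# Hypothetical schema: one row per JMOD comment the bot has already handled.
SCHEMA = """
CREATE TABLE IF NOT EXISTS seen_comments (
    comment_id TEXT PRIMARY KEY,  -- Reddit comment ID
    post_id    TEXT NOT NULL,     -- ID of the post the comment belongs to
    author     TEXT NOT NULL      -- username of the JMOD who wrote it
)
"""


def open_db(path="bloodhound.db"):
    """Open (or create) the local state file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def mark_seen(conn, comment_id, post_id, author):
    """Record a comment. Returns True if it was new, False if already stored."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_comments VALUES (?, ?, ?)",
            (comment_id, post_id, author),
        )
    return cur.rowcount == 1


def comments_for_post(conn, post_id):
    """List the stored comment IDs for one post, in a stable order."""
    rows = conn.execute(
        "SELECT comment_id FROM seen_comments WHERE post_id = ? ORDER BY comment_id",
        (post_id,),
    )
    return [row[0] for row in rows]
```

With something like this, each run would only need to ask Reddit about comments it has not recorded yet, instead of reloading the whole archive subreddit. Passing ":memory:" as the path gives a throwaway database for testing.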

EmmaLouS commented 5 years ago

Came here to suggest much the same but you've already hit the nail on the head @dere!

Absolutely agree, separating subreddit/flair/username details out to be tweaked by end-users and generalising it for use across other subreddits would be fantastic!

@dere, regarding 2.ii - what ideas do you have for this? I've come up with a few to stop the bot re-doing the same work and to reduce hits to Reddit, but they would require the bot to use a database for state and for logging its own actions (much like you've suggested in another point). I'm undecided at the moment whether it's best to go with the local db file approach (short term perhaps?), or to have something hosted that multiple bots can share state with (I think I may just be getting overexcited here, and this probably isn't a bridge this project will ever need to cross).

gurchik commented 5 years ago

@EmmaLouS regarding 2.ii, there are several things I've noticed.

First, the old school team is much smaller than the rs3 team, and thus does not respond to as many threads on Reddit. They usually only respond to a post when it gets to the front page, or the second page. The current policy of checking all the comments in the top 100 posts on the subreddit (at 20 posts per page, that's five pages) is too much. But on the main runescape subreddit, they respond to many more posts and perhaps 100 is an appropriate number. This should be a configurable option on a per-subreddit basis.
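
A per-subreddit option like this could be sketched roughly as follows, assuming Python and a JSON config file. The key names, the "jagexmod" flair class, and the limit of 40 for 2007scape (two pages of 20) are illustrative assumptions, not values taken from the bot.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SubredditConfig:
    name: str
    post_limit: int            # how many hot posts to scan on each run
    flair_css_classes: tuple   # flair classes that mark a staff account


# Hypothetical config contents; in practice this would live in its own file.
EXAMPLE_CONFIG = """
{
  "subreddits": [
    {"name": "2007scape", "post_limit": 40, "flair_css_classes": ["jagexmod"]},
    {"name": "runescape", "post_limit": 100, "flair_css_classes": ["jagexmod"]}
  ]
}
"""


def load_config(text):
    """Parse the JSON text into a dict of subreddit name -> SubredditConfig."""
    raw = json.loads(text)
    return {
        entry["name"]: SubredditConfig(
            name=entry["name"],
            # Fall back to the current behaviour (100 posts) if unset.
            post_limit=int(entry.get("post_limit", 100)),
            flair_css_classes=tuple(entry.get("flair_css_classes", [])),
        )
        for entry in raw["subreddits"]
    }
```

The bot's main loop could then look up `load_config(...)["2007scape"].post_limit` instead of hard-coding 100, and adapting the bot to another subreddit would only mean adding another entry to the file.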

Secondly, the bot's main purpose should be to reveal a jmod comment that would otherwise be hidden. In other words, the bot should only post a comment when a jmod comment is "hard to notice." The exact criteria for what counts as "hard to notice" should be a configurable option on a per-subreddit basis. I do not have a clear idea of what these criteria should be for either of the two subreddits, but some ideas of when the bot should not create a comment are when:

Floogen commented 5 years ago

@dere @EmmaLouS

Hey, first off, I want to apologize for the delay in communication. I recently graduated and my previous notifications were sent to my now-defunct college email.

I like your idea of converting the script into a more general application, as it is something that has been asked for before. I believe the Apex subreddit used some derivative of the script to get their devs' replies catalogued. It definitely wouldn't be too hard to implement, especially with the idea of a config file of sorts to determine the targeted subreddit, flair, etc.

I am also interested in any ideas you have to reduce the unnecessary strain the bot causes.

As for the topic of a local database, I can see the benefit of it, but I think you misunderstand the point of the archive subreddit. It was mostly made as a place to store all the JMODs' comments in one central location. From what I can remember, the bot doesn't currently use that subreddit to track what posts it has touched (it does refer to it to update the post on the archive subreddit once it finds the match again on /r/2007scape and /r/runescape).

Sorry again for the delay. Next time I'll be quicker to reply, as my notifications are now reconfigured to go to my main email. Thank you both for your suggestions and interest in the bot!