KilledMufasa / AmputatorBot

Remove AMP from your URLs. AmputatorBot is a highly specialised Reddit and Twitter bot that automatically replies to comments, submissions and tweets containing AMP URLs with the canonical link(s). It's also available as a website and REST API. See also: https://www.reddit.com/r/AmputatorBot/comments/ehrq3z/why_did_i_build_amputatorbot/.
https://www.amputatorbot.com/
GNU General Public License v3.0
168 stars 11 forks source link

Some suggestions #2

Closed justinrubek closed 5 years ago

justinrubek commented 5 years ago

Hello! I want to preface this by thanking you for making this bot. I appreciate this kind of initiative. That being said, I read that you were new to Python when making this, so I looked over it and have a few ideas for improvements. I could go through and make some changes, but I figured you may prefer to architect this yourself.

Line 30 of submissions_boy.py and line 36 of commentbot.py both contain a similar line: `for in r.subreddit('amputatorbot+audio+chrome+degoogle+europe+google+firefox+gaming+history+programming+robotics+security+seo+tech+technology+test+todayilearned+worldnews')`

This is a really long, and I might say unruly, line. The contents (the subreddit names) happen to be the same. I'm also assuming that they will remain to stay the same in the future. This means that to update the list somebody will have to make sure to change it in both spots. Instead perhaps you could do something like the following (this probably go in a separate file):

def allowed_subreddits():
    return ["amputatorbot", "audio", "chrome", "degoogle", ... ]

def format_allowed_subreddits()"
   return "+".join(allowed_subreddits())

Doing this could help you print the list of subreddits beforehand with something like print("Obtaining the stream of subreddits", ", ".join(allowed_subreddits())")

This could also support other options for specifying the subreddits in the future. You could change allowed_subreddits to read from a file instead if you chose.

There are plenty of spots that you can do similar tricks to make the code easier to read and improve in the future. Another good example could be a is_amp_url function to avoid doing if "/amp" in submission.url or ".amp" in submission.url or "amp." in submission.url or "?amp" in submission.url or "amp?" in submission.url or "=amp" in submission.url or "amp=" in submission.url and "https://" in submission.url: when needed (although this may not be the best way to name the function as in another spot you're looking through an entire comment. Perhaps contains_amp_url would be better?). In general it is considered a good idea to not repeat yourself. That's one reason why laziness is one of the three great virtues of a programmer.

I'm happy to help make some changes if you'd like. Thanks again for putting your into developing this.

justinrubek commented 5 years ago

I've looked a bit more and here are some other things worth looking into:

KilledMufasa commented 5 years ago

This is an incredibly well written and detailed issue you put out there and I really appreciate you taking the time to ELI5. Thanks a lot!

As to your suggestions:

This is a really long, and I might say unruly, line. / You could change allowed_subreddits to read from a file instead if you chose

I agree. I'm implemented a change to load the subreddits from another file. That will save me some work in the future, so that's awesome.

Doing this could help you print the list of subreddits I've done this and although the general user won't notice, it looks real fancy in the console.

There are plenty of spots that you can do similar tricks to make the code easier to read and improve in the future. (...) Perhaps contains_amp_url would be better?

Again, an excellent point! I've added the contains_amp_url function and the long list of requirements is no longer repeated. My current implementation can use some additional fixes, I don't really like the variable names I've used but that's something for another time.

You're using the logging module and you've also got some debug print statements commented out. It may be worth changing those over to logging.debug so that you can leave them in and switch the logging level when it's time to debug.

This makes total sense as well, but I've decided to wait a bit until I make this change. As I'm still quite new to Python and hell even the console, I don't want to change too much at once. I will monitor the new release for a few days and then fix the printing and logging. That probably sounds silly, but I don't want to make this harder than it could be.

You initialize some lists to be ["empty"]. Is there a reason you want that string in them? I don't see anywhere it needs it, so maybe you could start it off as just an empty list ([])

I originally added this for debugging purposes. If no file was found, it would print empty instead of just an empty line. However, this is a bit overkill as there is already a notice that the array is empty. So this is gone now.

Thanks again for putting your into developing this.

No problem, I like doing it. Almost all of your suggestions have been implemented in V1.3.3 / Commit 91af5ee So THANK YOU! Your issue has helped me a lot.