KeithTheEE / redditPythonHelper

A reddit bot to help direct questions to r/learnpython
MIT License
6 stars 2 forks source link

Reddit bot is commenting multiple times #1

Open KeithTheEE opened 4 years ago

KeithTheEE commented 4 years ago

Here and here are two of the instances of multi comments.

Obviously this violates the bots guiding principles pretty pointedly, and needs to be address.

In both cases the logs read like this,


2019-07-25 15:38:20,172 - DEBUG - sessions.py:_log_request():49 - Fetching: POST https://oauth.reddit.co$
2019-07-25 15:38:20,172 - DEBUG - sessions.py:_log_request():50 - Data: [('api_type', 'json'), ('text', $
2019-07-25 15:38:20,173 - DEBUG - sessions.py:_log_request():51 - Params: {'raw_json': 1}
2019-07-25 15:38:20,577 - DEBUG - connectionpool.py:_make_request():393 - https://oauth.reddit.com:443 "$
2019-07-25 15:38:20,586 - DEBUG - sessions.py:_make_request():100 - Response: 500 (50 bytes)
2019-07-25 15:38:20,587 - WARNING - sessions.py:_do_retry():88 - Retrying due to 500 status: POST https:$
2019-07-25 15:38:20,588 - DEBUG - sessions.py:_retry_sleep():58 - Sleeping: 0.16 seconds
2019-07-25 15:38:20,751 - DEBUG - sessions.py:_log_request():49 - Fetching: POST https://oauth.reddit.co$
2019-07-25 15:38:20,752 - DEBUG - sessions.py:_log_request():50 - Data: [('api_type', 'json'), ('text', $
2019-07-25 15:38:20,752 - DEBUG - sessions.py:_log_request():51 - Params: {'raw_json': 1}
2019-07-25 15:38:20,853 - DEBUG - connectionpool.py:_make_request():393 - https://oauth.reddit.com:443 "$
2019-07-25 15:38:20,859 - DEBUG - sessions.py:_make_request():100 - Response: 200 (106 bytes)
2019-07-25 15:38:20,861 - ERROR - archiveAndUpdateReddit.py:commentOnSubmmission():1168 - Caught Server $
2019-07-25 15:38:20,862 - ERROR - archiveAndUpdateReddit.py:commentOnSubmmission():1169 -
Traceback (most recent call last):
  File "/home/pi/Documents/filesForProgramming/Reddit/pythonHelpBot2/utils/archiveAndUpdateReddit.py", l$
    submission.reply(msg)
  File "/usr/local/lib/python2.7/dist-packages/praw/models/reddit/mixins/replyable.py", line 16, in reply
    return self._reddit.post(API_PATH['comment'], data=data)[0]
  File "/usr/local/lib/python2.7/dist-packages/praw/reddit.py", line 446, in post
    return self._objector.objectify(data)
  File "/usr/local/lib/python2.7/dist-packages/praw/objector.py", line 122, in objectify
    raise APIException(*errors[0])
APIException: RATELIMIT: 'you are doing that too much. try again in 5 seconds.' on field 'ratelimit'

Once the error is caught, the bot follows the advice and tries again. However it seems that the first action succeeded already,

2019-07-25 15:39:22,418 - DEBUG - sessions.py:_log_request():49 - Fetching: POST https://oauth.reddit.co$
2019-07-25 15:39:22,419 - DEBUG - sessions.py:_log_request():50 - Data: [('api_type', 'json'), ('text', $
2019-07-25 15:39:22,420 - DEBUG - sessions.py:_log_request():51 - Params: {'raw_json': 1}
2019-07-25 15:39:22,622 - DEBUG - connectionpool.py:_make_request():393 - https://oauth.reddit.com:443 "$
2019-07-25 15:39:22,627 - DEBUG - sessions.py:_make_request():100 - Response: 200 (106 bytes)

Emphasizing sessions.py:_make_request():100 - Response: 200 (106 bytes)

but praw still raises the ratelimit error.

This appears to be an issue with the way praw handles exceptions rather than the bot itself, however a workaround bodge could be to wait a short interval and check if a comment on the post by the bot exists prior to retrying. This might be easiest by checking for the most recent comment by the bot.

It should also be noted that the bots account only shows one successful message regardless of the submission itself showing duplicates. This also translates to my account only showing one username mention even if there's multiple comments by the bot on the same post. I'm not yet sure how this translates to returned comments/most recent comments on an api query.

Also worth noting: the two posts this occurred with were key word response posts, not posts which waited two hours prior to commenting

KeithTheEE commented 4 years ago

2019-08-31 13:45:35,291 - DEBUG - archiveAndUpdateReddit.py:comment_duplication_by_ratelimit_check():1146 - Got Rate Limit Error, Waiting and checking if comment went through

Log makes note of error condition, however, according to this post, it doesn't work

https://www.reddit.com/r/Python/comments/cxuojt/

There's clearly an error with reddit or praw in these cases, where the database is returning an old version of itself, saying it's rejecting all updates, but quietly passing through and updating. This causes the bot to assume, and on its check, verify that nothing happened, and thus the bot tries to update. Evidence of this can be found in the disparity between the displayed number of comments (12) and the number of comments by the bot alone (24).

Given this, the best course of action is either to wait longer and check (maybe design a comment queue+execution time setup) or to do the easier action, and assume it was actually successful and continue on my merry way. For this setup I'm going to change the behavior so that on a 500 while commenting, it assumes a success and moves on.

KeithTheEE commented 4 years ago

While comment_duplication_by_ratelimit_check(reddit, submission) under archiveAndUpdateReddit has been added, this is not a perfect solution.

In the event of a server timeout, praw will first capture and try again before escalating to the bot.

2020-03-04 11:37:14,930 - WARNING - sessions.py:_do_retry():88 - Retrying due to ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='oauth.reddit.com', port=443): Read timed out. (read timeout=16.0)",),) status: POST https://oauth.reddit.com/api/comment/

This means in the event of this submission the bot commented twice without running the rate limit check code. In fact, at this point in the code, praw did not raise a single error, only a warning, so it slipped through the exception handler which prompts the comment_duplication_rate_limit_check.

The source of the error is somewhat unclear still. While it was a read timed out error, redditstatus isn't showing any outstanding inidents around the time of this, about 2020-03-04 11:36:58 UTC