Serene-Arc / bulk-downloader-for-reddit

Downloads and archives content from reddit
https://pypi.org/project/bdfr
GNU General Public License v3.0

[BUG] Does not work with PRAW 7.7.1, throws a 429 HTTP Response instead #954

Open iammarxg opened 5 months ago

iammarxg commented 5 months ago

Description

Using PRAW 7.7.1 returns a 429 HTTP Response. Using 7.7.0 does not, and it allows bdfr to run normally.
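To check which PRAW version an environment actually has, something like the following sketch may help. The helper names (`installed_version`, `is_reported_affected`) are made up for illustration and are not part of bdfr or PRAW; the only fact taken from this report is that 7.7.1 showed the 429 and 7.7.0 did not.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version named in this report as producing the 429 responses.
AFFECTED_VERSION = "7.7.1"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_reported_affected(praw_version: Optional[str]) -> bool:
    """True only when the version matches the one named in this report."""
    return praw_version == AFFECTED_VERSION


if __name__ == "__main__":
    found = installed_version("praw")
    if is_reported_affected(found):
        print(f"praw {found} is installed; this report suggests trying 7.7.0 instead.")
    else:
        print(f"praw version found: {found}")
```

This only reports the installed version; it does not confirm that the version is the cause of the 429.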

Command

py -m bdfr download ./REDDIT-DL -s SUBREDDITNAME -S new -t all --file-scheme {DATE}-{TITLE}-{POSTID}

Environment (please complete the following information)

Logs

Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Python310\lib\site-packages\bdfr\__main__.py", line 222, in <module>
    cli()
  File "C:\Python310\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Python310\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Python310\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Python310\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Python310\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Python310\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Python310\lib\site-packages\bdfr\__main__.py", line 117, in cli_download
    reddit_downloader = RedditDownloader(config, [stream])
  File "C:\Python310\lib\site-packages\bdfr\downloader.py", line 41, in __init__
    super(RedditDownloader, self).__init__(args, logging_handlers)
  File "C:\Python310\lib\site-packages\bdfr\connector.py", line 65, in __init__
    self.reddit_lists = self.retrieve_reddit_lists()
  File "C:\Python310\lib\site-packages\bdfr\connector.py", line 168, in retrieve_reddit_lists
    master_list.extend(self.get_subreddits())
  File "C:\Python310\lib\site-packages\bdfr\connector.py", line 276, in get_subreddits
    self.check_subreddit_status(reddit)
  File "C:\Python310\lib\site-packages\bdfr\connector.py", line 438, in check_subreddit_status
    assert subreddit.id
  File "C:\Python310\lib\site-packages\praw\models\reddit\base.py", line 35, in __getattr__
    self._fetch()
  File "C:\Python310\lib\site-packages\praw\models\reddit\subreddit.py", line 827, in _fetch
    data = self._fetch_data()
  File "C:\Python310\lib\site-packages\praw\models\reddit\base.py", line 89, in _fetch_data
    return self._reddit.request(method="GET", params=params, path=path)
  File "C:\Python310\lib\site-packages\praw\util\deprecate_args.py", line 43, in wrapped
    return func(**dict(zip(_old_args, args)), **kwargs)
  File "C:\Python310\lib\site-packages\praw\reddit.py", line 941, in request
    return self._core.request(
  File "C:\Python310\lib\site-packages\prawcore\sessions.py", line 330, in request
    return self._request_with_retries(
  File "C:\Python310\lib\site-packages\prawcore\sessions.py", line 266, in _request_with_retries
    raise self.STATUS_EXCEPTIONS[response.status_code](response)
prawcore.exceptions.TooManyRequests: received 429 HTTP response

mbarr564 commented 5 months ago

This was working for me just three weeks ago with PRAW 7.7.1, so it may be something Reddit recently changed in their API.
I tried my working environment again today, and I'm now getting HTTP 429 responses.
Thanks for mentioning the PRAW downgrade workaround.

jgore077 commented 4 months ago

@mbarr564 Is this fix still working?

jgore077 commented 4 months ago

I can confirm that the frequency of 429s goes down when using PRAW 7.7.0.
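Since the 429s seem to become less frequent rather than disappear, one possible stopgap is to retry the failing call with a growing delay. This is a generic sketch, not what bdfr or PRAW does internally; `retry_on` and its parameters are hypothetical names, and whether retrying actually clears this particular 429 is an assumption that has not been tested here.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def retry_on(
    exc_type: Type[BaseException],
    func: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying with exponential backoff whenever exc_type is raised."""
    for attempt in range(attempts):
        try:
            return func()
        except exc_type:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With prawcore installed, the exception type to pass would presumably be `prawcore.exceptions.TooManyRequests` (the one in the traceback above), wrapping the call that fails, for example `retry_on(TooManyRequests, lambda: subreddit.id)`.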

Serene-Arc commented 1 month ago

We also deal with this more on the development branch. Unfortunately I do not have permissions to update master and make a release, and @aliparlakci is MIA and has been for over a year at this point.

ammacdonald3 commented 1 month ago

@Serene-Arc It seems like you own this repository, but I see you mentioned three weeks ago that you don't have permissions to update master. Has this changed in the last few weeks? Just curious about the future of this project. Thanks for all of your work building it out!