bitbybyte / fantiadl

Download posts and media from Fantia
MIT License

Better rate limiting #78

Open ghost opened 2 years ago

ghost commented 2 years ago

I got a 429 - Too Many Requests error after a bit, so I would suggest adding rate limiting or increasing it a tiny bit to avoid this.

bitbybyte commented 2 years ago

Can you let me know what sort of command you were running (downloading fanclub(s), post, URL list, etc)?

ghost commented 2 years ago

Yeah, it was downloading a fanclub without a limit on posts.

itsaferbie commented 6 months ago

I have been getting more 429 errors as of late, even when only downloading from a few of the most recent dates.

Any way to potentially put a delay between each download?
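
A simple way to do that would be to sleep a fixed interval between post requests. A minimal sketch, assuming a hypothetical loop and delay value rather than fantiadl's actual internals (only the POST_API endpoint is taken from the traceback further down):

import time
import requests

REQUEST_DELAY = 2.0  # seconds to wait between post requests (arbitrary example value)
POST_API = "https://fantia.jp/api/v1/posts/{}"  # endpoint seen in the traceback below

session = requests.Session()

def download_post(post_id):
    # Placeholder for the real per-post download logic.
    response = session.get(POST_API.format(post_id))
    response.raise_for_status()
    return response.json()

for post_id in (2883811, 2883812):  # example post IDs only
    download_post(post_id)
    time.sleep(REQUEST_DELAY)  # throttle so we stay under Fantia's rate limit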

CodeAsm commented 1 month ago

Same error here: python3 fantiadl.py --db ~/fantiadl.db https://fantia.jp/fanclubs/15399 (warning: this is an adult link, so 18+ only).

EDIT: manually going to that URL shows "読み込みに失敗しました。" ("Loading failed."). EDIT 2: OK, so of course I'm rate limited in the browser as well XD

Downloading post 2883811...
Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/urllib3/connectionpool.py", line 896, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/urllib3/connectionpool.py", line 896, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/urllib3/connectionpool.py", line 896, in urlopen
    return self.urlopen(
           ^^^^^^^^^^^^^
  [Previous line repeated 2 more times]
  File "/usr/lib/python3.12/site-packages/urllib3/connectionpool.py", line 886, in urlopen
    retries = retries.increment(method, url, response=response, _pool=self)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/urllib3/util/retry.py", line 594, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/posts/2883811 (Caused by ResponseError('too many 429 error responses'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/codeasm/Videos/ddd/さくらこ/download/fantiadl/fantiadl.py", line 122, in <module>
    downloader.download_fanclub(fanclub, cmdl_opts.limit)
  File "/home/codeasm/Videos/ddd/さくらこ/download/fantiadl/models.py", line 233, in download_fanclub
    self.download_post(post_id)
  File "/home/codeasm/Videos/ddd/さくらこ/download/fantiadl/models.py", line 518, in download_post
    response = self.session.get(POST_API.format(post_id), headers={
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/requests/adapters.py", line 691, in send
    raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='fantia.jp', port=443): Max retries exceeded with url: /api/v1/posts/2883811 (Caused by ResponseError('too many 429 error responses'))

bitbybyte commented 1 month ago

I'm not sure how much more aggressively we need to back off. It would be nice to know whether the 429 responds with anything useful, such as a Retry-After header. Maybe make max retries customizable with a flag?
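
For reference, a sketch of what a gentler retry setup on the session could look like, with the retry count exposed as a flag; the flag name, defaults, and backoff numbers here are assumptions, not fantiadl's actual options:

import argparse
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Sketch only: the flag name and default are assumptions, not fantiadl's CLI.
parser = argparse.ArgumentParser()
parser.add_argument("--max-retries", type=int, default=5,
                    help="retries for 429/5xx responses before giving up")
args = parser.parse_args()

retries = Retry(
    total=args.max_retries,
    status_forcelist=[429, 500, 502, 503],
    backoff_factor=2,                  # 2s, 4s, 8s, ... between attempts
    respect_retry_after_header=True,   # honor Retry-After if Fantia sends it
    allowed_methods=["GET"],
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))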

Note to future self: the API sometimes responds with a 422 and the text システムエラー ("system error") if you request against it too quickly, so we should catch that code too.
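
If that 422 turns out to be a transient rate-limit response, one option (again just a sketch, since I haven't confirmed the response shape) would be to treat it as retryable alongside 429:

from urllib3.util.retry import Retry

# Assumption: the 422 "システムエラー" ("system error") response is transient
# rate limiting, so retry it like a 429; drop it from the list if it turns
# out to indicate a permanent failure.
retries = Retry(
    total=5,
    status_forcelist=[422, 429, 500, 502, 503],
    backoff_factor=2,
    respect_retry_after_header=True,
)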