nanos / FediFetcher

FediFetcher is a tool for Mastodon that automatically fetches missing replies and posts from other fediverse instances, and adds them to your own Mastodon instance.
https://blog.thms.uk/fedifetcher?utm_source=github
MIT License

ValueError year is out of range, HTTP 429 misinterpreted as HTTP 500 #78

Closed (fungalcofe closed this issue 10 months ago)

fungalcofe commented 10 months ago

Hi, I'm using FediFetcher with GoToSocial, which also implements the Mastodon API (API Swagger).

However, I get this fatal error:

$ docker run -it ghcr.io/nanos/fedifetcher:latest --access-token=[REDACTED] --server=[REDACTED] --home-timeline-length 200 --from-notifications 1
[skipped output]
2023-08-17 08:03:53.583040 UTC: Error adding url [LONG_URL] to server [INSTANCE]. Exception: year 1692259524 is out of range: 1692259524
2023-08-17 08:03:53.583157 UTC: Added 0 posts for user [REMOTE_ACCOUNT] with 35 errors
2023-08-17 08:03:53.584521 UTC: Getting notifications for last 1 hours
2023-08-17 08:03:53.721718 UTC: Job failed after 0:18:32.419341.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 649, in parse
    ret = self._build_naive(res, default)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 1235, in _build_naive
    naive = default.replace(**repl)
            ^^^^^^^^^^^^^^^^^^^^^^^
ValueError: year 1692259524 is out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/find_posts.py", line 1439, in <module>
    notification_users = get_notification_users(arguments.server, token, all_known_users, arguments.from_notifications)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/find_posts.py", line 45, in get_notification_users
    notifications = get_paginated_mastodon(f"https://{server}/api/v1/notifications", since, headers={
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/find_posts.py", line 922, in get_paginated_mastodon
    response = get(furl, headers, timeout, max_tries)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/find_posts.py", line 984, in get
    reset = parser.parse(response.headers['x-ratelimit-reset'])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 1368, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dateutil/parser/_parser.py", line 651, in parse
    six.raise_from(ParserError(str(e) + ": %s", timestr), e)
  File "<string>", line 3, in raise_from
dateutil.parser._parser.ParserError: year 1692259524 is out of range: 1692259524

My wild guess is that FediFetcher is getting rate-limited and doesn't know how to handle or parse the value, given the Unix timestamp 1692259524, but I haven't checked the source code. I've checked the logs and can confirm that GoToSocial sent an HTTP 429 to FediFetcher:

Aug 17 08:03:53 [INSTANCE] gotosocial[4096]: timestamp="17/08/2023 08:03:53.704" func=middleware.Logger.func1.1 level=INFO latency="36.921µs" userAgent="FediFetcher (https://go.thms.uk/mgr)" method=GET statusCode=429 path=/api/v1/notifications clientIP=[MY_IP_:3] requestID=x2y8a0ma04000wp9t6p0 msg="Too Many Requests: wrote 30B"

Docker image is 8a76c4cb6946

Also related: an HTTP 429 seems to be reported as a 500 error for most actions, e.g.:

# FediFetcher logs
2023-08-17 08:03:47.068565 UTC: Error adding url [LONG_URL] to server [INSTANCE]. Status code: 500

# GoToSocial logs
Aug 17 08:03:47 [INSTANCE] gotosocial[4096]: timestamp="17/08/2023 08:03:47.192" func=middleware.Logger.func1.1 level=INFO latency="36.36µs" userAgent="FediFetcher (https://go.thms.uk/mgr)" method=GET statusCode=429 path=/api/v2/search clientIP=[MY_IP_:3] requestID=f2hra0ma04000xgqtyv0 msg="Too Many Requests: wrote 30B"

I could confirm this with certainty if FediFetcher printed the requestID, but it seems like 500 is a catch-all error.

nanos commented 10 months ago

The first error occurs because GoToSocial doesn't implement the Mastodon API in quite the same way:

With Mastodon, you'd get back the following HTTP header when the rate limit is hit:

X-RateLimit-Reset: 2023-08-17T08:42:22.000Z

What you got back was:

X-RateLimit-Reset: 1692259524

FediFetcher tried to parse that number as a year (rather than as Unix epoch seconds), and that's why you are seeing that error.

I'll see what I can do about this.
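
For illustration, one way a client could tolerate both header formats is sketched below. This is not FediFetcher's actual fix: the helper name `parse_ratelimit_reset` is hypothetical, it uses only the standard library rather than dateutil, and it assumes that an all-digit value means Unix epoch seconds while anything else is an ISO 8601 timestamp.

```python
from datetime import datetime, timezone


def parse_ratelimit_reset(value: str) -> datetime:
    """Parse an X-RateLimit-Reset header value into an aware UTC datetime.

    Hypothetical helper. Assumption: an all-digit value is Unix epoch
    seconds (as GoToSocial sent in this report); anything else is treated
    as an ISO 8601 timestamp (as Mastodon sends).
    """
    value = value.strip()

    # Epoch seconds, e.g. "1692259524"
    if value.isdigit():
        return datetime.fromtimestamp(int(value), tz=timezone.utc)

    # ISO 8601, e.g. "2023-08-17T08:42:22.000Z".
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)

    # Treat timestamps without an offset as UTC (an assumption).
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


if __name__ == "__main__":
    print(parse_ratelimit_reset("2023-08-17T08:42:22.000Z"))
    print(parse_ratelimit_reset("1692259524"))
```

Checking for digits before attempting date parsing avoids the failure in the traceback above, where the bare number was interpreted as a year.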


With regard to the 500 / 429 errors: FediFetcher simply logs the response code it receives. Either your GoToSocial log relates to a different event/request, or something sits between GoToSocial and FediFetcher that converts the 429 to a 500.

nanos commented 10 months ago

I'm closing this issue now, as it appears to have been fixed in GoToSocial.