rwxd / wallabag2readwise

Export / synchronize wallabag annotations to readwise highlights
https://rwxd.github.io/wallabag2readwise/
MIT License

1.2.9 still triggers the rate limiter #14

Open Alanon202 opened 1 year ago

Alanon202 commented 1 year ago

On running 1.2.9 I can’t seem to complete a sync because it again triggers the rate limiter (response 429 - ReadwiseRateLimitException).

For what it’s worth, on the previous version (1.0.0), I was able to get around this by adding time.sleep(3) down in the get function, which slowed down the sync significantly, but ensured that I always remained within the designated limits for the requests. I know it’s rudimentary, but the hardcoded time limit ensured that at no point would I exceed the 20 per minute limitation.
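For illustration, a fixed-delay throttle along these lines could look like the sketch below. It is not the project's code: the class name `FixedDelayThrottle` and its parameters are hypothetical, and the 3 second interval simply mirrors the `time.sleep(3)` workaround (60 / 3 = 20 calls per minute).

```python
import time


class FixedDelayThrottle:
    """Space calls at least `min_interval` seconds apart (hypothetical helper)."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `throttle.wait()` right before each request would only sleep for the part of the interval that has not already elapsed, so slow requests are not penalised twice.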

At this point I have hundreds and hundreds of entries (maybe that’s part of the issue?), so if you need any testing done, I’m available.

rwxd commented 1 year ago

Hi, when Readwise rate limits the client, it returns a "Retry-After" header with the number of seconds to wait. https://readwise.io/api_deets

Version 1.2.11 now sleeps for that many seconds.
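As a rough sketch of that idea (not the actual wallabag2readwise implementation), a 429 response could be handled as below. The function name `call_with_retry_after`, the `send` callable and the attempt limit are assumptions made for the example; `send` is expected to return something shaped like a `requests.Response`.

```python
import time


def call_with_retry_after(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a response that is not a 429.

    `send` is a hypothetical zero-argument callable returning an object
    with `status_code` and `headers` attributes.
    """
    for _ in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Retry-After is assumed to hold a number of seconds; the
        # 1 second fallback for a missing header is an assumption.
        wait_seconds = int(response.headers.get("Retry-After", 1))
        sleep(wait_seconds)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

With `requests`, `send` could be something like `lambda: session.get(url, headers=auth_headers)`, where `session`, `url` and `auth_headers` are placeholders.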

Could you try it again on your side? In my small test it looked like it works.

❯ python3 test.py
call
<Response [200]>
..............
call
<Response [200]>
call
WARNING - 2022-12-25 11:59:44,025 - wallabag2readwise - Rate limited, retrying in 44 seconds
<Response [200]>
call
<Response [200]>
..............
call
WARNING - 2022-12-25 12:00:59,764 - wallabag2readwise - Rate limited, retrying in 29 seconds

Alanon202 commented 1 year ago

Sadly, I’m getting RateLimitException: too many calls

rwxd commented 1 year ago

Could not reproduce the problem.

In version 1.2.12 the RateLimitException should no longer be thrown; instead, the thread sleeps.
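To illustrate the difference between raising and sleeping, here is a minimal client-side limiter that blocks until a slot is free. It is only a sketch: the class `SleepingRateLimiter` is hypothetical, and the defaults of 20 calls per 60 seconds are taken from the limit mentioned earlier in this thread, not from the project's code.

```python
import time
from collections import deque


class SleepingRateLimiter:
    """Allow at most `max_calls` per `period` seconds, sleeping when full."""

    def __init__(self, max_calls=20, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the window

    def acquire(self):
        """Block (instead of raising) until another call is allowed."""
        while True:
            now = self._clock()
            # Forget calls that have left the sliding window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self._sleep(self.period - (now - self._calls[0]))
```

Calling `limiter.acquire()` before every request keeps the client under the limit without surfacing an exception to the caller.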

Please give it a try :)

Alanon202 commented 1 year ago

BTW, I’m also getting some JSON errors with these newer versions.

Example of (badly pasted) terminal output:

/home/stefan/.local/lib/python3.10/site-packages/requests/models.py:975 in json

  972 │   │   except JSONDecodeError as e:
  973 │   │   │   # Catch JSON-related errors and raise as requests.JSONDecodeEr
  974 │   │   │   # This aliases json.JSONDecodeError and simplejson.JSONDecodeE
❱ 975 │   │   │   raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
  976 │   │
  977 │   @property
  978 │   def links(self):

locals:
  kwargs = {}
  self = <Response [200]>

JSONDecodeError: Unterminated string starting at: line 1 column 117073 (char 117072)

Alanon202 commented 1 year ago

UPDATE: After several attempts at starting a sync with the errors above, I was able to complete a full sync cycle. I’m not sure what’s going on with the rate limiter, but looking at the terminal output, I’ve found that a lot of smaller rate limits are being applied as the sync progresses.

For example, after finding about 15 articles and going through them, there was a sequence of smaller rate limitations. I suppose that something like this was preventing a full sync before:

WARNING - 2022-12-25 13:27:58,868 - wallabag2readwise - Rate limited, retrying in 3 seconds
=> Found 2 Readwise highlights for "XYZ"
WARNING - 2022-12-25 13:28:03,454 - wallabag2readwise - Rate limited, retrying in 2 seconds
=> Found 4 Readwise highlights for "XYZ"
==> Adding highlight
==> Adding highlight
WARNING - 2022-12-25 13:28:07,405 - wallabag2readwise - Rate limited, retrying in 1 seconds
=> Found 1 Readwise highlights for "XYZ"
WARNING - 2022-12-25 13:28:35,278 - wallabag2readwise - Rate limited, retrying in 9 seconds
=> Found 2 Readwise highlights for "XYZ"
=> Found 2 Readwise highlights for "XYZ"
WARNING - 2022-12-25 13:28:45,646 - wallabag2readwise - Rate limited, retrying in 7 seconds

Alanon202 commented 1 year ago

After numerous tests during the past few days, I haven’t had a single failure with the rate limitation. 👍🏼

However, I have had many instances of 200 response like the one I already described above. The solution so far seems to be to just try again and again until it starts properly.

Also, am I right in thinking that the first time a new article with highlights is synced, the tags for it will be synced on the next synchronisation?

rwxd commented 1 year ago

Tags should now also be added to newly created articles in Readwise.

> However, I have had many instances of 200 response like the one I already described above. The solution so far seems to be to just try again and again until it starts properly.

Is it the Wallabag or the Readwise API that has this problem? We could check whether the body is valid JSON.
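Such a check could be as small as the sketch below. The helper name `parse_json_body` is made up for this example; it takes the raw response text (for instance `response.text` from `requests`) and reports where decoding failed instead of raising.

```python
import json


def parse_json_body(text):
    """Return (data, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # exc.msg and exc.pos say what went wrong and where in the body.
        return None, f"{exc.msg} at char {exc.pos}"
```

Logging the error together with the API name would show whether the truncated or malformed body comes from Wallabag or from Readwise.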

Alanon202 commented 1 year ago

Based on just the recent behaviour, I’d say that it’s probably Wallabag? The errors seem to have increased today as I was unable to initiate a successful sync at all, and the only change is that I have new highlights. How can I test this further?

rwxd commented 1 year ago

Which wallabag version are you on?

If you scroll up in the stacktrace, you should see the function in which the JSON decoding error happens, and maybe also the corresponding class (WallabagConnector or ReadwiseConnector).

Could you post the entire stacktrace? Be sure to redact your Readwise and wallabag secrets.

Alanon202 commented 1 year ago

Here’s the entire stacktrace for a sync I just ran. I’m using the wallabag.it service and whatever version they’re on.

rwxd commented 1 year ago

It looks like it is the Readwise API that returns a malformed body. Maybe something is not being escaped.

│ ╭───────────────────────────────────── locals ──────────────────────────────────────╮ │
│ │  idx = 0                                                                          │ │
│ │    s = '{"count":482,"next":null,"previous":null,"results":[{"id":22626703,"titl… │ │
│ │        '+286214                                                                   │ │
│ │ self = <json.decoder.JSONDecoder object at 0x7ff2dea6e8c0>                        │ │
│ ╰───────────────────────────────────────────────────────────────────────────────────╯ │
╰───────────────────────────────────────────────────────────────────────────────────────╯
JSONDecodeError: Expecting ',' delimiter: line 1 column 286295 (char 286294)

You said it does not happen all the time. I pushed an update in which the request for all books/articles gets retried up to 15 times.
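A retry loop of that kind might look roughly like this. It is a sketch under assumptions, not the code that was pushed: `fetch_json_with_retries` and `fetch_text` are hypothetical names, and the delay between attempts is a guess.

```python
import json
import time


def fetch_json_with_retries(fetch_text, retries=15, delay=5.0, sleep=time.sleep):
    """Call fetch_text() and decode its result, retrying on invalid JSON.

    `fetch_text` is a hypothetical zero-argument callable returning the
    raw response body as a string.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return json.loads(fetch_text())
        except json.JSONDecodeError as exc:
            last_error = exc
            if attempt < retries:
                sleep(delay)  # give the API a moment before asking again
    # Every attempt returned an undecodable body: surface the last error.
    raise last_error
```

Retrying only helps if the malformed body is intermittent, which matches the observation that the error does not happen every time.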

Alanon202 commented 1 year ago

I initiated a sync and seem to be getting a lot of these errors:

ERROR - 2022-12-30 16:08:01,285 - wallabag2readwise - Error while getting Readwise articles: "Unterminated string starting at: line 1 column 171970 (char 171969)"
ERROR - 2022-12-30 16:08:01,285 - wallabag2readwise - Retrying in 5 seconds, 14 retries left
ERROR - 2022-12-30 16:08:08,106 - wallabag2readwise - Error while getting Readwise articles: "Unterminated string starting at: line 1 column 156130 (char 156129)"

Almost every article that had new highlights also had at least one of these errors, and for some it took up to 11 attempts before it went away. I checked a few such articles, and it seems that in the end all the highlights were imported properly.

I’m not sure why this happens, though? The previous versions didn’t seem to have this problem at all? Or is the issue with my highlights?

The vast majority of my highlights are only sentences or paragraphs. There are some bullet lists here and there, and sometimes the end of one paragraph and the beginning of the next end up in a single highlight.