dteviot / WebToEpub

A simple Chrome (and Firefox) Extension that converts Web Novels (and other web pages) into an EPUB.

Allow arbitrary values for maximum webpage requests and throttle settings. #1094

Open nozwock opened 11 months ago

nozwock commented 11 months ago

I don't understand why you'd restrict them in a combobox with a limited number of options. I'd like to be able to select values according to my specific requirements, including very short values like 0.2s or any other value. I'd get errors like 403 when there's no throttle, which is to be expected, but the sole higher option available is an excessively long 3 seconds per chapter. The same limitation applies to the max webpage requests setting.

I'm aware that I can adjust the values in local storage, but I'd appreciate it if you could substitute the comboboxes with a more versatile widget, like an input box or spinner box, that permits arbitrary values.

gamebeaker commented 11 months ago

@nozwock I am not sure if you are aware of the symbiotic relationship between "Max web pages to fetch simultaneously" and "Manual Throttle|Delay per chapter". It is described here: https://github.com/dteviot/WebToEpub/wiki/Advanced-Options#manual-throttledelay-per-chapter

Example: select 4 for "Max web pages to fetch simultaneously" and 3 Secs/Chapter for "Manual Throttle|Delay per chapter", and you get 4 chapters / 3 secs ≈ 1.33 chapters/sec.

This is just FYI; the ability to set custom numbers would be a cool feature.

nozwock commented 11 months ago

@gamebeaker Ah, I had assumed that values for maximum webpage requests and throttle settings were for a semaphore and a rate limiter, but it seems like only a rate limiter is being used. I also wasn't expecting the "Chapter" in "3 Secs/Chapter" to be variable.

If I understand correctly, throttling is for limiting a number of requests over a time period. For instance, with 40 reqs per 3 secs, the limiter would allow a max of 40 reqs over a 3 secs-window. However, rate limiting doesn't ensure an even distribution of requests over that time period. This means that we might have 20 reqs sent within a very short duration, such as 20ms, which could result in issues like 403 errors or even bans.
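To illustrate the burst concern, here is a minimal sketch of a sliding-window rate limiter. It is a generic illustration, not WebToEpub's actual code; the class name `SlidingWindowLimiter` and its parameters are hypothetical. With a limit of 40 requests per 3 seconds, 20 requests spaced 1 ms apart are all admitted.

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most max_requests in any window_secs span."""

    def __init__(self, max_requests, window_secs):
        self.max_requests = max_requests
        self.window_secs = window_secs
        self.sent = deque()  # timestamps of admitted requests, oldest first

    def try_send(self, now):
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_secs:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            self.sent.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_requests=40, window_secs=3.0)

# 20 requests fired 1 ms apart: the limiter admits every one of them,
# because it only counts requests per window and does not space them out.
burst = [limiter.try_send(now=i * 0.001) for i in range(20)]
```

A limiter like this bounds the total per window but says nothing about spacing, which is why a separate burst cap or a fixed minimum gap between requests would be needed to spread them evenly.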

I don't see any burst limit option (a cap on the number of requests that can be sent in a very short duration) in the settings to mitigate this? However, at least now I can play around with these values better, knowing that they're bounds for a rate limiter.

gamebeaker commented 11 months ago

@nozwock If you open the Inspect window of Chrome or Firefox on WebToEpub and choose the Network tab, you can see what happens in the example from above:

> Select 4 for "Max web pages to fetch simultaneously" and 3 Secs/Chapter for "Manual Throttle|Delay per chapter", and you get 4 chapters / 3 secs ≈ 1.33 chapters/sec.

4 pages get requested "at the same time", and after all 4 have finished loading it waits 3 seconds before requesting the next batch of 4 pages "at the same time", and so on. I hope this makes it clear how it works.
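The batch-then-wait behaviour described above can be sketched as a small simulation. This is only a model of the description in this comment, not the extension's real implementation; the function name `batch_start_times` and the fixed `load_secs` per batch are assumptions made for illustration.

```python
def batch_start_times(num_chapters, batch_size, delay_secs, load_secs):
    """Return the simulated start time (in seconds) of each chapter request.

    Assumes every batch takes exactly load_secs to finish loading, which
    real network conditions will not guarantee.
    """
    starts = []
    clock = 0.0
    for first in range(0, num_chapters, batch_size):
        count = min(batch_size, num_chapters - first)
        # All requests in a batch are issued "at the same time".
        starts.extend([clock] * count)
        # The delay only starts counting once the whole batch has loaded.
        clock += load_secs + delay_secs
    return starts


# 8 chapters, 4 at a time, 3 s delay, assumed 1 s load time per batch.
starts = batch_start_times(num_chapters=8, batch_size=4,
                           delay_secs=3.0, load_secs=1.0)
```

In this model the first four requests start at 0 s and the next four at 4 s (1 s of loading plus the 3 s delay).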

> If I understand correctly, throttling is for limiting a number of requests over a time period. For instance, with 40 reqs per 3 secs, the limiter would allow a max of 40 reqs over a 3 secs-window.

In reality it is going to be slower, depending on your internet speed. Example: you request 20 pages and they need 10 s to load. Once they are loaded, the 3 s delay starts counting. So the time needed to load 20 pages is 13 s on average.
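The arithmetic behind both examples can be written as one small helper. This is a rough sketch; the function name `chapters_per_second` is hypothetical, and the load time is treated as a fixed number even though it varies in practice.

```python
def chapters_per_second(batch_size, delay_secs, load_secs=0.0):
    """Approximate throughput when each batch loads, then waits delay_secs."""
    return batch_size / (load_secs + delay_secs)


# Ideal case from the earlier example: 4 pages per 3 s delay, load time ignored.
upper_bound = chapters_per_second(batch_size=4, delay_secs=3.0)

# Example from this comment: 20 pages need 10 s to load, then a 3 s delay.
realistic = chapters_per_second(batch_size=20, delay_secs=3.0, load_secs=10.0)
```

Here `upper_bound` is 4/3 (about 1.33 chapters/sec) and `realistic` is 20/13 (about 1.54 chapters/sec), so the configured values give only an upper limit on speed once load time is taken into account.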

> However, rate limiting doesn't ensure an even distribution of requests over that time period. This means that we might have 20 reqs sent within a very short duration, such as 20ms, which could result in issues like 403 errors or even bans.

Correct. I would say that instead of 20 ms it is at the same time (for humans there is no perceptible delay).