Are you sure it's not the download server that's limiting your download speed? I just tested a bit with large Danbooru files and `-r`/`--limit-rate` works as expected there.
The chunk size is rather small and there should definitely be an option to configure it, but I don't think that is what's primarily limiting your speeds.
Can't say that I've ever noticed gallery-dl bottlenecking my download rate. The site/server is usually the limiting factor (or a proxy, if used); on many occasions gallery-dl saturated my download bandwidth just fine.
Thanks for the reply, folks. I am having this issue on all sites that I have tested, such as Reddit, Imgur, Gfycat, etc. If it is allowed, I can share a link here.
Changing the `chunk_size` in the code does fix this issue for me. Also, without the rate limit (no `-r` flag) everything downloads at full speed as expected. In that case the `receive` function is called, which also uses the default `chunk_size`. So it must be the hundreds of iterations inside the `_receive_rate` function that are causing the slowdown. It has a few math operations and an I/O call for writing the progress bar to the console. There is a good chance that all of this is noticeable because my computer is very old and slow.
PS: I'm using the latest version of the library (`gallery-dl==1.23.5`).
> due to my computer being very old and slow
That might be the reason for your problem, given how "fast" Python is.
The simple `receive()` only gets used when there is no rate limit and no download progress. Otherwise the more computationally expensive `_receive_rate` is used.
Anyway, commit https://github.com/mikf/gallery-dl/commit/bca9f965e5f5009f66fb19e52cb0fd9cc054a368 adds a `chunk-size` option with which you can configure this value however you want without touching any actual code, and it also increases the default chunk size to 2^15 = 32 KiB (from 2^14 = 16 KiB). 1 MiB is far too much as a default value, I think.
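For reference, a config-file sketch of the new option; I'm assuming it lives under `downloader.http` like the other HTTP downloader settings, so check the docs for the exact path:

```json
{
    "downloader": {
        "http": {
            "chunk-size": 32768
        }
    }
}
```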
@mikf Thanks for adding the option. Meanwhile I profiled the code and narrowed the issue down to this:
```python
>>> import time
>>> def do():
...     t1 = time.perf_counter()
...     for i in range(1000):
...         time.sleep(0.01)
...     t2 = time.perf_counter()
...     dt = t2 - t1
...     print(f'{dt=}', f'exp={0.01*1000}')
...
>>> do()
dt=15.758051600001636 exp=10.0
```
Turns out, on my machine `time.sleep` is only accurate to about 10 ms. So each iteration of the for loop sleeps a tiny bit longer than it needs to, and this error accumulates over hundreds of loops. A bigger chunk size reduces the number of iterations and solves the problem. Here's the relevant line from the profiling output:
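To put rough numbers on that accumulation (illustrative values only; the per-call oversleep is inferred from the profile below, where 1170 calls spend ~14 ms each):

```python
# Back-of-the-envelope: how much pure oversleep a small chunk size adds.
file_size = 16 * 1024 * 1024    # 16 MiB download
chunk_size = 16 * 1024          # 16 KiB chunks -> 1024 iterations
oversleep = 0.004               # assume ~4 ms extra per time.sleep() call

loops = file_size // chunk_size
print(loops, loops * oversleep)  # 1024 loops, ~4.1 s of accumulated overhead
```

With 1 MiB chunks the same transfer needs only 16 iterations, so the overhead all but disappears.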
with a 16 KiB chunk size

```
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
  1170   16.095    0.014   16.095    0.014  {built-in method time.sleep}
```
and with a 1 MiB chunk size

```
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    19    5.855    0.308    5.855    0.308  {built-in method time.sleep}
```
This is specific to my computer and probably not an issue on most modern machines. Once again, thanks for your help.
@trodiz I think I found a solution for inaccurate `time.sleep()` durations, but I'm not sure if it actually works outside of my own tests. Could you try the following patch and report back whether it works for you even with a small `chunk-size`?
`t2 = time.time()` was meant to counteract an inaccurate `time.sleep()` duration, but apparently it does the opposite.
```diff
diff --git a/gallery_dl/downloader/http.py b/gallery_dl/downloader/http.py
index 2e7e76e6..69dc4813 100644
--- a/gallery_dl/downloader/http.py
+++ b/gallery_dl/downloader/http.py
@@ -301,7 +301,7 @@ class HttpDownloader(DownloaderBase):
             if elapsed < expected:
                 # sleep if less time elapsed than expected
                 time.sleep(expected - elapsed)
-                t2 = time.time()
+                t2 += expected - elapsed
             t1 = t2
```
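For anyone skimming: a self-contained sketch of the idea behind that patch (a simplified stand-in for gallery-dl's actual loop; `read`/`write` are placeholders). Advancing `t2` by the intended sleep instead of re-reading the clock means any oversleep shrinks the next iteration's sleep budget rather than being lost:

```python
import time

def rate_limited_copy(read, write, rate, chunk_size=32768):
    """Copy chunks while holding the average throughput near `rate` bytes/s."""
    t1 = time.time()
    while True:
        chunk = read(chunk_size)
        if not chunk:
            break
        write(chunk)

        t2 = time.time()
        elapsed = t2 - t1                 # how long this chunk really took
        expected = len(chunk) / rate      # how long it should have taken

        if elapsed < expected:
            time.sleep(expected - elapsed)
            t2 += expected - elapsed      # credit only the *intended* sleep;
                                          # oversleep is repaid next iteration
        t1 = t2
```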
Although it eliminated the inaccuracy in `time.time`, it still did not compensate for the error in `time.sleep`. Any sleep interval below 15 ms is not possible on Windows 10 unless I adjust the timer resolution manually.
Here's an example of a file I'm trying to download:
| FILE SIZE (bytes) | CHUNK SIZE (bytes) | LOOPS | LIMITED RATE (bytes/s) | UNLIMITED RATE (bytes/s) | SLEEP PER LOOP (ms) |
| --- | --- | --- | --- | --- | --- |
| 16,777,216 | 32,768 | 512 | 3,145,728 | 7,340,032 | 5.952 |
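The bullets below follow directly from that row; as a quick arithmetic check:

```python
# Reproducing the numbers from the table above / the bullets below.
file_size = 16_777_216            # bytes
chunk_size = 32_768               # bytes
rate_limit = 3_145_728            # bytes/s (-r 3M)
max_rate = 7_340_032              # bytes/s (link speed)

loops = file_size // chunk_size               # 512
target = file_size / rate_limit               # ~5.33 s at the limited rate
transfer = file_size / max_rate               # ~2.29 s at full speed
sleep_total = target - transfer               # ~3.05 s must be slept
print(loops, round(sleep_total / loops * 1000, 3))  # 512 loops, ~5.952 ms each
```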
- a `16MiB` file should take `5.33s` to finish at a rate of `3MiBps`
- the actual transfer only takes `2.29s` (`7MiBps` is the max speed I can get)
- so `3.05s` needs to be slept, or `5.95ms` per loop

Here's a solution that should work on all Windows machines:
```python
import ctypes
import platform

if platform.system() == "Windows":
    # NtSetTimerResolution(DesiredResolution, SetResolution, &CurrentResolution);
    # DesiredResolution is in 100 ns units, so 10000 requests a 1 ms timer.
    ntdll = ctypes.WinDLL("NTDLL.DLL")
    resolution = ctypes.c_ulong()
    ntdll.NtSetTimerResolution(10000, True, ctypes.byref(resolution))
```
This improves the timer resolution to 1 ms or lower, and everything runs as expected.
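As a quick sanity check (assuming the call above succeeded), re-running a sleep loop like the one from earlier should now land close to its expected total:

```python
import time

t1 = time.perf_counter()
for _ in range(1000):
    time.sleep(0.001)            # 1 ms sleeps, 1 s expected in total
print(time.perf_counter() - t1)  # close to 1 s with a ~1 ms timer,
                                 # vs ~15 s with the default ~15.6 ms timer
```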
The variable in question: https://github.com/mikf/gallery-dl/blob/0f9dfb7e6298864083b883b8a1f29754ab31f292/gallery_dl/downloader/http.py#L30

When you set a rate limit, e.g. `-r 3M`, instead of getting a download rate close to that specified value you get a much lower one. In my case it was throttled to 1 megabyte per second. The very small `chunk_size` creates overhead in each iteration of the for loop in the `_receive_rate` function. I'd recommend the much more commonly used chunk size of 1,048,576 bytes (1 MiB).