mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Feature Request] Gallery download progress (currentImageNum/totalImageCount) #2526

Open i-rinat opened 2 years ago

i-rinat commented 2 years ago

Some galleries take a long time to download. It would be nice to see how many images were already downloaded and how many images a gallery contains in total. Something like this:

[1/192] ./gallery-dl/exhentai/575955 Landscapes for wallpapers/575955_0001_09fb18b275_205a0e2d88373b4ecdfca36b.jpg
[2/192] ./gallery-dl/exhentai/575955 Landscapes for wallpapers/575955_0002_c292ba0814_0e534185eeccb9277af05545.jpg
[3/192] ./gallery-dl/exhentai/575955 Landscapes for wallpapers/575955_0003_e19860cc0c_0f153b4e8340a77f8882a12e.jpg
[4/192] ./gallery-dl/exhentai/575955 Landscapes for wallpapers/575955_0004_1997bf2331_0f413b4e8314a77f8882a1c6.jpg
...

Here is a sample patch, which unfortunately only works for Exhentai galleries:

diff --git a/gallery_dl/downloader/http.py b/gallery_dl/downloader/http.py
index 56224626..ee734e2a 100644
--- a/gallery_dl/downloader/http.py
+++ b/gallery_dl/downloader/http.py
@@ -224,7 +224,7 @@ class HttpDownloader(DownloaderBase):
                         self._adjust_extension(pathfmt, fp.read(16))
                     fp.seek(offset)

-                self.out.start(pathfmt.path)
+                self.out.start(util.gallery_progress(kwdict) + pathfmt.path)
                 try:
                     self.receive(fp, content, size, offset)
                 except (RequestException, SSLError, OpenSSLError) as exc:
diff --git a/gallery_dl/job.py b/gallery_dl/job.py
index 044369ab..249c6f2f 100644
--- a/gallery_dl/job.py
+++ b/gallery_dl/job.py
@@ -264,7 +264,7 @@ class DownloadJob(Job):

         # download succeeded
         pathfmt.finalize()
-        self.out.success(pathfmt.path, 0)
+        self.out.success(util.gallery_progress(kwdict) + pathfmt.path, 0)
         self._skipcnt = 0
         if archive:
             archive.add(kwdict)
@@ -343,7 +343,7 @@ class DownloadJob(Job):

     def handle_skip(self):
         pathfmt = self.pathfmt
-        self.out.skip(pathfmt.path)
+        self.out.skip(util.gallery_progress(pathfmt.kwdict) + pathfmt.path)
         if "skip" in self.hooks:
             for callback in self.hooks["skip"]:
                 callback(pathfmt)
diff --git a/gallery_dl/util.py b/gallery_dl/util.py
index e8af358e..65cfe533 100644
--- a/gallery_dl/util.py
+++ b/gallery_dl/util.py
@@ -579,6 +579,12 @@ def chain_predicates(predicates, url, kwdict):
             return False
     return True

+def gallery_progress(kwdict):
+    if "num" not in kwdict or "filecount" not in kwdict:
+        return ""
+
+    return "[{}/{}] ".format(kwdict["num"], kwdict["filecount"])
+

 class RangePredicate():
     """Predicate; True if the current index is in the given range"""

nisehime commented 2 years ago

That was asked already: https://github.com/mikf/gallery-dl/discussions/2222#discussioncomment-2060907

Hrxn commented 2 years ago

So, you managed to find the actual problem here.. it is extractor specific, it only works under very specific circumstances. In this case, it only works when the extractor provides a field like ["filecount"].

https://github.com/mikf/gallery-dl/blob/d85e66bcacc2df5918024e9313caa6011d8a1d77/gallery_dl/extractor/exhentai.py#L248-L272

See line 268 for how this is extracted from the document markup: "filecount" : extr('>Length:</td><td class="gdt2">', ' '),
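To illustrate what that line does, here is a minimal standalone sketch of a "text between two markers" lookup. It is a simplified stand-in for the `extr` call, not gallery-dl's actual implementation; the helper name `extract_between` and the sample markup are assumptions made up for this example.

```python
def extract_between(page, begin, end, default=""):
    """Return the text between the first `begin` and the next `end`.

    Hypothetical helper for illustration only; returns `default`
    when either marker is missing.
    """
    start = page.find(begin)
    if start < 0:
        return default
    start += len(begin)
    stop = page.find(end, start)
    if stop < 0:
        return default
    return page[start:stop]


# Made-up fragment shaped like the markup the extractor searches for.
sample = '<td class="gdt1">Length:</td><td class="gdt2">192 pages</td>'

filecount = extract_between(sample, '>Length:</td><td class="gdt2">', ' ')
print(filecount)  # 192
```

Note that the value comes back as a string, and that the lookup depends entirely on this one site's markup, which is why the count is not available for other extractors.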

I think a better way to do this would be to add a feature to the gallery-dl core that keeps track of the total item count, and then have each extractor simply pass back a single number. That would cover all extractors where the count can be extracted in a straightforward way, but it won't just work everywhere...
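A rough sketch of what such core-side tracking could look like, assuming each extractor hands back either a total or nothing. The names `ProgressTracker` and `parse_total` are hypothetical and do not exist in gallery-dl; this only shows the shape of the idea.

```python
class ProgressTracker:
    """Hypothetical core-side counter; not part of gallery-dl."""

    def __init__(self, total=None):
        self.total = total    # None means the extractor gave no count
        self.current = 0

    def advance(self):
        """Call once per processed file (downloaded or skipped)."""
        self.current += 1

    def prefix(self):
        """Return '[current/total] ', or '' when the total is unknown."""
        if self.total is None:
            return ""
        return "[{}/{}] ".format(self.current, self.total)


def parse_total(raw):
    """Turn an extractor-supplied value into a positive int, else None."""
    try:
        total = int(raw)
    except (TypeError, ValueError):
        return None
    return total if total > 0 else None


tracker = ProgressTracker(parse_total("192"))
tracker.advance()
print(tracker.prefix() + "575955_0001.jpg")  # [1/192] 575955_0001.jpg
```

Counting in one place means the extractor only has to supply the total, and extractors that cannot supply one simply produce no prefix.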

i-rinat commented 2 years ago

> So, you managed to find the actual problem here.. it is extractor specific, it only works under very specific circumstances

Yep. Otherwise it would have been a pull request, not a feature request.

KarelWintersky commented 1 year ago

If the filecount field is not available => don't write the current progress :)
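For reference, the helper from the sample patch above already falls back this way. A standalone check of that behaviour, with metadata dictionaries made up for illustration:

```python
def gallery_progress(kwdict):
    """Same logic as the helper in the sample patch."""
    # No prefix unless both the index and the total are present.
    if "num" not in kwdict or "filecount" not in kwdict:
        return ""
    return "[{}/{}] ".format(kwdict["num"], kwdict["filecount"])


with_count = gallery_progress({"num": 3, "filecount": "192"})
without_count = gallery_progress({"num": 3})

print(repr(with_count))     # '[3/192] '
print(repr(without_count))  # ''
```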