mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
11.7k stars, 953 forks

Can't download vk.com albums now #2512

Closed. elias137 closed this issue 1 year ago

elias137 commented 2 years ago

Hi. I used to download vk.com albums with gallery-dl some time ago, but now something seems to have gone wrong and I'm unable to do it. Can you please shed some light on what's going on? gallery-dl is installed on macOS 10.13.6 via $ python3 -m pip install -U gallery-dl

Screen Shot 2022-04-19 at 18 18 41

AlttiRi commented 2 years ago

It looks like removing ?size=200x133&quality=96&type=album from the thumbnail URL (taken from the HTML) no longer works. (Although it didn't seem to work before either, based on my old tests.) I may be wrong about how gallery-dl works; this is only my guess based on the -v log.

Anyway today VK have changed the image endpoints.

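It was:

  • https://sun1-11.userapi.com/impf/meDIUmHASH/sh0rtID.jpg?size=1380x1379&quality=95&sign=ABCD1234&type=album

Now:

  • https://sun1-11.userapi.com/s/v1/if2/VeryLongHASH.jpg?size=1380x1379&quality=95&type=album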

abutohan commented 2 years ago

Currently having the same problem. I know the solution is to parse the last three parameters (size, quality and type) and then concatenate them back onto the image_url, but I don't know the ins and outs of the source code since I'm just an average enjoyer here on GitHub. I really hope the devs will fix it.
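A minimal sketch of the idea above, in Python with only the standard library: split an image URL into its base and query string, keep the size, quality and type parameters, and append them again. The names split_image_url and join_image_url are made up for illustration and are not gallery-dl code; the sample URL is the new-style form quoted in this thread.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Parameters that, per this thread, can no longer be dropped from the URL.
KEPT_PARAMS = ("size", "quality", "type")

def split_image_url(url):
    """Return (base_url, params), where params holds only the kept parameters."""
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    params = {k: v for k, v in parse_qsl(parts.query) if k in KEPT_PARAMS}
    return base, params

def join_image_url(base, params):
    """Concatenate the parameters back onto the base URL."""
    return f"{base}?{urlencode(params)}" if params else base

url = "https://sun1-11.userapi.com/s/v1/if2/VeryLongHASH.jpg?size=1380x1379&quality=95&type=album"
base, params = split_image_url(url)
print(base)
print(params)
print(join_image_url(base, params) == url)
```

Whether the server accepts a size other than the one it handed out is not something this sketch can tell; it only shows the parse-and-rejoin step.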

elias137 commented 2 years ago

Anyway today VK have changed the image endpoints.

It was:

  • https://sun1-11.userapi.com/impf/meDIUmHASH/sh0rtID.jpg?size=1380x1379&quality=95&sign=ABCD1234&type=album

Now:

  • https://sun1-11.userapi.com/s/v1/if2/VeryLongHASH.jpg?size=1380x1379&quality=95&type=album

I wonder if it is possible to change gallery-dl so that it parses these URLs correctly?

abutohan commented 2 years ago

After hours of digging through the source code, I was able to debug it and download some images, but only at very low quality, since the album view they are fetched from uses a fixed width and height.

image

Here's the div the photos are fetched from. Initially, the part size=200x313&quality=95&type=album was omitted by the source code and that worked fine, but as AlttiRi said, VK changed its image endpoint, and omitting the last parameters no longer works.

image

Edit:

AlttiRi commented 2 years ago

~Well, it can be fixed with one extra request per image: fetching the photo page URL (https://vk.com/photo-1234_1234). However, that is almost 0.8 MB of HTML (un-gzipped).~

(Without using VK API)

UPD. It's better to use https://vk.com/al_photos.php?act=show with the corresponding "Form Data" content.

// Uses top-level await, so paste it into the browser devtools console.
const response = await fetch("https://vk.com/al_photos.php?act=show", {
  method: "POST",
  headers: {
    "content-type": "application/x-www-form-urlencoded",
    "x-requested-with": "XMLHttpRequest",
  },
  body: "act=show&al=1&direction=1&list=album-29937425_0&offset=0", // -159293555_272628467 NSFW
});
// VK answers in windows-1251, so decode the raw bytes instead of calling response.json().
const text = new TextDecoder("windows-1251").decode(await response.arrayBuffer());
const {payload: [_zero, [list, total, offset, images, _extra]]} = JSON.parse(text);
console.log({list, total, offset, images});

It returns info for 10 images.
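Since gallery-dl is written in Python, here is a rough standard-library sketch of the same request. The function names are hypothetical, and whether VK also needs cookies or other headers outside a browser is not verified here. The payload is unpacked the same way as in the JS snippet above.

```python
import json
import urllib.request
from urllib.parse import urlencode

def build_show_request(list_id, offset=0):
    """Build the POST request for al_photos.php?act=show (nothing is sent here)."""
    body = urlencode({"act": "show", "al": 1, "direction": 1,
                      "list": list_id, "offset": offset}).encode("ascii")
    return urllib.request.Request(
        "https://vk.com/al_photos.php?act=show",
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "X-Requested-With": "XMLHttpRequest",
        },
        method="POST",
    )

def parse_show_response(raw):
    """Decode the windows-1251 bytes and unpack the payload."""
    data = json.loads(raw.decode("windows-1251"))
    _zero, (list_id, total, offset, images, *_extra) = data["payload"][:2]
    return {"list": list_id, "total": total, "offset": offset, "images": images}

def fetch_album_page(list_id, offset=0):
    """Send the request and return the parsed page (performs network I/O)."""
    with urllib.request.urlopen(build_show_request(list_id, offset)) as resp:
        return parse_show_response(resp.read())
```

Usage would be something like fetch_album_page("album-29937425_0"), then repeating with a larger offset to get the next page.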

Then you need to find the URL with the largest image dimensions among the "w_src", "z_src", "y_src", "x_src", "r_src", "q_src", "p_src", "o_src" ("w_", "z_", "y_", "x_", "r_", "q_", "p_", "o_") properties.

Image types (from largest to smallest): w, z, y, r*, q*, p*, o*, x, m, s (* can be cropped).
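The selection step could look roughly like this in Python. It is only a sketch: best_src is a made-up name, the sample entry is hypothetical, and it assumes the m and s types follow the same "<type>_src" naming as the others.

```python
# Size types from largest to smallest, as listed in this thread.
SIZE_ORDER = ("w", "z", "y", "r", "q", "p", "o", "x", "m", "s")
# Types marked with * in this thread: they can be cropped.
CROPPED = frozenset("rqpo")

def best_src(image, allow_cropped=True):
    """Return (type, url) for the largest available size, or None."""
    for size in SIZE_ORDER:
        if not allow_cropped and size in CROPPED:
            continue
        url = image.get(f"{size}_src")
        if url:
            return size, url
    return None

# Hypothetical image entry; real entries contain more fields.
image = {
    "x_src": "https://example.invalid/x.jpg",
    "r_src": "https://example.invalid/r.jpg",
    "y_src": "https://example.invalid/y.jpg",
}
print(best_src(image))
print(best_src({"r_src": "r.jpg", "x_src": "x.jpg"}, allow_cropped=False))
```

Passing allow_cropped=False skips the r, q, p and o types, so an uncropped but smaller x image is preferred over a cropped one.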

Note: vk.com returns answers with content-type: application/json; charset=windows-1251


Upd 2022.04.22:

BTW, the rendered HTML is not convenient to use, so it's possible to parse the original text this way:

import xml.dom.minidom

def getText(nodelist):
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return "".join(rc)

author = '<a href="/cosplayinrussia" class="group_link">Косплей | Cosplay 18+</a>'
date = '<span class="rel_date">8 Jun 2021</span>'

a = xml.dom.minidom.parseString(author).getElementsByTagName("a")[0]
s = xml.dom.minidom.parseString(date).getElementsByTagName("span")[0]

authorText = getText(a.childNodes)
dateText = getText(s.childNodes)

print(authorText)  # Косплей | Cosplay 18+
print(dateText)  # 8 Jun 2021

https://docs.python.org/3/library/xml.dom.minidom.html

ImVantexHD commented 2 years ago

So is it no longer possible to get the original-size images? I mean, if VK served the originals before, there must still be a way to request them, right? Has anyone looked at the mobile API? Is it using the same image endpoint as the web one?

AlttiRi commented 2 years ago

I don't think that it was possible to download the "original" uploaded images before. Most likely the old way also downloaded images with the best available quality, but not the original. Re-download some old downloaded images and compare them.


As far as I know, VK always re-encodes images and videos, even if the resulting file is larger. Any JPEG re-encoding loses quality.
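The comparison suggested above can be done without any image library, for example by hashing both files. This is a small sketch with made-up function names. Identical hashes mean the old and new downloads are byte-for-byte the same file; different hashes only say that the bytes differ, not which one is better.

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """SHA-256 of a file's bytes, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_downloads(old_path, new_path):
    """Report whether two downloads are identical, plus their sizes in bytes."""
    old, new = Path(old_path), Path(new_path)
    return {
        "identical": file_digest(old) == file_digest(new),
        "old_size": old.stat().st_size,
        "new_size": new.stat().st_size,
    }
```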