jtara1 / imgur_downloader

Python script/class to download an entire Imgur album in one go into a folder of your choice.
MIT License

Exception: Failed to find regex match in html #26

Closed (specu closed this issue 1 year ago)

specu commented 3 years ago
from imgur_downloader import ImgurDownloader
downloader = ImgurDownloader('https://imgur.com/EbK0erY', '.')

results in

    downloader = ImgurDownloader('https://imgur.com/EbK0erY', '.')
  File "/usr/local/lib/python3.7/dist-packages/imgur_downloader/imgurdownloader.py", line 157, in __init__
    self.json_imageIDs = list(self._init_image_ids_with_json(html=html))
  File "/usr/local/lib/python3.7/dist-packages/imgur_downloader/imgurdownloader.py", line 189, in _init_image_ids_with_json
    raise Exception("Failed to find regex match in html")
Exception: Failed to find regex match in html

version: imgur-downloader==0.2.2

wyatt8740 commented 3 years ago

I think Imgur changed how/where they put the JSON string containing the image list in their album pages.

I was able to make the script work again by changing the following in imgur_downloader/imgurdownloader.py:

search = re.search('(item:.*?};)', html, flags=re.DOTALL)

to:

search = re.search('image *: *{.*', html)

and a few lines later:

json_search = json.loads(search)

to:

json_search = json.loads(search+'}')

The second change is necessary because my new regex apparently drops the final curly brace needed to make the match a valid JSON string. And I'm lazy.
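For anyone who would rather not patch the brace back on by hand, here is a minimal sketch of another way to pull the object out. It is not the code from imgurdownloader.py: the helper name `extract_image_json` and the sample HTML (including its field names) are made up for illustration, and it assumes the page still contains an `image : {...}` object that is valid JSON. It locates the marker with a regex, then lets `json.JSONDecoder.raw_decode` parse exactly one object from the opening brace and ignore whatever follows.

```python
import json
import re


def extract_image_json(html):
    """Parse the JSON object that follows an `image : {` marker in html.

    Hypothetical helper, not part of imgur_downloader.
    """
    # Match the key and colon, stopping right before the opening brace.
    match = re.search(r'image\s*:\s*(?=\{)', html)
    if match is None:
        raise ValueError("Failed to find regex match in html")
    # raw_decode parses one JSON value starting at the given index and
    # returns (value, end_index); text after the value is left alone.
    obj, _end = json.JSONDecoder().raw_decode(html, match.end())
    return obj


# Made-up page fragment; the real Imgur markup may look different.
sample_html = (
    '<script>widget({ image : {"hash": "EbK0erY", "is_album": false}, '
    'group: {} });</script>'
)
image_data = extract_image_json(sample_html)
print(image_data["hash"])
```

Because the decoder finds the closing brace itself, the regex does not need to capture the whole object and nothing has to be appended afterwards. It would still fail if the embedded object is not strict JSON (for example, unquoted keys).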

This is not thoroughly tested (I used it on exactly one gallery), but you might want to try it. I'd make a proper patch file, but my local version has other edits (for cookie authentication and such), and I just wanted to throw my answer out there before I went to bed.

jtara1 commented 1 year ago

gallery-dl supports downloading from the variety of links this project supports. I'm going to archive this repo.