HoverHell / RedditImageGrab

Downloads images from sub-reddits of reddit.com.
GNU General Public License v3.0

Downloads extra pics from imgur albums... #44

Closed: gcomyn closed this issue 8 years ago

gcomyn commented 8 years ago

When downloading from imgur albums, one or two extra 'pics' get downloaded... I've seen this with an album downloader called 'albumr' as well. If you change line 125 in redditdownload.py to the following, it fixes this:

match = re.compile(r'\"hash\":\"(.[^\"]*)\",\"title\"')

Since the album itself has a hash as well, you have to match only those hashes that are followed by a title.
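A minimal sketch of why the trailing `"title"` requirement matters. The `SAMPLE` string is a made-up, simplified stand-in for the data embedded in an album page, and `LOOSE` is only an assumption about what a pattern without the title part would look like; neither is taken from redditdownload.py. `STRICT` is the regex proposed above.

```python
import re

# Hypothetical, simplified stand-in for the JSON found in an imgur album page.
# The album entry has a "hash" but no "title" right after it; the images do.
SAMPLE = (
    '{"album":{"hash":"ALBUM1","cover":"x"},'
    '"images":[{"hash":"img001","title":"","ext":".jpg"},'
    '{"hash":"img002","title":"second","ext":".jpg"}]}'
)

# Assumed looser pattern: any "hash" value, no matter what follows it.
LOOSE = re.compile(r'\"hash\":\"(.[^\"]*)\"')

# Pattern proposed in this issue: only hashes directly followed by "title".
STRICT = re.compile(r'\"hash\":\"(.[^\"]*)\",\"title\"')

loose_hashes = LOOSE.findall(SAMPLE)
strict_hashes = STRICT.findall(SAMPLE)

print(loose_hashes)   # includes the album's own hash
print(strict_hashes)  # image hashes only
```

In this sample the looser pattern also picks up the album's own hash, which would then be turned into an image URL and downloaded as an extra file.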

rachmadaniHaryono commented 8 years ago

Can you give an imgur album example for that?

gcomyn commented 8 years ago

Every one that I downloaded did this...

Try this one:

http://imgur.com/a/NLMAm

It's from:

https://www.reddit.com/r/pic/

I tested it (removed the title from the match) and it downloaded 2 extra files... Then I deleted those pics, fixed the py file, ran it again, and it didn't.

rachmadaniHaryono commented 8 years ago

Thanks for the example. I have tried the new regex you gave and, as you said, it gives two extra pics:

['http://i.imgur.com/2gUGa.jpg', 'http://i.imgur.com/NLMAm.jpg']

But I don't think these are valid images from that album. The first one is actually an album, and the second one has the same imgur ID as the current album.

So I don't think it is a bug.
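The point about the second URL can be checked mechanically. This is a rough sketch using only the URLs quoted in this thread; the helper `imgur_id` is a hypothetical name introduced for illustration and is not part of RedditImageGrab.

```python
from posixpath import basename, splitext
from urllib.parse import urlparse

album_url = "http://imgur.com/a/NLMAm"
extra_urls = ["http://i.imgur.com/2gUGa.jpg", "http://i.imgur.com/NLMAm.jpg"]


def imgur_id(url):
    """Return the last path component of a URL without its file extension."""
    return splitext(basename(urlparse(url).path))[0]


album_id = imgur_id(album_url)

# Extra "images" whose ID is really the ID of the album itself.
same_as_album = [u for u in extra_urls if imgur_id(u) == album_id]

print(album_id)
print(same_as_album)
```

A check like this only catches the URL built from the album's own ID; it says nothing about the other extra URL, which is why filtering at the regex stage is the more direct fix.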

gcomyn commented 8 years ago

They are not valid images for the gallery, so they should be removed... If you add the title to the regex, it will not save the extra images and will only get the images from the gallery.

rachmadaniHaryono commented 8 years ago

You are right, I made a mistake when testing it. Your regex is the one that filters out those two pictures. I will make a pull request shortly. Thank you.