4pr0n / ripme

Downloads albums in bulk
MIT License

Create filter options where you can specify a kind of link to look for and rip #68

Open metaprime opened 10 years ago

metaprime commented 10 years ago

For instance, a lot of websites link to generic image hosts like imagebam or imagevenue, which wrap the actual images in their own web pages. Rather than writing a ripper for every website that links to them (there are many, and some are forums or similar), provide an option to search a page for links to these image hosts and then extract the images from the linked pages.
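A rough sketch of the "search a page for links to these image hosts" step, in Python for brevity. The host list and the names `IMAGE_HOSTS`, `LinkCollector` and `find_host_links` are made up for illustration and are not part of ripme.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical filter list: hosts whose pages wrap the actual image.
IMAGE_HOSTS = {"imagebam.com", "imagevenue.com"}


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_host_links(html, base_url, hosts=IMAGE_HOSTS):
    """Return absolute links on the page that point at one of the given hosts."""
    collector = LinkCollector()
    collector.feed(html)
    matches = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links against the page URL
        hostname = (urlparse(url).hostname or "").lower()
        # Match the host itself or any subdomain (e.g. img123.imagevenue.com).
        if any(hostname == h or hostname.endswith("." + h) for h in hosts):
            matches.append(url)
    return matches
```

Matching on the parsed hostname rather than on a substring of the URL avoids false hits on unrelated domains that merely contain a host's name.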

At the very least, maybe there should be APIs built into the application to extract image links from these image host pages, to make writing rippers for various websites easier.

I think the option to search pages for certain types of links might be a version 1.1 kind of update, but the internal API to help with writing rippers could be added right away.

Note that some hosting sites might be URL forwarding sites, or sites which display ads, and may require retrying a link or "clicking through".
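The retry / "click through" case could look something like the sketch below. It assumes the caller supplies two hypothetical callables: `fetch(url)`, which returns the page body, and `is_interstitial(body)`, which recognises an ad or "continue to image" page. Neither exists in ripme; they only stand in for whatever HTTP layer is used.

```python
import time


def fetch_with_retries(fetch, url, is_interstitial, attempts=3, delay=0.0):
    """Call fetch(url) until it returns a page that is not an interstitial.

    fetch and is_interstitial are caller-supplied (hypothetical) callables.
    Raises RuntimeError if no usable page is seen within `attempts` tries.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            body = fetch(url)
            if not is_interstitial(body):
                return body
        except OSError as exc:  # network errors such as urllib's URLError
            last_error = exc
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before asking for the same link again
    raise RuntimeError(
        f"no usable page for {url} after {attempts} attempts"
    ) from last_error
```

This only covers hosts that serve the real page on a repeated request. A host that requires following a specific "continue" link would need its own handling.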

4pr0n commented 10 years ago

I tried this in rip v2, trying to grab gallery-dump albums, which use various spammy image hosts:

https://github.com/4pr0n/rip/blob/master/sites/site_gallerydump.py#L63

It was ugly. I think I could do it if there were static methods in each ripper to handle individual URLs.
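One way the "static methods in each ripper" idea might be shaped, sketched in Python. `ImagebamResolver`, `RESOLVERS` and `resolve` are hypothetical names, and the regex is a placeholder that does not reflect imagebam's real page markup.

```python
import re
from urllib.parse import urlparse


class ImagebamResolver:
    """Hypothetical per-host resolver; the markup it matches is assumed."""

    HOST = "imagebam.com"

    @staticmethod
    def can_handle(url):
        """True if the URL belongs to this host or one of its subdomains."""
        host = (urlparse(url).hostname or "").lower()
        return host == ImagebamResolver.HOST or host.endswith("." + ImagebamResolver.HOST)

    @staticmethod
    def extract_image_url(page_html):
        """Pull the direct image URL out of the host's wrapper page, if present."""
        match = re.search(r'<img[^>]+id="main-image"[^>]+src="([^"]+)"', page_html)
        return match.group(1) if match else None


# Any ripper can loop over this list instead of knowing each host itself.
RESOLVERS = [ImagebamResolver]


def resolve(url, page_html):
    """Hand a single link to the first resolver that claims it."""
    for resolver in RESOLVERS:
        if resolver.can_handle(url):
            return resolver.extract_image_url(page_html)
    return None
```

Because the methods are static and take only a URL or a page body, a site-specific ripper could call them on individual links without constructing a full ripper for the image host.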

I get that you're saying it could rip any site & handle the image links inside the site. This is a good idea but will require some thought for design & implementation.

4pr0n commented 10 years ago

Note: similar request in #70

metaprime commented 10 years ago

DownThemAll (FF plugin) has a nice scheme where it will scan sites for certain types of media and then show you a list of media you could possibly download, with the obvious images pre-selected. We could do the same kind of thing for these kinds of links, maybe?
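A small sketch of that kind of pre-selected list, in Python. The extension list and the `Candidate` / `build_candidates` names are assumptions for illustration; this is not how DownThemAll itself decides what to pre-select.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Assumed definition of an "obvious image": the URL path ends in one of these.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


@dataclass
class Candidate:
    url: str
    selected: bool  # True if the link should be ticked by default


def build_candidates(urls):
    """Deduplicate links and pre-select those whose path ends in an image extension."""
    seen = set()
    candidates = []
    for url in urls:
        if url in seen:
            continue
        seen.add(url)
        path = urlparse(url).path.lower()  # ignore query strings and case
        candidates.append(Candidate(url, path.endswith(IMAGE_EXTENSIONS)))
    return candidates
```

Links to image host pages would show up unselected here, so the user could tick them or a filter for known hosts could select them as well.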