Simon-Tesla / RaccoonyWebEx

A WebExtension that adds shiny features to art sites
MIT License

Download all images in a multi-image submission #111

Open Simon-Tesla opened 3 years ago

Simon-Tesla commented 3 years ago

This is related to the request in #87 but I'm splitting off this as a separate feature, since I'd want to tackle it separately.

Some sites like InkBunny and Pixiv support a submission type that has multiple downloadable images or other artifacts. It would be nice to enable the scenario where all of the downloadable items in that submission can be downloaded at once.

Note that Pixiv displays all images on a single page while IB spreads them across multiple pages. Pixiv would fall more naturally within Raccoony's current model of operation, where it scrapes everything it needs from the page it's on. It's not clear that would be possible on IB, which may preclude it from getting support for this; in that case people will need to continue to use the existing open-all-in-tabs feature for these submissions.

GreenReaper commented 9 months ago

You might be able to use the filename of the page thumbnail, appropriately modified from thumbnails/medium to files/full. Bear in mind that PNG images may have JPG thumbnails, and you would also have to trim off any '_nocustom' before the extension. There are cases where this does not work, e.g. images80/overlays/writing.png for a text or JSON file.
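
A minimal sketch of the thumbnail-to-full rewrite described above. The `thumbnails/medium` and `files/full` path segments and the `_nocustom` suffix come from the comment; the function name, the candidate extension list, and the example host are hypothetical. Because a PNG may have a JPG thumbnail, the function returns several candidate URLs to try rather than a single answer.

```typescript
// Hypothetical helper: guesses full-size file URLs from a thumbnail URL.
// The extension list is an assumption, not something the site documents.
const CANDIDATE_EXTENSIONS = ["png", "jpg", "jpeg", "gif"];

function guessFullUrls(thumbnailUrl: string): string[] {
  // Overlay icons (e.g. images80/overlays/writing.png for text or JSON
  // files) are not under thumbnails/medium, so no guess is possible.
  if (!thumbnailUrl.includes("/thumbnails/medium/")) {
    return [];
  }
  const fullUrl = thumbnailUrl.replace("/thumbnails/medium/", "/files/full/");

  // Split into base name, optional "_nocustom" suffix, and extension.
  const match = /^(.*?)(_nocustom)?\.([A-Za-z0-9]+)$/.exec(fullUrl);
  if (!match) {
    return [];
  }
  const base = match[1];
  const thumbnailExt = match[3].toLowerCase();

  // Try the thumbnail's own extension first, then the other candidates,
  // since the full-size file may use a different format.
  const extensions = [
    thumbnailExt,
    ...CANDIDATE_EXTENSIONS.filter((ext) => ext !== thumbnailExt),
  ];
  return extensions.map((ext) => `${base}.${ext}`);
}
```

A caller would still need to check which candidate actually exists, which is part of why this approach is fragile.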

Joebugg commented 4 months ago

Hmm, the KISS solution of just iteratively opening other tabs seems like a workable workaround, but yeah, when I use it on IB, I just click 'next' on specific submissions until I'm done with all the images. Doing it from just the thumbnail URLs on the first page of the submission seems fragile even if it is made to work.

Joebugg commented 2 months ago

I'm wondering how I'd add "set" functionality to site gallery plugins. Would it be sufficient to indicate which div has the "next" button to search for? Or in IB's case, all pages after the first have a number appended to the same pattern, so you could just read the number of pages and use a simple for loop to add them to the open-tabs queue? The latter seems far more resilient/stable if the site reorganizes its page layout, but the former seems more stable if they change the URL format.
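
The for-loop idea could look roughly like the sketch below. The actual URL pattern is not spelled out in this thread, so the `-p<N>` suffix, the function names, and the example URL are all placeholders; the pattern is passed in as a function so it can be swapped per site.

```typescript
// Builds the URL of page `pageNumber` from the first page's URL.
type PageUrlBuilder = (firstPageUrl: string, pageNumber: number) => string;

// Hypothetical pattern: pages after the first get "-p<N>" appended.
// This is a stand-in, not the confirmed format of any site.
const appendPageSuffix: PageUrlBuilder = (firstPageUrl, pageNumber) =>
  pageNumber === 1 ? firstPageUrl : `${firstPageUrl}-p${pageNumber}`;

// Generates one URL per page, ready to hand to an open-tabs queue.
function buildPageUrls(
  firstPageUrl: string,
  pageCount: number,
  build: PageUrlBuilder = appendPageSuffix
): string[] {
  const urls: string[] = [];
  for (let page = 1; page <= pageCount; page++) {
    urls.push(build(firstPageUrl, page));
  }
  return urls;
}
```

The page count itself would still have to be scraped from the DOM, so this does not fully avoid depending on page layout.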

Simon-Tesla commented 2 months ago

Right now, the IB plugin as implemented treats a multi-image submission as a small gallery of its own and you can use the existing 'open all in tabs' functionality to kinda work around this to get a rough approximation of the feature. That is one of the reasons why I say this may only make sense in the context of sites that have the full set of images referenced on one page (and not just links to other pages to view them, like with IB).

There currently isn't any infra for downloading multiple images at once even in the case where you have all the URLs, but that is definitely going to be the easier case to implement; it'd mostly be extending the getMedia() interface to allow an array of Media objects to be returned and handled appropriately. So pixiv wouldn't be too hard to do once the scraper was written, most likely.
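
As a rough illustration of that extension, the sketch below lets `getMedia()` return either one `Media` object or an array and normalizes the result for callers. Only the names `getMedia()` and `Media` come from the comment above; the fields on `Media`, the `SitePlugin` interface, and the helper names are assumptions and may not match Raccoony's actual code.

```typescript
// Assumed shape; the real Media type may carry different fields.
interface Media {
  url: string;
  filename?: string;
}

// Hypothetical plugin interface: getMedia() may yield one item, several,
// or null when nothing downloadable was found on the page.
interface SitePlugin {
  getMedia(): Promise<Media | Media[] | null>;
}

// Normalizes every return shape into an array so callers handle one case.
function toMediaList(result: Media | Media[] | null): Media[] {
  if (result === null) {
    return [];
  }
  return Array.isArray(result) ? result : [result];
}

// Collects everything a plugin reports for the current page.
async function collectDownloads(plugin: SitePlugin): Promise<Media[]> {
  return toMediaList(await plugin.getMedia());
}
```

Existing single-image plugins would keep working unchanged under this shape, since a lone `Media` is wrapped into a one-element list.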

The problem is multipage submissions. In theory, if the format of the image URLs is easy to generate based on the page links, that'd be an option, but in general that is super prone to being brittle, even more so than scraping the DOM as-is, which is why I try to always get the actual URLs from the DOM directly. I could see some sort of semi-automated system that pages through the submission to download it, but that requires Raccoony to track some state between page loads in addition to whatever basic support logic is needed, which could end up being fairly complex to implement.
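
To give a sense of the state that would need tracking, here is a small sketch of the per-page step as a pure function. All type and function names are hypothetical, and persisting the state between page loads (for example in extension storage) is deliberately left out.

```typescript
// Hypothetical state that would have to survive between page loads.
interface PagedDownloadState {
  submissionId: string;
  collectedUrls: string[];
  done: boolean;
}

// What the scraper reads from the DOM of the current page.
interface PageScrape {
  mediaUrl: string;
  nextPageUrl: string | null; // null when there is no "next" link
}

interface StepResult {
  state: PagedDownloadState;
  navigateTo: string | null; // next page to load, or null when finished
}

// Records the current page's media URL and decides where to go next.
function advance(state: PagedDownloadState, page: PageScrape): StepResult {
  // Skip duplicates so that reloading a page does not queue it twice.
  const collectedUrls = state.collectedUrls.includes(page.mediaUrl)
    ? state.collectedUrls
    : [...state.collectedUrls, page.mediaUrl];
  const done = page.nextPageUrl === null;
  return {
    state: { ...state, collectedUrls, done },
    navigateTo: page.nextPageUrl,
  };
}
```

Even this leaves open how to resume after an interrupted run and when to clear stale state, which is where most of the complexity would likely sit.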

Simon-Tesla commented 2 months ago

Another option would be to have Raccoony generate its own API calls to get these lists, assuming the site supports such an API and it's accessible with the user's cookies, but that's its own can of worms and from what I can tell not often the case for places where it would be useful. (IB requires a separate login for its API and it has to be enabled by the user in the IB settings, for example, both of which I view as non-starters for Raccoony support. The idea of having Raccoony store any kind of authentication info gives me the heebie-jeebies, even for sites that implement this as an API token as is best practice instead of passing around a raw username/password combo for it.)