rhadamanthe / host-grabber-pp

A web extension, originally designed for Firefox, to find and download media files from various hosts.
MIT License

Naming of download file. Extend strategies to optionally extract filename in addition to image URL #75

Closed (ghost closed 3 years ago)

ghost commented 4 years ago

This is probably a bigger request. Many image hosts, e.g. imgbox.com, mangle the original filename or replace it with an ID.

Example (family safe, Creative Commons): http://imgbox.com/uYfMF6p8

The HTML looks like this:

<img alt="Uyfmf6p8 o" class="image-content" id="img" onclick="rs()" src="https://images2.imgbox.com/c0/f1/uYfMF6p8_o.jpg" title="ducklings.jpg">

The downloaded file gets a cryptic name, but the original filename is stated in the title attribute. Other image hosts use other variants.
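To make the example concrete, here is a minimal sketch of reading both values from that tag. The helper `readAttribute` is hypothetical and uses a regular expression only to keep the illustration self-contained; it is not how the extension works, and a real implementation would more likely query the parsed DOM.

```typescript
// Hypothetical helper: reads one double-quoted attribute from a single HTML tag string.
// A regex is enough for this illustration; it is not a general HTML parser.
function readAttribute(tag: string, attribute: string): string | undefined {
  const pattern = new RegExp(`\\s${attribute}\\s*=\\s*"([^"]*)"`, "i");
  const match = pattern.exec(tag);
  return match ? match[1] : undefined;
}

// The tag from the imgbox example above.
const imgTag =
  '<img alt="Uyfmf6p8 o" class="image-content" id="img" onclick="rs()" ' +
  'src="https://images2.imgbox.com/c0/f1/uYfMF6p8_o.jpg" title="ducklings.jpg">';

// The download URL comes from "src", the original upload name from "title".
const imageUrl = readAttribute(imgTag, "src");
const originalName = readAttribute(imgTag, "title");
```

An attribute that is absent from the tag yields `undefined`, which leaves room for a fallback to the name taken from the URL.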

Imagehost Grabber Classic returns two strings when analyzing an image page: the URL and the filename.

My idea would be to extend the strategy to return an object containing the URL and, optionally, a filename. If no filename is returned, use the filename from the URL.
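The proposal could be sketched roughly as follows. The names `ExtractionResult`, `fileNameFromUrl` and `resolveFileName` are assumptions made for illustration and are not taken from the extension's code; the sketch only shows the fallback rule described above.

```typescript
// Hypothetical result shape: the URL is mandatory, the file name is optional.
interface ExtractionResult {
  url: string;
  fileName?: string;
}

// Derives a file name from the last path segment of a URL,
// ignoring any query string or fragment.
function fileNameFromUrl(url: string): string {
  const cut = url.search(/[?#]/);
  const path = cut === -1 ? url : url.substring(0, cut);
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);
  try {
    return decodeURIComponent(lastSegment);
  } catch (error) {
    // Malformed percent-encoding: keep the raw segment.
    return lastSegment;
  }
}

// Uses the extracted file name when present and non-blank,
// otherwise falls back to the name found in the URL.
function resolveFileName(result: ExtractionResult): string {
  const extracted = result.fileName?.trim();
  return extracted ? extracted : fileNameFromUrl(result.url);
}
```

With the imgbox example, a result carrying `fileName: "ducklings.jpg"` would be saved under that name, while a result without a file name would keep the name from the URL path.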

rhadamanthe commented 4 years ago

I do not know what to think about this feature request. The example would be easy to handle. But it opens a new door to much more complex cases: what if the image name is not an attribute but the title of the HTML page? Why not ask for a base name in a dialog? It would result in additional options and an even more complex catalog. Is it worth it? :thinking:

I must confess I had not taken the file name into account. It was not an issue for me. I will think about your proposal.

ghost commented 4 years ago

Yes, there could be complex cases where the filename is somewhere on the page. In the cases I know the filename is either in

The correct original upload filename is necessary to sort the files in the right order, and it often helps to identify the image. If the filenames are random, the order gets mixed up. This is annoying for consecutive image sets (magazines, comics). I am active on a forum where pixhost and imgbox are heavily used to share image sets.

For me (and the other users on that forum) it would suffice to

Imagehost Grabber Classic has the configuration option to prefix all downloads on a page with a left-padded index. This configuration option is probably easier to implement and would really be a great improvement.
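A minimal sketch of such an index prefix, under assumptions that are not stated in this thread: the index starts at 1, the padding width follows from the number of files on the page, and `_` separates the index from the name. The function names are hypothetical.

```typescript
// Pads a non-negative integer with leading zeros up to the given width.
function leftPad(value: number, width: number): string {
  let text = String(value);
  while (text.length < width) {
    text = "0" + text;
  }
  return text;
}

// Prefixes every file name with its 1-based position on the page, so that
// sorting by name reproduces the page order even when the names are random.
function prefixWithIndex(fileNames: string[]): string[] {
  // Assumption: the width is the digit count of the total (e.g. 12 files -> 2 digits).
  const width = String(fileNames.length).length;
  return fileNames.map((fileName, position) => `${leftPad(position + 1, width)}_${fileName}`);
}
```

The padding is what keeps the order stable: without it, a file prefixed `10_` would sort before one prefixed `2_`.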

Thank you again for your great work and for considering this!

rhadamanthe commented 3 years ago

Done. It involved huge changes, but it is somewhat cleaner now. The next catalog version will have the property set for some hosts.