alexgisby / imgur-album-downloader

Python script/class to download an entire Imgur album in one go into a folder of your choice.
MIT License
340 stars 59 forks

Image titles as filename of the downloaded images. #18

Open bharathyes opened 9 years ago

bharathyes commented 9 years ago

It would be really nice if the title of each image were used as its filename instead of the hex filename used by Imgur. That would be really useful in making sense of the downloaded files.

alexgisby commented 9 years ago

I have a few concerns about using image titles as filenames:

1 - Most images I've seen uploaded to Imgur don't have titles. So do we just fall back to the hex for them? What happens if some images have titles and others don't? That's kinda weird in the downloaded folder.

2 - Suitably escaping the titles so that they're safe to use as filenames across operating systems (my Python is a little sketchy, so I'm not sure how much os does to help with this).

3 - Scope creep. This is meant to be a quick and simple grabber, not a fully featured API client, and part of me wants to keep it that way simply to be a good citizen to imgur. If we start implementing a tonne of features that mean you can circumvent using the real API for reads, that feels a bit wrong to me. Means imgur can't rate limit / monitor usage of their services via the API key mechanism. Might sound a little ironic, me suggesting that having written this lib in the first place, but I do believe there's a certain line between "quick and dirty tool with minimal impact" and "deliberately trying to break the rules".

Interested to hear discussion on this though, if it's a feature a lot of people want, I'd consider accepting a pull request for it (with points 1 and 2 suitably addressed), but I'm not sure I fancy writing it.

nodiscc commented 9 years ago

I'd personally prefer that this feature be kept out of the main tool. A third-party script to retrieve page names from the image filename should be pretty easy to write. It could be linked from the README or added as a separate file in this repo.

bharathyes commented 9 years ago

@alexgisby Responses to your primary concerns:

  1. Even though many images do not have a title, most of them have an album name. So in the absence of an image title, the album name with a serial suffix can be substituted, and in the absence of an album title as well, the hex value can be used.
  2. I am new to Python programming, but could you somehow use Unicode for naming, to enable uniformity and also support multiple languages?
  3. I understand your concerns about cluttering this simple tool. I just wanted to say that this would be a nice addition. Also, Imgur allows users to download public galleries as .zip files, and the filenames in those aren't hex values. I think they are the original filenames from the time of upload, but I am not sure.
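The fallback order in point 1 could be sketched roughly as below. This is only an illustration, not code from the script: `choose_filename` and its parameters are hypothetical names, and it assumes the image and album titles have already been obtained somehow.

```python
def choose_filename(image_id, extension, index, image_title=None, album_title=None):
    """Pick a filename: image title, else album title plus serial, else the ID.

    All parameter names are hypothetical; nothing here is taken from the script.
    """
    if image_title and image_title.strip():
        # Preferred: the image's own title.
        base = image_title.strip()
    elif album_title and album_title.strip():
        # Fallback 1: album title with a zero-padded serial suffix.
        base = "{} {:03d}".format(album_title.strip(), index)
    else:
        # Fallback 2: the ID Imgur assigned to the image.
        base = image_id
    return base + extension
```

The zero-padded serial keeps files in album order when sorted by name. The result would still need escaping before being used on disk.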

@alexgisby and @nodiscc As I mentioned, I am new to Python and would really appreciate it if you implemented this feature, or pointed me to resources that could help me write it myself. As @nodiscc suggested, it could be added as an extension to the original script.

alexgisby commented 9 years ago

The titles-as-filenames problem isn't so much encoding as special characters. Slashes can often be problematic, as can colons. Then you get onto the filename length problem: different OSes have different filename length limits, so now you need to truncate correctly. And would anyone actually want a 255 (or more) character long filename in their download directory?
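For what it's worth, a rough sketch of the escaping and truncation involved might look like this. It is an assumption-laden illustration rather than anything in the script: `safe_filename`, the replacement character, and the 100-byte budget are all made up for the example, and it does not handle Windows reserved names such as `CON` or `NUL`.

```python
import re
import unicodedata

# Characters rejected in Windows filenames (which include "/" and ":"),
# plus ASCII control characters.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(title, fallback, max_bytes=100):
    """Return a filename-safe version of title, or fallback if nothing is left.

    max_bytes is an arbitrary illustrative budget, well under the common
    255-byte limit, leaving room for an extension.
    """
    name = unicodedata.normalize("NFC", title)
    name = _UNSAFE.sub("_", name)
    # Leading/trailing dots and spaces cause trouble on some systems.
    name = name.strip(" .")
    # Truncate by UTF-8 bytes; errors="ignore" drops a character cut in half.
    name = name.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
    name = name.rstrip(" .")
    return name or fallback
```

Passing the image ID as `fallback` means a title made up entirely of unsafe characters still yields a usable name.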

The script currently allows you to choose the name of the directory you download into (second parameter on the command line), and so the user can choose a directory name that makes sense to them, rather than this script attempting to guess at it.

We definitely don't have access to the original filenames they were uploaded as from this tool. All this script does is scrape the HTML output for tags, and so we don't know anything apart from the image ID (the hex value). Getting the original filename would be an API call, which requires an API key and is definitely out of scope for this tool.
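To illustrate why only the ID is available, here is a minimal sketch of that kind of scraping. It is not the script's actual code: it assumes the tags in question are `img` tags and that image URLs look like `i.imgur.com/<id>.<ext>`, and `ImageIdCollector` and `extract_image_ids` are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Assumed URL shape, for illustration only: i.imgur.com/<id>.<ext>
_IMAGE_URL = re.compile(r"i\.imgur\.com/([A-Za-z0-9]+)\.(jpe?g|png|gif)")


class ImageIdCollector(HTMLParser):
    """Collect (image_id, extension) pairs from img tags in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # A valueless src attribute is reported as None, so fall back to "".
        src = dict(attrs).get("src") or ""
        match = _IMAGE_URL.search(src)
        if match:
            self.images.append((match.group(1), "." + match.group(2)))


def extract_image_ids(html):
    """Return every (image_id, extension) pair found in the given HTML."""
    collector = ImageIdCollector()
    collector.feed(html)
    return collector.images
```

Nothing in the matched URL carries a title or an original filename, which is the limitation described above.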

bharathyes commented 9 years ago

@alexgisby I understand the complexities involved. I just wanted to know of a way to give meaningful names to the stored images. Also, the reason I mentioned the .zip download was to point out that Imgur themselves are okay with us downloading public galleries; I wasn't asking you to implement the same here.

Anyway thanks for the quick replies and more importantly for sharing your code.

greyfade commented 7 years ago

I wanted this for album titles (but not images), so I've put up PR #35.