Open bharathyes opened 9 years ago
I have a few concerns about using image titles as filenames:
1 - Most images I've seen uploaded to imgur don't have titles. So do we just fall back to the hex for them? What happens if some images have titles and others don't? That's kinda weird in the downloaded folder.
2 - Suitably escaping the titles so that they're safe to use as filenames across operating systems (my Python is a little sketchy, not sure how much the os module does to help with this)
3 - Scope creep. This is meant to be a quick and simple grabber, not a fully featured API client, and part of me wants to keep it that way simply to be a good citizen to imgur. If we start implementing a tonne of features that mean you can circumvent using the real API for reads, that feels a bit wrong to me. Means imgur can't rate limit / monitor usage of their services via the API key mechanism. Might sound a little ironic, me suggesting that having written this lib in the first place, but I do believe there's a certain line between "quick and dirty tool with minimal impact" and "deliberately trying to break the rules".
Interested to hear discussion on this though, if it's a feature a lot of people want, I'd consider accepting a pull request for it (with points 1 and 2 suitably addressed), but I'm not sure I fancy writing it.
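To make point 1 concrete, here is a minimal sketch of one possible fallback rule, not something the tool currently does. The function name `choose_filename` and the "title - id" layout are assumptions for illustration only. It keeps the image ID in every filename so that titled and untitled images can sit in the same folder without colliding, and it assumes the title has already been made filesystem-safe (point 2).

```python
def choose_filename(image_id, extension, title=None):
    """Build a filename, falling back to the bare image ID when no title exists.

    Hypothetical helper: the names and the "title - id" layout are assumptions.
    """
    # Accept both "jpg" and ".jpg".
    ext = extension if extension.startswith(".") else "." + extension
    # Treat None, empty, and whitespace-only titles as "no title".
    title = (title or "").strip()
    if not title:
        return image_id + ext
    # Keep the ID so two images with the same title never overwrite each other.
    return "{} - {}{}".format(title, image_id, ext)
```

With this rule an untitled image keeps its current name (for example `a1b2c3.jpg`), while a titled one gains a readable prefix.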
I'd personally prefer if that feature is kept out of the main tool. A third-party script to retrieve page names from the image filename should be pretty easy to write. It could be linked from the README or added as a separate file in this repo.
@alexgisby Responses to your primary concerns:
@alexgisby and @nodiscc As I mentioned, I am new to Python and thus would really appreciate it if you implement this feature or maybe point me to resources that could help me to create them myself. As @nodiscc suggested, it could be added as an extension to the original script,
The titles-as-filenames problem isn't so much encoding, it's special characters. Slashes can often be problematic, as can colons. Then you get onto the filename length problem: different operating systems have different filename length limits, so now you need to be truncating correctly. And would anyone actually want a 255 (or more) character long filename in their download directory?
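The special-character and length problems described above could be handled roughly as follows. This is a sketch under stated assumptions, not part of the existing script: `sanitize_title`, the 100-character default, and the `_` replacement are all hypothetical choices. It also does not handle Windows reserved names such as `CON`, and it truncates by characters, whereas many filesystems limit filenames by bytes.

```python
import re

# Characters that Windows rejects in filenames, plus both path separators
# and ASCII control characters. Assumed to be a reasonable common subset.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_title(title, max_length=100, replacement="_"):
    """Return a conservative, filesystem-friendly version of an image title."""
    # Replace every unsafe character with the replacement string.
    cleaned = _UNSAFE.sub(replacement, title)
    # Collapse runs of whitespace, then trim spaces and dots from both ends
    # (Windows does not allow names ending in a dot or a space).
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    # Truncate, then trim again in case the cut left a trailing space or dot.
    return cleaned[:max_length].rstrip(" .")
```

A caller would still need a fallback for titles that sanitize to an empty string, for instance by using the image ID alone.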
The script currently allows you to choose the name of the directory you download into (second parameter on the command line), and so the user can choose a directory name that makes sense to them, rather than this script attempting to guess at it.
We definitely don't have access to the original filenames they were uploaded as from this tool. All this script does is scrape the HTML output for tags, and so we don't know anything apart from the image ID (the hex value). Getting the original filename would be an API call, which requires an API key and is definitely out of scope for this tool.
@alexgisby I understand the complexities involved. I just wanted to know a method of giving meaningful names to the stored images. Also, the reason I mentioned the .zip download was to point out that Imgur themselves are okay with us downloading public galleries; I wasn't asking for the same to be implemented here.
Anyway thanks for the quick replies and more importantly for sharing your code.
I wanted this for album titles (but not images), so I've put up PR #35.
Would be really nice if the title of the image were used as the image's filename instead of the hex filename used by Imgur. Would be really useful in making sense of the downloaded files.