mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

nhentai saves images as the wrong file extension #63

Closed RitoPls closed 6 years ago

RitoPls commented 6 years ago

It seems to occur with most or all of the comics I downloaded from nhentai. If you need an example for a test case you could use (NSFW) https://nhentai.net/g/219213/

Win8 Python 3.5 gallery-dl 1.1.1

When I open one of the PNG files it downloaded with IrfanView, it gives me a warning telling me to rename the file from .png to .jpg.

The files are still viewable as *.png, but the warnings get annoying in IrfanView.

mikf commented 6 years ago

nhentai advertises these files as *.png in every way possible:

It is obviously a `.jpg` file, but gallery-dl is only doing what it is told by nhentai.

A possible solution would be to look at the first few bytes when downloading and see if the file header matches the suggested filename extension, but implementing this properly might take a bit.
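A rough sketch of what such a header check could look like, for illustration only. The names `SIGNATURES`, `detect_extension` and `corrected_name` are hypothetical and not part of gallery-dl, and the table only covers JPEG, PNG and GIF:

```python
# Leading "magic" bytes of a few common image formats.
# Hypothetical table for illustration; not taken from gallery-dl.
SIGNATURES = (
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)


def detect_extension(header):
    """Return the extension implied by the leading bytes, or None if unknown."""
    for magic, ext in SIGNATURES:
        if header.startswith(magic):
            return ext
    return None


def corrected_name(filename, header):
    """Return `filename` with its extension swapped if the header disagrees."""
    detected = detect_extension(header)
    stem, dot, ext = filename.rpartition(".")
    if detected is None or not dot:
        # Unknown file type or no extension at all: leave the name alone.
        return filename
    # Treat ".jpeg" and ".jpg" as the same type.
    current = "jpg" if ext.lower() == "jpeg" else ext.lower()
    if current == detected:
        return filename
    return stem + "." + detected
```

With this, a file advertised as `1.png` whose first bytes are `FF D8 FF` would be renamed to `1.jpg`, while files with an unrecognized header keep whatever name the site suggested.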

Bfgeshka commented 6 years ago

You probably do not have to read bytes and compare them manually. Doesn't Python have proper MIME support?

mikf commented 6 years ago

There are Python wrappers for libmagic, but they are not part of the standard library. The standard library only has the mimetypes module, which maps MIME types to file extensions and vice versa.
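To illustrate why `mimetypes` alone does not help here, a small example. The filename below is made up; `guess_type` and `guess_extension` only look at the name or the MIME string and never read file contents:

```python
import mimetypes

# The guess is based purely on the filename, so a JPEG that was
# saved under a .png name is still reported as a PNG.
mime, encoding = mimetypes.guess_type("actually_a_jpeg.png")

# The reverse mapping goes from a MIME type to a file extension.
ext = mimetypes.guess_extension("image/png")
```

Here `mime` comes back as `image/png` regardless of what the file really contains, so the module is useful for translating a `Content-Type` header into an extension, but not for detecting a mislabeled file.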

Manually supporting the 3 or 4 most common image types shouldn't be a problem. It's more a question of how to properly fit this into the current download implementation.