Closed: pmeier closed this issue 5 years ago
To give you a quick reply, there is another issue like this - #3833 - and as in that issue, I would suspect that you have different versions of one of the Pillow dependencies installed on the different machines.
To try and understand your problem better, is there a reason you need to read the images and create hashes of that data, instead of just reading the file contents and comparing those hashes?
> I would suspect that you have different versions of one of the Pillow dependencies installed on the different machines.
I followed this advice to get the versions for Ubuntu.
libjpeg-3b10b538.so.9.3.0 => /usr/local/lib/python3.5/dist-packages/PIL/./.libs/libjpeg-3b10b538.so.9.3.0 (0x00007fc86404f000)
libopenjp2-b3d7668a.so.2.3.1 => /usr/local/lib/python3.5/dist-packages/PIL/./.libs/libopenjp2-b3d7668a.so.2.3.1 (0x00007fc863dd8000)
libz-a147dcb0.so.1.2.3 => /usr/local/lib/python3.5/dist-packages/PIL/./.libs/libz-a147dcb0.so.1.2.3 (0x00007fc863bc3000)
libtiff-8267adfe.so.5.4.0 => /usr/local/lib/python3.5/dist-packages/PIL/./.libs/libtiff-8267adfe.so.5.4.0 (0x00007fc863928000)
I didn't find a way to do the same on Windows. Is there one?
> is there a reason you need to read the images and create hashes of that data, instead of just reading the file contents and comparing those hashes?
I'm not sure if I got your question right. What is the difference between "read the images" and "reading the file contents"? Ultimately I want the following functionality:
I have an image database that contains, among other things, a URL and an MD5 hash for each entry. I have a Python script that does this:
I think I now know what you are getting at. I changed the download to
with open(fpath, "wb") as fh:
    fh.write(requests.get(url).content)
and this works on both of my test platforms. I took the detour through PIL to be able to save all images as JPEG. The disk space I would save that way is not worth the hassle of fiddling with the system libraries, so I'm closing this. Thanks for the swift support.
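For readers who land here with the same problem, the approach described above (write the downloaded bytes to disk unchanged, then hash those bytes) might be sketched as follows. This is a minimal sketch, not the script from this issue: the helper names `md5_of_file`, `is_up_to_date`, and `fetch_if_needed`, the chunk size, and the timeout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the raw bytes stored at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large images do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_up_to_date(path, expected_md5):
    """Return True if `path` exists and its bytes match `expected_md5`."""
    path = Path(path)
    return path.is_file() and md5_of_file(path) == expected_md5.lower()


def fetch_if_needed(url, fpath, expected_md5):
    """Download `url` to `fpath` unless a matching file is already there.

    Returns True if a download happened, False if it was skipped.
    """
    if is_up_to_date(fpath, expected_md5):
        return False
    # Imported here (hypothetical choice) so the hash helpers above can be
    # used even where requests is not installed.
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    # The bytes are written exactly as received; no image library decodes
    # or re-encodes them, so the hash does not depend on codec versions.
    with open(fpath, "wb") as fh:
        fh.write(response.content)
    return True
```

Because the file on disk is a byte-for-byte copy of the server response, the digest should be the same on every platform as long as the server keeps returning the same bytes.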
You might want to consider answering the former question about how to find the library versions on Windows if you know an answer to that. It could come in handy for future clueless users like me.
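One platform-independent option that may help with the Windows question is Pillow's own `PIL.features` module, which can report the versions of the bundled libraries without any `ldd` equivalent. The sketch below is hedged: the helper name `pillow_library_versions` is hypothetical, and `features.version()` is only available in newer Pillow releases, so the code checks for it and falls back to `None`. Older installations, such as the Python 3.5 setup shown above, may not report versions this way.

```python
def pillow_library_versions(names=("jpg", "jpg_2000", "zlib", "libtiff")):
    """Map each Pillow codec name to the version Pillow reports, or None.

    Returns an empty dict if Pillow is not installed at all.
    """
    try:
        from PIL import features
    except ImportError:
        return {}

    versions = {}
    for name in names:
        # features.check() tells whether support for the codec was compiled
        # in; features.version() exists only in newer Pillow releases.
        if features.check(name) and hasattr(features, "version"):
            versions[name] = features.version(name)
        else:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for codec, version in pillow_library_versions().items():
        print(codec, version)
```

In recent Pillow releases, running `python -m PIL` on the command line prints a similar report, which works the same way on Windows and Linux.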
What did you do?
As part of a project I'm downloading a large number of images from the internet with the requests library and saving them with PIL (pillow). To avoid re-downloading them every time I run this, I hard-coded their MD5 hashes and check beforehand whether they match.
What did you expect to happen?
I was expecting this to work independent of the machine and OS, since the MD5 is invariant to this.
What actually happened?
I developed on Ubuntu and everything works as expected. Today I tried it on Windows, but the MD5 hashes calculated on Ubuntu didn't match. I investigated a little further and found that the images differ slightly in the number of bytes on disk:
Ubuntu output:
Windows output:
PIL?
What are your OS, Python and Pillow versions?
I also tried this on https://repl.it/languages/Python3, which runs "Linux" and Python 3.7.4. This results in the same output as on my Ubuntu machine. Unfortunately I don't have any other setups to test this further.