amotl closed this issue 6 years ago
While working on PatZilla, we ran into problems converting TIFF images from the patent data universe with the regular Python PIL library, because the images may use some obscure TIFF features.
After digging for the relevant details, we found an old comment in the patzilla.util.image.convert.to_png function:
# Unfortunately, PIL can not handle G4 compression.
# Failure: exceptions.IOError: decoder group4 not available
# Maybe patch: http://mail.python.org/pipermail/image-sig/2003-July/002354.html
and the file header says it actually is a "bi-level group 4"-type image:
file DATA/Lentille/PatentImages/WO2011015115-1.tiff
TIFF image data, big-endian, compression=bi-level group 4, [...]
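The compression scheme reported by `file` above can also be checked from Python without any imaging library. The following is only a minimal sketch: the function name `tiff_compression` is made up for illustration, and it inspects just the first image directory of a classic (non-BigTIFF) file. It relies on the TIFF convention that tag 259 holds the compression scheme and that the value 4 denotes CCITT Group 4.

```python
import struct

# TIFF tag 259 is "Compression"; the value 4 means CCITT Group 4 (T.6) fax encoding.
COMPRESSION_TAG = 259
CCITT_GROUP4 = 4


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a TIFF byte string.

    Hypothetical helper for illustration; not part of PatZilla or Patent2Net.
    """
    if data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    elif data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    else:
        raise ValueError("not a TIFF file")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for index in range(entry_count):
        # Each directory entry is 12 bytes: tag, field type, count, value/offset.
        start = ifd_offset + 2 + 12 * index
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == COMPRESSION_TAG:
            # Compression is a single SHORT stored inline in the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # TIFF default when the tag is absent: no compression
```

Reading the file with `open(path, "rb").read()` and comparing the result against `CCITT_GROUP4` would tell whether an image is of the kind that caused the trouble described here.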
To mitigate the issue, we had to resort to the "convert" tool of ImageMagick fame and never looked back. Let's just go ahead and reuse this recipe from PatZilla in Patent2Net, if you don't have any objections.
Hey @amotl, thanks for checking this issue and also making the PR!
I have a compiled Pillow with full support for TIFF images; at least it worked with the thousands of EPO images I tested. But I don't think it's easy to compile it in different environments, especially on Windows, so I think your proposal makes sense for P2N.
Hey @rfaga,
good to know this actually is possible with Pillow. Would you mind sharing your installation instructions for others to reproduce? Maybe I will also give it a try.
Otherwise, if you also think using ImageMagick for the thumbnailing task is a more approachable solution for newcomers, let's polish the PR #25 and use it as the default implementation?
I would keep the Pillow-based implementation and maybe add a toggle switch (environment flag) for choosing between both strategies explicitly. Alternatively, we can use the ImageMagick-based strategy as an implicit fallback to the Pillow-based one.
With kind regards, Andreas.
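The toggle switch mentioned above could look roughly like the following. This is only a sketch under assumptions: the environment variable name `P2N_IMAGE_BACKEND` and the function `select_backend` are hypothetical and do not exist in Patent2Net.

```python
import os

# The two strategies discussed in this thread.
VALID_BACKENDS = ("pillow", "imagemagick")


def select_backend(environ=None, default="pillow"):
    """Pick the image conversion strategy from an environment flag.

    The variable name P2N_IMAGE_BACKEND is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("P2N_IMAGE_BACKEND", default).strip().lower()
    if value not in VALID_BACKENDS:
        raise ValueError(
            "P2N_IMAGE_BACKEND must be one of %s, got %r" % (", ".join(VALID_BACKENDS), value)
        )
    return value
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test; calling `select_backend()` without arguments reads the real environment.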
@amotl I actually think Pillow is distributing a compiled version that works with tiff: https://pypi.python.org/pypi/Pillow/5.0.0
I just typed pip install Pillow, and it didn't even check for my -dev packages to compile against. According to https://pillow.readthedocs.io/en/latest/releasenotes/5.0.0.html#compressed-tiff-images it's using libtiff, so pip install will probably work in any environment.
But I still think ImageMagick could be a fallback: we could try to convert with the environment's PIL first and, if we get an error, go with ImageMagick for the following tries. What do you think?
Regards, Roberto
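The fallback idea from this comment, trying Pillow first and switching to ImageMagick for all following conversions once an error shows up, might be sketched as follows. The class and function names are hypothetical and not taken from Patent2Net or PatZilla. The sketch assumes that Pillow is importable as `PIL` and that ImageMagick's `convert` command is on the PATH.

```python
import subprocess


def convert_with_pillow(src, dst):
    # Imported lazily so that a missing Pillow also triggers the fallback.
    from PIL import Image

    with Image.open(src) as image:
        image.save(dst, "PNG")


def convert_with_imagemagick(src, dst):
    # Equivalent to running "convert input.tiff output.png" in a shell.
    subprocess.run(["convert", src, dst], check=True)


class FallbackConverter:
    """Try the primary strategy; after its first failure, stick to the fallback."""

    def __init__(self, primary=convert_with_pillow, fallback=convert_with_imagemagick):
        self.primary = primary
        self.fallback = fallback
        self.use_fallback = False

    def convert(self, src, dst):
        if not self.use_fallback:
            try:
                self.primary(src, dst)
                return "primary"
            except Exception:
                # Deliberately broad: covers "decoder group4 not available"
                # as well as ImportError when Pillow is not installed.
                self.use_fallback = True
        self.fallback(src, dst)
        return "fallback"
```

A single `FallbackConverter()` instance would be shared across all images of a run, so the failing Pillow attempt is paid only once. Whether the switch should be permanent or retried per image is a design choice this sketch does not settle.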
I experimented with ImageMagick 10 years ago. It worked fine on all environments and was (at that time) easy to use and configure. But the choice is yours.
Well, I could manage those files. The problem wasn't handling the TIFF files but saving them properly: I had to add a "binary" switch. Now it works fine (almost, for the Lentille case).
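What the "binary" switch refers to is not spelled out in this comment. One plausible reading, which is purely an assumption, is that the downloaded image data has to be written with the file opened in binary mode. A minimal sketch of that reading, with a hypothetical helper name:

```python
import os
import tempfile


def save_image_bytes(path, payload):
    # "wb" writes the bytes untouched. Text mode ("w") rejects bytes on
    # Python 3 and, on Windows under Python 2, rewrites b"\n" as b"\r\n",
    # which corrupts image data.
    with open(path, "wb") as handle:
        handle.write(payload)


# Usage: round-trip a few bytes that include a newline byte.
payload = b"MM\x00*\n\r\n\x00"
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "sample.tiff")
    save_image_bytes(target, payload)
    with open(target, "rb") as handle:
        round_trip = handle.read()
```

If the bytes read back differ from the bytes written, the file was most likely opened in text mode somewhere along the way.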
Hi there,
I just happened to install a fresh release of my OS. From now on I will be using Homebrew instead of MacPorts, and everything under the hood (Xcode, etc.) is up to date as well. So, I will try installing Pillow to see whether it supports compressed TIFF images properly now.
> The problem wasn't handling the TIFF files but saving them properly: I had to add a "binary" switch.
May I humbly ask which amendments you had to make? Then I would add them to the current stream of the "develop" branch / the amendments of #25. Thanks!
Cheers, Andreas.
Using current software actually solved my problem: Pillow on High Sierra works perfectly. Thanks for listening.
Hi there,
we found that FusionImages.py fails creating thumbnails. The runbook we used to reproduce the problem is:
Recipe
Setup "patent2net" module
For installing the software, please follow the instructions outlined on https://docs.ip-tools.org/patent2net/setup.html.
Setup PIL successor
Acquire images
Attempt to read TIFF image
Exceptions
When using Pillow
When using PIL
Further notices
However, opening the file in question on a Mac OS X machine works fine: