PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

[CTS 38] How to handle TIFF? #181

Closed by PhilterPaper 2 years ago

PhilterPaper commented 2 years ago

Opened 2017 August 12 at 17:28:03 by PhilterPaper

Jeffrey Ratcliffe, Johan Vromans, and I have been discussing what to do to handle TIFF images. The current code is very limited in the types of TIFF files it will handle, and what it does have seems to be a bit buggy. See bugs RT 118047 (#42) and RT 84665 (#2). Jeff has implemented a wrapper package for the libtiff.a library, called Graphics::TIFF, and proposes to replace the current PDF::Builder (as well as PDF::API2) TIFF handling with this new code.

The fly in the ointment is that libtiff.a does not appear to be available on all platforms (Perl distributions). Thus, Graphics::TIFF will not run for some PDF::Builder users. We've been debating what to do about this, and there seem to be at least three choices:

  1. Fix and extend current TIFF code in PDF::Builder.
     Pro: The code remains pure Perl, and thus usable on any system or distribution. It requires no external packages or prerequisites to install.
     Con: TIFF is hideously complex and can appear in untold permutations of tag types, settings, data layouts, compression methods, and who knows what else. Someone (ultimately me) would be responsible for implementing all this code, testing it, and maintaining it. To be as complete as libtiff.a, this threatens to send us down the proverbial Rabbit Hole. It will be a huge amount of code. Even just fixing the current problems is not trivial.
  2. Use only Graphics::TIFF.
     Pro: This greatly simplifies the code in PDF::Builder, by throwing off much of the burden of decoding TIFF image files to someone else (the maintainers of libtiff.a). Why reinvent the wheel if someone else has it for a current, maintained project? It should make it much easier for PDF::Builder to handle many more TIFF flavors with (fingers crossed) little or no effort on our part.
     Con: As mentioned up top, libtiff.a does not appear to be available on all systems or in all Perl distributions. It does happen to be shipped with Strawberry Perl, which is what I develop on. As open source, it should be possible to ship its source and build it on systems missing it, but it's questionable as to whether all distributions have the proper tools to compile and link libtiff.a correctly (especially on Windows and on web servers). If Jeff can find a way to build libtiff.a on all systems missing it, we could then be free to prereq Graphics::TIFF for PDF::Builder. Otherwise, there will have to be some sort of "soft fail" if Graphics::TIFF is found to be unavailable on a system. Possibly, we could ship the old (current) TIFF code and allow users to manually install it in lieu of the new Graphics::TIFF-using code. However, it would still be buggy and limited.
  3. Offer both in a Hybrid approach. At runtime, check if Graphics::TIFF is installed. If it is, set a global flag to use it. If it's not, set a global flag to use the old Perl code. A single message to STDERR, reminding the user to install Graphics::TIFF if possible, would be output (don't annoy users by having multiple reminders in a run). There might be some sort of call to suppress the message and force one library or the other for those tired of seeing the reminder.
     Pro: With both libraries (pure Perl old TIFF code and new Graphics::TIFF-using code) available, all PDF::Builder installations will at least run to some extent. That way, no user should be any worse off than they are today.
     Con: It is unlikely that we will put any major effort into updating the current (old) pure Perl TIFF code, given that the majority of users will probably be able to use Graphics::TIFF. It is possible that we may never even get around to fixing the current bugs, unless someone shows that they're quite simple to fix. To do massive extensions to TIFF is out of the question. Thus, users dependent on the old TIFF code are unfortunately going to find themselves hung out to dry.
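The runtime check described in option 3 could look roughly like the following Perl sketch. The function name `choose_tiff_backend`, its `force` and `silent` options, and the wording of the reminder are all hypothetical, made up for illustration; none of this is PDF::Builder's actual interface.

```perl
use strict;
use warnings;

my %loadable;      # module name => 1 if it could be loaded, 0 if not
my $reminded = 0;  # set once the install reminder has been printed

# Hypothetical helper: returns 'library' when $module can be loaded,
# otherwise 'pure_perl'. Options (also hypothetical):
#   force  => 'library' | 'pure_perl'   skip detection entirely
#   silent => 1                         suppress the one-time reminder
sub choose_tiff_backend {
    my ($module, %opt) = @_;

    # Let the caller force one implementation and skip detection.
    return $opt{force} if defined $opt{force};

    # The name goes into a string eval, so accept only Foo::Bar style names.
    die "Bad module name '$module'\n"
        unless $module =~ /\A\w+(?:::\w+)*\z/;

    # Probe only once per module; a failed require inside eval is not fatal.
    $loadable{$module} //= (eval "require $module; 1") ? 1 : 0;
    return 'library' if $loadable{$module};

    # Remind the user at most once per run, on STDERR.
    if (!$reminded && !$opt{silent}) {
        warn "TIFF: $module not found, using the older pure Perl code.\n";
        $reminded = 1;
    }
    return 'pure_perl';
}

print "TIFF backend: ", choose_tiff_backend('Graphics::TIFF'), "\n";
```

Caching the probe result and the "already reminded" flag keeps repeated image loads cheap and limits the reminder to one per run, which is the behavior the option asks for.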

Anyone with knowledge of how to build (or obtain) libtiff.a on systems that are missing it, so that Graphics::TIFF can be required, please chime in!

TIFF images are normally going to be a minor player (after JPEG, GIF, and PNG format images), but apparently they are important to some users. We do have to watch the Return on Investment, and not invest an excessive amount of effort into supporting TIFF, given how relatively few users are likely to need and use it.

PhilterPaper commented 2 years ago

Comment 2017 August 14 at 19:03:15 by PhilterPaper

If the current build process (for Graphics::TIFF) doesn't work, and libtiff.a is unavailable...

If this problem with libtiff.a unavailability is confined to Windows, perhaps something could be done with either Windows .DLLs or even .exe (executable commands). The idea is to avoid having to build anything when installing Graphics::TIFF. If we can do something to make use of prebuilt code (.DLL or .exe) rather than having to build in a Perl distribution, that would be good. Windows (DOS) does not have to build executables for each version.

If libtiff.a problems also apply to Linux systems (specific Perl distributions), that may be a problem. While build tools such as compilers are widely available on Linux systems, some systems (such as web servers) may have such tools removed. This would probably make it difficult to install most packages on such systems. Any sort of build might require different source and methods than on a Windows system.

Surely other CPAN packages have faced such issues -- how have they handled it? Once Graphics::TIFF is available, everything should be good.

PhilterPaper commented 2 years ago

Comment 2017 August 22 at 18:27:31 by PhilterPaper

Can anyone help out with this? Does anyone have the requested Windows toolchain?

Hi Phil,

On 22/08/17 14:54, Phil M Perry wrote: "I just wanted to check in and see what your plans were regarding Graphics::TIFF, such as whether you're going to try to get it installed on any system (possibly shipping or building libtiff in the install), or if I should assume there may be some systems it won't be available on."

My trouble is: 1) In order to ship and build libtiff with Graphics::TIFF for Windows, I would have to set up a Windows toolchain, which I don't have. 2) Even if I did, it wouldn't get us much more information about how many more systems were supported, as the only ones that seem to be smokers on CPAN are Strawberry Perl, where the 5.26 build works anyway, and anything before that cannot work until the MakeMaker bug is fixed, due to dmake not liking the makefiles that MakeMaker produces. So - if someone finds a way of making the libtiff/Graphics::TIFF build work on other Windows systems, I am happy to upload it. Otherwise, I have more important things to spend my time on than operating systems I'm not interested in and things I am not sure can be made to work.

In that case, should all TIFF function be disabled (i.e., it's currently so broken it's not worth even offering the current code), or should I give users at least the current code? If more than 95% of Perl systems would be able to install Graphics::TIFF, we might just write off the remaining <5% (Tough luck!). Do you have evidence that enough of the current TIFF code is broken that there's no point in shipping any of it, or is it just a few functions (that might be removed)?

Evidently the G3 and G4 code is totally broken and the LZW decoder is buggy. Uncompressed or flate seems to be OK. My feeling would be to make TIFF support dependent upon Graphics::TIFF, but then I am biased because on POSIX systems, it all just works. Obviously in the end it is your decision. Regards, Jeff

So, it appears that most (but not all) of the current TIFF support is broken (does anyone dispute this?). We have no idea how many systems Graphics::TIFF can (or can't) run on (libtiff.a is needed). It is unknown whether libtiff code could be built directly into Graphics::TIFF -- presumably if we ship libtiff source for building into Graphics::TIFF, we could simply build libtiff.a by itself (even on Windows). Strawberry Perl (Windows) and all Linux systems (?) include libtiff.a, and only need a thin Graphics::TIFF wrapper.

It's not reasonable to fix and extend the TIFF code in pure Perl, to cover all the function of Graphics::TIFF, so our choice (if Graphics::TIFF is not available) is to

  1. simply drop TIFF support
  2. dynamically switch to the existing broken/limited TIFF support
  3. dynamically switch to working subset of existing TIFF support (remove non-working functionality)

Graphics::TIFF will not be an installation prerequisite for PDF::Builder, until such time as we can be sure that it can be made available on all Perl installations.

Unless something major happens in the next few weeks, I do not expect to see any TIFF changes in the upcoming PDF::Builder 3.006 release. Maybe something will be done for the later 3.007 release.

PhilterPaper commented 2 years ago

Comment 2017 August 23 at 10:26:01 by sciurius

Are there any outstanding complaints about problems with TIFF? I have the strong feeling no one uses it. If so, just leave everything as it is now and add a note that TIFF support may be incomplete and/or buggy.

It is a pity to spend much time on something that will not be used. BTW: This applies to PDF::Builder. Having a libtiff wrapper in the form of Graphics::TIFF is good, but for other purposes.

PhilterPaper commented 2 years ago

Comment 2017 August 23 at 10:50:34 by PhilterPaper

Well, there are a couple of open bug reports (84665/#2 and 118047/#42), and Jeff has put in quite a bit of effort to use libtiff (via Graphics::TIFF wrapper). I agree that probably very few people use TIFF images, but for them, it's apparently very important. I do have to be careful about going down the Rabbit Hole of putting far too much time and effort into this, for the return we get.

Putting in usage of Graphics::TIFF for systems with libtiff seems to be a no-brainer. I'm hoping that Jeff or someone could do something to ensure that Graphics::TIFF is available on all systems (i.e., building it on the fly). However, he doesn't seem to be interested in that sort of effort, in which case someone else might be interested in carrying the torch. The big question seems to be whether I should make some effort to use the existing TIFF code (should Graphics::TIFF not be installed), possibly removing function known to be broken. My preference would be to prereq Graphics::TIFF and just use Jeff's new code, assuming that it can be made available on all systems. If it can't, I don't think it's worth trying to do a lot of work on a pure Perl solution, unless it turns out that a major fraction of Perl installations can't install Graphics::TIFF.

I have other things to concentrate on right now, but plan to return to the TIFF issue in a month or so. I will look at what it will take to use the new Graphics::TIFF code if that library is installed, otherwise to just use the current code (rather than removing it, as Jeff suggests). At least, no one will lose current working functionality. Graphics::TIFF will not be a hard prereq unless someone comes up with a way to ensure that all Perl installations can install it.

PhilterPaper commented 2 years ago

Comment 2017 November 26 at 16:51:21 by PhilterPaper

I think I have TIFF support pretty much squared away (RT 118047 and RT 84665). I used Jeffrey Ratcliffe's new library (Graphics::TIFF, a wrapper around libtiff.a), while keeping the old code as a fallback in case Graphics::TIFF isn't installed or the user wishes to not use it for some reason (detected at runtime, so no manually swapping libraries). In CCITT Group 3 and Group 4 fax, I had to flip bytes around if the fill order was 2 (LSB first), and for some reason the BlackIs1 flag has to be flipped around. It seems to work on the ones I've tried, but I'm sure there are many more TIFF architectures that will need work over time.
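As an illustration of the fill order handling, here is a minimal Perl sketch. It assumes that "flipping bytes around" means reversing the bit order within each byte, which is what TIFF FillOrder 2 (least significant bit first) implies. The function and constant names are hypothetical and are not taken from PDF::Builder; the BlackIs1 adjustment is not shown.

```perl
use strict;
use warnings;

# TIFF FillOrder tag: 1 = most significant bit first, 2 = least significant bit first.
use constant FILL_ORDER_LSB_FIRST => 2;

# Reverse the bit order inside every byte of $data.
sub reverse_bits_per_byte {
    my ($data) = @_;
    # unpack 'B*' lists each byte's bits high-to-low;
    # pack 'b*' reads them back low-to-high, mirroring each byte.
    return pack 'b*', unpack 'B*', $data;
}

# Return strip data in MSB-first order, whatever order it was stored in.
sub normalize_fill_order {
    my ($data, $fill_order) = @_;
    return $data unless $fill_order == FILL_ORDER_LSB_FIRST;
    return reverse_bits_per_byte($data);
}

# 0x01 becomes 0x80 and 0xF0 becomes 0x0F.
print unpack('H*', normalize_fill_order("\x01\xF0", 2)), "\n";   # prints 800f
```

The pack/unpack pair avoids building a 256-entry lookup table by hand; for very large strips a table with `tr` or `vec` might be faster, but that has not been measured here.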

PhilterPaper commented 2 years ago

Comment 2018 September 04 at 12:45:01 by PhilterPaper

A quick note: as far as I can tell, the only functional regression in using Graphics::TIFF rather than the pure Perl code (in PDF::Builder) is that only a filename can be given: no file handles are accepted for file operations. If I can find a way to close the file and get its path and name to pass into the Graphics::TIFF routine, that might work. A very low priority work item for now...
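One possible workaround, sketched below in Perl, is not to recover the original path at all but to copy the handle's contents to a temporary file and pass that file's name along. This is a different technique from the one floated above, and `tiff_filename_for` is a hypothetical helper name, not part of PDF::Builder or Graphics::TIFF.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Scalar::Util qw(openhandle);

# Hypothetical helper: given a filename or an open file handle, return a
# filename that a filename-only routine can open.
sub tiff_filename_for {
    my ($source) = @_;

    # Anything that is not an open handle is assumed to be a filename already.
    return $source unless openhandle($source);

    # Copy the rest of the handle's data, in binary mode, to a temporary
    # file that is removed automatically when the program exits.
    binmode $source;
    my ($out, $tmpname) = tempfile(SUFFIX => '.tif', UNLINK => 1);
    binmode $out;
    while (read($source, my $buffer, 65536)) {
        print {$out} $buffer;
    }
    close $out or die "Cannot finish writing $tmpname: $!\n";
    return $tmpname;
}

# A plain filename passes through unchanged.
print tiff_filename_for('scan.tif'), "\n";   # prints scan.tif
```

The cost is an extra copy of the image on disk, and the copy starts at the handle's current read position, so a caller that has already read part of the file would need to rewind it first.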