Add support for saving multipage images, esp. TIFF and PDF

vashek commented 10 years ago

While reading multiple frames/pages is supported, I couldn't find a way to write them, despite some mentions in the code and in the changelog of version 1.1.7c1.

vashek commented 9 years ago

If anyone needs it, I have solved the need for myself. PDFs by calling tiff2pdf, but more interestingly, TIFFs by creating a "file object wrapper class" that is able to append a TIF to an existing one (it simulates a new empty file for Pillow to write into, while actually appending to an existing TIF and fixing the IFDs and IFD pointer). Kind of a hackish solution that I'm not very proud of but if anyone needs it, feel free to ask.

radarhere commented 9 years ago

Thinking about this, would it be good for Pillow to open a multipage/frame image and then save it again with all those frames by default - to be able to roundtrip the image without losing all but the first frame?

Or would that be considered too much of a change from the existing behaviour?

vashek commented 9 years ago

I would have liked that if that was the declared behavior from the start, but as you say, I think this would now be too much of a change. I would prefer adding a save_all method or an optional parameter to save or something like that.

radarhere commented 9 years ago

Fair enough, and others are highly likely to be thinking the same way. Does anyone else have any thoughts as to what form this new api should take? My current thoughts are of copying register_save into register_save_all, and that users would pass in save_all as part of the params variable in Image's existing save method.

radarhere commented 9 years ago

Considering that Pillow doesn't presently read or write PDFs of any kind, I think that multipage PDFs are perhaps a bit beyond the current state. GIF multipage saving has been supported more or less by Pillow for a long while, and it has just been a question of merging that functionality into the main system.

You mention TIFFs. Are there any other formats that would be considered useful?

vashek commented 9 years ago

PDF writing is supported (in a basic way), is it not? The only three formats that I can think of are GIF, TIFF and PDF, but then again I am not an expert on the multitude of other possible formats.

radarhere commented 9 years ago

Ah, yes, you're right, PDF writing is supported. Thanks for pointing that out.

jkfindeisen commented 8 years ago

What about TIFF multipage saving? Is it supported by now? Or what would have to be done in order to support it in a similar API like PDF multipage saving?

hugovk commented 8 years ago

@jkfindeisen Multipage saving was added #1445 which was released in the latest Pillow 3.0.0.

I'll close this issue, if there's something still pending, we can re-open it.

jkfindeisen commented 8 years ago

@hugovk Does #1445 really implement multipage writing for TIFFs?

In the commit in #1445 is no reference to TIFF. Also in the current master https://github.com/python-pillow/Pillow/blob/master/PIL/TiffImagePlugin.py there is no save_all method which I would expect if multipage writing for TIFFs would be implemented.

hugovk commented 8 years ago

@jkfindeisen Apologies, that's only for PDF. Sorry for the confusion!

radarhere commented 8 years ago

Hi. To add support for TIFF multipage saving, it's a matter of reading the TIFF specification and making the necessary changes to TiffImagePlugin. I have tried previously, but without success.

If you would like to have a shot at it, you're welcome to do so.

jkfindeisen commented 8 years ago

Hi, I will try it. I might come back when I have something.

Koudy commented 8 years ago

@hugovk, what about TIFF multipage saving? Still not done?

hugovk commented 8 years ago

@Koudy Still not done. PRs welcome!

Koudy commented 8 years ago

@hugovk, unfortunately I not a python developer. Just need that function.

piranna commented 8 years ago

Any advance on that? With some guidance I would be able to take a look on this...

lambdafu commented 8 years ago

@vashek Do you still have your "hackish code" for multi-page TIFFs? The approach doesn't sound too hackish at all, but somewhat appropriate, and I think we should try get it cleaned up and merged.

You wrote: "TIFFs by creating a "file object wrapper class" that is able to append a TIF to an existing one (it simulates a new empty file for Pillow to write into, while actually appending to an existing TIF and fixing the IFDs and IFD pointer)."

vashek commented 8 years ago

Yes, and it's been working for me without problems so far. Here you are. Sorry about the lack of comments. multiPageTiff.zip

piranna commented 8 years ago

@vashek, could you be able to provide a example of how to use it? Do you mind to upload somewhere on GitHub or Gist so I could be able to add the comments, or is it fine if I upload them here?

piranna commented 8 years ago

I have clean it and uploaded to https://gist.github.com/piranna/4c0787a12cf5885eb36a5b561e3347fc if you are interested in merge it upstream or anybody wants to use it. It seems it can be used both opening a TIFF file and write the content of another "by hand" and by opening a TIFF file and using the returned file descriptor as the one where Pillow will automatically write the new TIFF pages, isn't it?

lambdafu commented 8 years ago

@piranna Thanks for the cleanup (I reviewed it and can confirm it's equivalent to vashek's code). I don't know where logging.trace is coming from, and I had to add an UTF-8 marker. Finally I added support for filehandles instead of filenames (so you can use BytesIO, for example) and a simple test routine.

Running this on Pillow/Tests/images/multipage.tiff shows corruption in the third image, though, so it's debug time :cactus:

https://gist.github.com/lambdafu/fc3ef0863d6b986bc74030f92bb6154f

lambdafu commented 8 years ago

Now I am eating my words, the original code works fine, so they are not 100% equivalent.

piranna commented 8 years ago

Hum, maybe I added a typo during clean-up stage? Good you added tests! :-D

lambdafu commented 8 years ago

tag in self.Tags on Enum is always False, so I changed it to tag in set(self.Tags) and IntEnum. Now it's working: https://gist.github.com/lambdafu/fc3ef0863d6b986bc74030f92bb6154f

I guess a save_all function is now easy to implement. But there is no interface in PIL to create multi-frame Image objects...

piranna commented 8 years ago

I added support for filehandles instead of filenames (so you can use BytesIO, for example)

That's a great adition :-D

tag in self.Tags on Enum is always False, so I changed it to tag in set(self.Tags) and IntEnum.

That's curious, it was working for me in other places, I'll need to review them... Good idea to use IntEnum instead.

I guess a save_all function is now easy to implement. But there is no interface in PIL to create multi-frame Image objects...

Maybe is it done using extra dimensions on the image?

lambdafu commented 8 years ago

Image.new("RGB", (3, 100, 100)) doesn't work. But more importantly, this would mean all frames must have the same dimension, something that is not true for multipage-TIFF files.

Maybe PIL should just export the AppendingTiffWriter object for now.

piranna commented 8 years ago

But more importantly, this would mean all frames must have the same dimension, something that is not true for multipage-TIFF files

I'm not sure if this is really needed, dimensions are just defined as a tuple... Are you sure that all the fields need to maintain their range? If so, maybe they are filled with null data?

P.D.: I've fixed two minor PEP8 things.

lambdafu commented 8 years ago

Well, fact is that PIL right now can load existing multiframe TIFF files but not create or save them. With the code from the gist, we can implement save_all to save them. But it will still not be able to create them from scratch. The AppendingTiffWriter object can be used to create multiframe TIFFs from scratch, but that's still not a generic API for multiframe Image objects that can be used for GIF and PDF as well.

What I was pointing out is that a multiframe Image should be able to have frames with different xy-sizes, so a single (nrframes, x, y) argument to Image.new to create arbitrary multiframe images would not be sufficient. For example, the multipage.tiff from the test suite has image sizes (10,10), (10,10), (20,20), so it could not be created with such a simple interface.

However, there are plenty of ways to create an appropriately powerful interface, and I don't think we should wait for that to be written before working on save_all and exporting the class to at least allow users to create multiframe TIFF files (even if it is TIFF specific). With only save_all and not exporting AppendingTiffWriter, everybody who needs more than save_all will have to duplicate the code in the gist.

lambdafu commented 8 years ago

Ok, hopefully we can make progress on this one now :)

vashek commented 8 years ago

Great, thanks guys, I'll be happy if my code makes it to Pillow and becomes useful for others. One comment on the improvements made to AppendingTiffWriter by @piranna / @lambdafu: I see that you're using enum, which is a beautiful solution that I look forward to using myself as well, but it's only supported by Python from version 3.4, so I don't think you can use it in Pillow. Am I missing something?

Also: sorry about the logging.trace - that was from my own private addition to the standard logging module that I made for extra-verbose messages that I felt were too much even for the debug level. :) Either delete it altogether, or change it to the way other debugging messages work in TiffImagePlugin. (Which, incidentally, should be changed to the standard logging interface eventually, though.)

piranna commented 8 years ago

However, there are plenty of ways to create an appropriately powerful interface, and I don't think we should wait for that to be written before working on save_all and exporting the class to at least allow users to create multiframe TIFF files (even if it is TIFF specific). With only save_all and not exporting AppendingTiffWriter, everybody who needs more than save_all will have to duplicate the code in the gist.

I agree on that, we could add this functionality to TIFF images and later move on to a generic solution when solutions for other formats appear. We could also add AppendingTiffWriter as a helper class until that happens, too...

I see that you're using enum, which is a beautiful solution that I look forward to using myself as well, but it's only supported by Python from version 3.4, so I don't think you can use it in Pillow. Am I missing something?

You have it available too on the enum34 module.

lambdafu commented 8 years ago

Removing enum helped to pass travis on Python 2.7. Everything else still fails. I can reproduce Py3 failure locally, though.

vashek commented 8 years ago

@lambdafu: Ah, makes sense, I don't think I've ever tried my class with Python 3.x, I'm still tied to Py 2.7 in the project I made it for. Sorry again, I should have warned you. :(

piranna commented 8 years ago

So the problem with the third TIFF image only happens on Python3? I haven't tested yet the code, sorry :-(

lambdafu commented 8 years ago

@vashek No worries, that's entirely expected. I am tied to Py2.7, too, but I have leeway to contribute upstream, which is of course in everybodies interest. You saved me a ton of time by doing the TIFF work. In comparison, playing table tennis with travis is relaxing :)

lambdafu commented 8 years ago

@piranna the third-image corruption was just the "in self.Tags" problem. By now, I have found and fixed a bug in palette reload #2139 and am now massaging the code for compatibility.

piranna commented 8 years ago

Great! :-)

lambdafu commented 8 years ago

Ok, I squashed all changes and we should now hit all-green on Windows and Unix. Thanks, folks!

python-pillow / Pillow

Add support for saving multipage images, esp. TIFF and PDF #733