Closed jasonwpalmer closed 12 months ago
Arrright,
Sooo... There's XMP involved. It might actually be easier to just use Adobe's XMP toolkit. It's actually (believe it or not) BSD licensed. At least have a look at it.
Anyway, the XMP data is in a separate IRB (see Image resource IDs and the com.twelvemonkeys.imageio.plugins.psd.PSDXMPData class). And, there's some interesting duplication of the TIFF/Exif and IPTC metadata in the XMP. I think it's using some kind of hash/digest to test for integrity (i.e. if one is changed without the other). We have to keep these in sync, I think. But still very doable.
Re: GitHub attachments, yes, agree. I've asked GitHub for this functionality, and it's on their list. But they don't see it as an important enough feature I guess. Feel free to ask for it, you too.
Anyway, just rename your .tif files _tif.jpg and the quite lame GitHub attachment filter is defeated. ;-)
Harald K
Yes, I read the docs and downloaded Adobe's XMP Toolkit. My boss is hesitant to use an Adobe library because he thinks it might be a bit heavy. But I am going to take a look. Honestly - it would be nice to take a look and then add support for Writing XMP to 12 monkeys. I will supply some before and after TIFs so we can compare the metadata before and after adding the fields. And we can see if both the XMP and IPTC-NAA blocks change. A hash huh that checks that both fields are in sync? - How on earth did you figure that out? :)
Maybe I misunderstood. You are saying that there is duplication between the EXIF and the XMP.
When you say TIFF/EXIF are you referring to Photoshop Image Resource 1028?
The docs say that Photoshop stopped using IPTC-NAA with CS5 (Image Resource ID#1028). I assumed at this point they switched exclusively to XMP (ID#1060) and no longer used IPTC-NAA. But you are suggesting that both are being updated and they need to be kept in sync.
Or are you suggesting another block (Somewhere else in the EXIF not IPTC-NAA ID#1028) needs to be kept in sync with the XMP block (ID#1060)?
OK - so from what I am seeing (And I am using Photoshop SAAS so I have the latest) both #1028 and #1060 are updated by entering info in the File Info dialog in Photoshop.
I think I understand what you are saying and I am seeing it happen.
Very interesting - and stupid really. Maybe somebody should mention to those Adobe guys that they shouldn't duplicate data - ever. :) (I also just noticed how lightweight the Adobe XMP library is :))
Can't say how good or "heavy" the Adobe XMP implementation really is, however the spec is quite heavy, so implementing it completely will take some time... But we might be able to at least implement what is needed for your use case, and then build on it from there.
The above was just from the top of my head (with my daughter pulling my arm and asking 1097 times if I was ready yet while typing...), so the details may not be 100% correct, but at least it should be in principle.
The reason for the data duplication is mainly backwards compatibility, I think. If you look at the PSD format, there's a lot of data duplication. However, you can still open files from new versions in older versions of PS, and it will use the older data. This is also why we have to keep fields in sync. But yes, in many cases revolution is easier (or cheaper) than evolution. ;-)
Do you have some sample before/after files for me? If you don't like the renaming strategy, feel free to use normal email, or share via DropBox.
Harald K
Here are a before and after adding keywords...
I will have to find a few more. Tuesday I can get 5 recent before and afters.
All right!
Here's what I've found so far: The "marketing-keywords.tif" is slightly mangled (seems it didn't survive the byte order change, but that is likely Adobe's fault), did you use icafe TIFFTweaker by any chance? ;-)
Also, there's quite a bit of duplication.
TIFF/Exif:
This is the container structure of the file (fields here were changed in "marketing-keywords.tif").
Contains the following interesting tags:
305/Software: Adobe Photoshop CC (Macintosh) (ASCII)
306/DateTime: 2013:12:05 19:02:45 (ASCII)
33432/Copyright: Copyright 2009 by itemMaster.com All Rights Reserved (ASCII)
...
700/XMP
The XMP data contains some interesting fields, duplicated from the above,
plus most fields from the IPTC data. These fields seem unchanged between the files.
dc:rights: Copyright 2009 by itemMaster.com All Rights Reserved (String)**
photoshop:City: Skokie (String)*
photoshop:Country: United States of America (String)*
photoshop:Instructions: This image may not be copied, altered, reproduced by any means, or transferred without written permission of itemmaster.com (String)*
photoshop:State: IL 60077 (String)*
xap:CreateDate: 2013-12-04T08:35:17-06:00 (String)***
xap:CreatorTool: Adobe Photoshop CC (Macintosh) (String)†
xap:ModifyDate: 2013-12-05T19:02:45-06:00 (String)†
...
33723/IPTC
The IPTC fields seem unchanged between the files.
2:40/Instructions: This image may not be copied, altered, reproduced by any means, or transferred without written permission of itemmaster.com (String)
2:62/DigitalCreationDate: 20131204 (String)
2:63/DigitalCreationTime: 083517-0600 (String)
2:90/City: Skokie (String)
2:95/StateProvince: IL 60077 (String)
2:101/Country: United States of America (String)
2:116/CopyrightNotice: Copyright 2009 by itemMaster.com All Rights Reserved (String)
34377/Adobe
PSD resources. Nothing of interest, so far.
37724/ImageSourceData
This block is HUGE (2407912 bytes, 1/3 of the file),
and contains PSD data (8BIM) in TIFF container's byte order...
I haven't been able to parse it yet, as it is different from the 34377 structure,
but looking at it in a hex viewer, it seems to contain some interesting info.
*) Mapped from same field in IPTC
**) Mapped from same field in both IPTC and TIFF/Exif
***) Mapped from the two date/time fields in IPTC
†) Mapped from the corresponding field in TIFF/Exif
Oh... And the mapping between IPTC and XMP seems to be standardized as well..
Regards,
Harald K
Apologies.
I had a very recent TIFF marketing image after I added 2 "Keyword" and a "Source" attribute in the File Info dialog in Photoshop, but I couldn't find the before TIFF. So I used an older before and after that I was unsure of - and quite possibly I ran TIFFTweaker on it :). In fact, I am almost positive I did something to it because Photoshop now refuses to open it.
I appreciate you spending the time to check them out - so I apologize for not giving you a better sample.
I have attached a new TIFF - Photograph was recently taken by an itemMaster Photographer and then subsequently edited in Photoshop CS5 by an Editor at itemMaster. Then I opened up the edited TIFF and added KeyWords and Source via the File Info dialog, however now I am using Photoshop 2015 - but this is the software ultimately adding the 2 File Info fields.
Keywords: 12 Monkeys, HaraldK Source: HaraldK
Hopefully this is a better starting point. I am surprised to see that we are inserting...
Copyright 2009, 2010 by itemMaster.com All Rights Reserved
Maybe I should lay off the Adobe guys...
Okay, good stuff!
I can find the source and keywords in both IPTC and XMP (IPTC source is photoshop:source in XMP, IPTC keywords is dc:subject, both as expected).
Also the TIFF DateTime tag is reflected in the XMP as xap:ModifyDate.
BTW: Here's a better document, describing the IPTC to XMP mapping.
So, to update we need to update both structures. But that should be very doable.
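Keeping the two structures in sync mostly comes down to a lookup table plus a comparison. Here is a minimal sketch, assuming both blocks have already been flattened into plain string maps. The class and method names are made up for illustration and are not part of the TwelveMonkeys API. The pairs are the ones seen in the sample files in this thread; the dataset number 2:115 for Source comes from the IPTC IIM spec rather than from the dumps above. Multi-valued fields such as keywords are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IptcXmpSync {
    // IPTC "record:dataset" -> XMP property (illustrative subset only)
    static final Map<String, String> IPTC_TO_XMP = new LinkedHashMap<>();
    static {
        IPTC_TO_XMP.put("2:40", "photoshop:Instructions");
        IPTC_TO_XMP.put("2:90", "photoshop:City");
        IPTC_TO_XMP.put("2:95", "photoshop:State");
        IPTC_TO_XMP.put("2:101", "photoshop:Country");
        IPTC_TO_XMP.put("2:115", "photoshop:Source");
        IPTC_TO_XMP.put("2:116", "dc:rights");
    }

    /** Returns the XMP property names whose value differs from the mapped IPTC value. */
    static List<String> outOfSync(Map<String, String> iptc, Map<String, String> xmp) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, String> pair : IPTC_TO_XMP.entrySet()) {
            String iptcValue = iptc.get(pair.getKey());
            String xmpValue = xmp.get(pair.getValue());
            // Only IPTC fields that are actually present are compared
            if (iptcValue != null && !iptcValue.equals(xmpValue)) {
                mismatches.add(pair.getValue());
            }
        }
        return mismatches;
    }
}
```

A writer could run a check like this before saving and either refuse to write or copy the changed value across.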
PS: Would be nice to have an image with updated copyright, to see if TIFF, IPTC and XMP were all modified as I expect.
Harald K
Great. I skimmed over the spec yesterday. Trying to understand how the xmp is serialized and the xmp data structures and such - the idea of padding might get a bit difficult, but I think it will be fun and a challenge to implement.
I need to track down where the copyright is coming from. It looks like it came from Photoshop so it may very well be that the editor that added the Clipping Path is using a script or macro - or however they do it in Photoshop to add the Copyright. I will speak to the Department head on Tuesday and see if they have all editors use a common script or what exactly is going on with that.
Obviously, this has little to do with this request - but now I need to check this out because it's wrong and shouldn't be happening.
Also - you'll notice that I think out loud sometimes. And I understand that we have quite a Time-Zone gap between us (I'm in Chicago, IL Central-Standard-Time). Most of the time - I'll answer my own questions if you give me a few hours. So please don't feel obligated to answer every time I post something - I know how busy I am and I'm guessing it's at least the same for you. However, that being said - I'll completely obsess over a problem until I can find the answer. With a little guidance - we can use that to our advantage here. :)
OK - Copyright should be correct now. It was my fault again.
To review:
1. Photograph taken recently in itemMaster studio.
2. Lightroom is used to convert from camera raw to TIFF.
3. Photoshop CS6 is used to edit the image (add a Clipping Path).
4. Then I opened the edited TIFF with Photoshop 2015 and added the same Keywords and Source as last time.
I just realized that writing a new image might not work for my use-case. I realized that there is a bunch of metadata that I will lose - including the Clipping Path (unacceptable). So can we do this by updating the metadata in place? I know you said it could be done, but I also understand that this adds complexity and challenges with offsets, etc. What do you think?
We'll have to make sure we pass any metadata along to the new image, so we don't lose any information. So I don't think it will matter much if we write a new image or modify an existing. Might be easier though, to update it in place. At least the XMP can be, because it's usually padded with a lot of white space at the end that we can simply overwrite. I've also been thinking about appending new TIFF fields at the end of the file if needed, to avoid rewriting the file (just modifying some pointers). I think it should be possible.
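The "overwrite the padding" idea can be sketched as follows. This is only an illustration under assumptions: the size of the existing XMP block is known, the new packet body and the packet trailer are already serialized, and the helper name is hypothetical. If the new content does not fit, the caller would have to fall back to relocating the block (for example by appending it at the end of the file and updating the pointer).

```java
import java.util.Arrays;

final class XmpInPlace {
    /**
     * Lays out body + space padding + trailer in exactly blockSize bytes,
     * or returns null when the new content is too large for the old block.
     */
    static byte[] fitToBlock(byte[] body, byte[] trailer, int blockSize) {
        int padding = blockSize - body.length - trailer.length;
        if (padding < 0) {
            return null; // does not fit; the block must be moved instead
        }
        byte[] block = new byte[blockSize];
        Arrays.fill(block, (byte) ' '); // everything not overwritten below stays as padding
        System.arraycopy(body, 0, block, 0, body.length);
        System.arraycopy(trailer, 0, block, blockSize - trailer.length, trailer.length);
        return block;
    }
}
```

Because the result has the same length as the old block, no TIFF offsets need to change in this case.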
In any case, here's a list of tasks that need to be done (I might open separate issues for these, and keep this issue as a "parent" issue):
I'll start on the IPTC part. :-)
Harald K
That sounds great. I'll see if I can learn more about the XMP Toolkit so I can make myself useful. I'm sure we could serialize and write the XMP, but I am not against using something that works and is free to use.
So do you want to introduce a dependency on Adobe for the XMP stuff? - or would we just use the toolkit in the API that is created?
Not quite sure how you see this fitting into the 12 Monkeys library overall, seems like you could go a few ways with it.
Also, a while back I needed to resize a TIFF and resize the Clipping Path (I did it with a bit of a hack using apache's ExifRewriter). Anyway - I believe if your goal is to make the TIF metadata completely writable - the algorithm should come in handy and save you some time - it definitely works. Somehow I thought the Path would be relative and wouldn't need to change if the dimensions changed - that is not correct. Even though the Path points are expressed relative to width and height, they still need to be recalculated if the image is resized. You probably could have told me that, but I had to learn the hard way. Let me know if you think this would be useful.
Alright - I think I can/should write the XMPWriter. Unless I am grossly underestimating what needs to be done.
I need to build up a Dom representation (I'm thinking in memory because it shouldn't get too large - unless you think otherwise.) So write models for the different xmp datatypes that can independently build their own Dom representations. Then merge them and transform them into a UTF-8 String Dom document that can be written to the 700 block. Start with the common Namespaces, but make it easy to add more properties as we go. Then we/you :) can figure out how to sync them later. So I guess I'm saying that I would like to try to write the XMP XML stuff using standard APIs. Forget Adobe XMP Toolkit. Do you think that is worthwhile?
Good stuff!
I'm taking the kids to the grandparents for the weekend, so I won't be able to do anything more before Monday. Have done some necessary changes to IPTC reading, started work on IPTC write support.
It should be possible to create XML directly from the XMPDirectory/XMPEntry instances, without first building a DOM. But do it the way you think is best. We can always optimize and modify things later, should it not work optimally.
I think you should build a single DOM however, and serialize that to a byte[] (or perhaps directly to an ImageOutputStream), using UTF-8. The com.twelvemonkeys.xml.XMLSerializer class can be used to write the DOM.
One thing I thought about a long time ago was to actually keep the DOM representation used when reading in the XMPDirectory, and mutate that, rather than keeping my own objects (the XMPEntrys). At least worth looking into. That way serializing it would be super easy.
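For reference, building a single DOM and serializing it to a UTF-8 byte[] can be done with nothing but the standard JAXP APIs. The sketch below uses javax.xml.transform.Transformer rather than com.twelvemonkeys.xml.XMLSerializer, and the class name, method names and the single photoshop:Source property are only illustrative. The namespace URIs are the published ones for x:xmpmeta, RDF and the Photoshop schema.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class XmpDomSketch {
    static final String NS_X = "adobe:ns:meta/";
    static final String NS_RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String NS_PHOTOSHOP = "http://ns.adobe.com/photoshop/1.0/";

    /** Builds x:xmpmeta > rdf:RDF > rdf:Description with one simple property. */
    static Document build(String source) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element meta = doc.createElementNS(NS_X, "x:xmpmeta");
            Element rdf = doc.createElementNS(NS_RDF, "rdf:RDF");
            Element description = doc.createElementNS(NS_RDF, "rdf:Description");
            description.setAttributeNS(NS_RDF, "rdf:about", "");
            Element property = doc.createElementNS(NS_PHOTOSHOP, "photoshop:Source");
            property.setTextContent(source);
            description.appendChild(property);
            rdf.appendChild(description);
            meta.appendChild(rdf);
            doc.appendChild(meta);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the DOM as UTF-8 without an XML declaration (the xpacket wrapper is added separately). */
    static byte[] serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Writing to a ByteArrayOutputStream first also gives the exact byte count, which is handy when the container needs the size up front.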
Harald K
OK - so yeah, I think I was definitely over - thinking this. I was thinking that we would have to validate all xml and make sure that it adheres to the data structures defined in the XMP spec, but after reading your response - I think this is incorrect (thinking in terms of an API) - we should accept any XML that the client code gives us. It'll simply be a chunk of XML in XMPEntry. The DOM API will handle making sure the XML is well-formed and all, but as far as validating everything and inspecting all the properties to guarantee they adhere to the spec - this is where I was going overboard, I think anyway. :) Of course there will be code to check proper namespaced property names, but I won't go much further with validating the XML itself.
If the client passes bad xml the DOM will throw an exception and the client will need to pass better parameters - not our problem.
If the client passes xml that is well-formed, but the xml itself doesn't adhere to the spec - then we are in the same boat right? The client again needs to pass better parameters - not our problem.
That being said - yes, I think I have a pretty good handle on how you wrote XMPReader.
Question... XMPScanner: Isn't it possible to skip through the TIFF stream missing the image data based on the IFDs and offsets that you encounter? I ask because it looks like XMPScanner reads every byte? Do we have to do that?
Other than that - I'm working it out. I'll keep posting as I make progress so you can intervene if you think I'm off course. Have fun with the kids this weekend. I'm actually taking all the kids up to their grandparents next weekend.
Jason
So I think my approach is wrong.
We should take in the parameters, but as a part of the parsing process - we ONLY build what we know to be a VALID XMP Fragment. Anything less gives us an XMPWriter that might be better named XMPGarbageWriter. So instead of using the XML passed in - we inspect it for supported XMP properties and build a conforming XMP Fragment.
I might be missing the obvious, but it doesn't appear that I can use XMLSerializer as is. I would need access to an OutputStream and ImageOutputStream won't help much. Or so it seems anyway. You can probably tell me what I'm missing.
So this is what I am working on so far... XMPWriter
I think this is better than allowing the client to pass garbage :)
Also thinking about a MetadataXMPSync class that can handle the different directory types from TIFFImageWriter and make sure the properties are in sync as required by the spec. What do you think?
I'm thinking too hard.
I am going to expect either a single XMPEntry with a single rdf:Description element, optionally wrapped in a single x:xmpmeta element, to support shorthand. The entry will be keyed to inform the XMPWriter that it has shorthand attributes.
OR I am going to expect multiple XMPEntries, each with a single XML fragment that can be parsed as a child element of rdf:Description, using the recommended prefixes as stated in the docs.
OR no XMP is passed.
THEN I will parse as such and build an in-memory representation of the incoming XML document. I will then build an outgoing XML document, complete with an rdf:Description element and defaults using the recommended prefixes. Then I'll traverse the incoming XML, if any was found, and merge it with the outgoing rdf:Description element that I have built in memory. I will do some inspection to make sure the shorthand or the separate chunks of XML comply with the spec.
THEN I'll wrap it as per the spec and add padding.
THEN I'll write it out.
I think this will give us a decent base to start from. And as always, and my favorite part, we can improve it as needed from there.
The problem with my approach above is that it doesn't allow the client code to pass the XML in a naturally hierarchical fashion. Instead, they need to pass either one chunk with shorthand attributes or separate chunks with a restricted prefix. It works, but it is very limiting for the client. I think this can be overcome by improving it to include the ability to pass a single XMPEntry with an entire XML document embedded. This way the client can add their own XML with their own namespacing, including any custom or extended XMP XML, but round one won't support this.
This is where I am at (today anyway).
I don't know if this is a bug or not, but I noticed in EXIFEntry you have:
case EXIF.TAG_DIGITAL_ZOOM_RATIO:
return "DigitalZoomRation";
Is it supposed to be returning DigitalZoomRation and not DigitalZoomRatio?
Hi Jason,
Sorry for late reply. I'll try to answer (at least some of) your questions. :-)
XMPScanner isn't of use in a TIFF file, as we know exactly where it starts after parsing the TIFF structure (the XMP tag points to it). It's just there to (possibly) allow inline editing of XMP, without having to know the container format.
[...] XMPEntries (the XMPReader/XMPWriter should hide all XML parsing/DOM, or perhaps leave the DOM in the XMPDirectory/XMPEntry). This might be what you are thinking too, but it's not completely clear to me at the moment. :-)
[...] xpacket is "w", for writable, which I think it always is for TIFF) we should probably not add padding (but just overwrite the content, and [...]
Regards,
Harald K
Wow - I didn't realize that you are already completely parsing and reading the XMP in XMPReader. I should have started there.
I don't see why I would have to change anything. Or why I would change anything in terms of what/how to parse XMPEntry, XMPDirectory, or RDFDescription.
I wasn't paying close enough attention when you said the building blocks are all in place. I don't plan on changing a thing. :) (While I did want to rename RDFDescription to Resource - because that is really what it seems to represent.)
I was able to parse the XMPDirectory returned from XMPReader.read(stream) and build the correct DOM. I believe that is the hardest part, so now I am just going to go over it and add some minor validation, like making sure dates are dates etc. And finish it up over the next day or two.
Harald - your library kicks ass. :) It is brilliantly simple.
Hehe.. ;-)
Okay, sorry, I thought you knew about those classes... But, yes, I think that really is most of what you need. It's basically the writing part that is missing. Also, I think the XMPDirectory is immutable at the moment. We might need a separate class for mutable directories.
Good stuff anyway! Looking forward to see the result!
Harald K
Okay,
Just pushed a few changes. You want to sync as soon as possible. Sorry. ;-)
Important bits: An abstract MetadataWriter class for your XMPWriter to extend. And a companion abstract test case, that you should extend (not doing much at the moment, but still useful).
Also, a crude version of IPTC writing is in there, along with lots of minor changes and fixes.
Harald K
No worries. You did tell me.
I suppose at some point you could have said, "Hey knucklehead, stop trying to re-invent the wheel." :)
But I am thrilled that I spent a little time getting to know the metadata library, and the TIFF plugin.
I wrote a Servlet to replace ImageMagick (thank you for the Listener by the way - it allowed us to remove the libs under tomcat and properly register our ImageIO SPI classes.) itemMaster constantly has the need to edit/add/remove metadata and up until now, we were using ExifRewriter. Now I can remove Apache Commons Imaging from my TIFFWriter and use the metadata library along with your TIFF plugin.
Yes, Good Stuff!
Harald,
I am having a hard time figuring out where/how to record data size. It seems there is no existing facility to do this (with XML considering we are not storing a byte[] or simple type that can be counted like ExifWriter does to computeDataSize). So should I simply have the reader add an entry to the Directory recording size and then just ignore that entry when writing?
It just starts to introduce problems because ExifWriter needs to be informed how big the 700 block will be, and it cannot do that the way it normally calculates size, because XMPEntrys will only have raw values that are not directly inserted and therefore can't be calculated beforehand.
What do you think? Do I just add an entry with size in XMPReader?
The simplest thing that could possibly work (that I can think of just now), would be to serialize the entire XMPDirectory (along with any subdirectories) using the XMPWriter to a byte[] (or a ByteArrayOutputStream), then get the size from that (and throw the serialized result away).
We could later add a cache of the result or something, but as long as the serializing is stable (several invocations of XMPWriter.write(...) will return the same result), this should be safe.
I think we'll need to handle this as a special case in EXIFWriter.computeDataSize(Directory) (or getCount(Entry)).
The EXIFWriter needs to know the exact number of bytes up front, to minimize back and forth seeking in the stream...
Do you think this could work?
PS: I guess we'll get the same problem with the IPTC block...
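The "serialize once just to measure" approach does not even need to keep the bytes around. Below is a small sketch with a counting stream; the Serializer interface is a hypothetical stand-in for whatever ends up doing the writing (an XMP or IPTC writer, say), not an existing TwelveMonkeys type.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class DryRunSize {
    /** Hypothetical stand-in for anything that can write itself to a stream. */
    interface Serializer {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Counts the bytes written to it and discards them. */
    static final class CountingOutputStream extends OutputStream {
        long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }
    }

    /** Runs the serializer against a counting stream and returns the number of bytes produced. */
    static long sizeOf(Serializer serializer) {
        CountingOutputStream counter = new CountingOutputStream();
        try {
            serializer.writeTo(counter);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.count;
    }
}
```

This only gives the right answer if the serializer is deterministic, which is the "stable" requirement mentioned above.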
Harald K
Yes - I was afraid you wouldn't like the extra overhead, but yes that should work.
OK, great.
Just something to keep in mind - this eliminates the padding idea you had with editing existing - with this model, we would never know the original size and would have to simply write it out and add padding as necessary for the next reader. But I don't see why that is a problem - as long as we are writing a new TIFF.
Anyway - just thought I'd mention that part (because you might still be seeing it differently). And as always - I'm not asking you to sign a contract here :), I know it can later change as it matures. And we figure out the best way to do it. But I'm going to go with that, especially seeing that it really doesn't require any code changes to what I have written so far - good deal.
Thanks.
Great! :-)
We probably need to support the scenario of updating XMP in-line as well, some time in the future, but until then I think this is ok (we might keep the original XMP block size as a private field on the root XMPDirectory for that use, but I don't see much need for it right now).
Harald K
Something worth noting (thinking out loud that is...)
My version of XMPWriter utilizes 2 in-memory Maps based on the respective specs to deal with rdf:parseType="Resource" and list items (rdf:li).
This allows us to keep the implementation simple for the client because they won't have to worry about providing such metadata, but it is limiting in the fact that we must ignore items not found in our in-memory Map. This means any new specs or changes to existing specs would require changes to the source code to support such elements. And I've added a few namespaces to deal with pretty much anything Photoshop will throw at us as it stands.
I believe it is fine because this way it leaves the client code simple. But I want to mention that this is overcome by building the aforementioned in-memory Maps based on a slightly more complex model for XMPReader. We would need to add the extra metadata for the Maps, but then if any client code wants to add a property that requires the Map metadata - the client will be forced to provide it. It wouldn't be too difficult, probably instead of...
Map<String, Object> mapval = (Map<String, Object>)entry.getValue(); //rdf:Alt List
We could have something like...
Object[] obj = (Object[])entry.getValue();
Map<String, Object> mapval = (Map<String, Object>)obj[0];
Object metadata = obj[1];
But as long as we don't mind having the Maps in memory (defined in the code itself) then we can leave the XMPReader as is and still have access to all the needed metadata required to properly build the XML in XMPWriter as well as leave it fairly simple for the client to add/set XMP properties without metadata.
Of course this can later be changed if needed.
I keep saying XMPDirectory and I have written it as such, but that is only because my hacked version makes XMPDirectory mutable.
I know it's wrong because XMPDirectory shouldn't be mutable, but you were right. Making metadata writeable with TiffImageWriter was easy enough by doing so.
Cool.
Do you have a clone of the repo so that I could see some code? Would be interesting to have a peek. :-)
Harald K
This really is still a work in progress.
It seems Photoshop, ExifTool, and IM identify can all read the TIFF I am producing (and display my XMP), but XMPReader actually chokes on it after writing. So I am still working on it.
Also I am still determining where I can optimize and remove redundancies. But here is the working version...(Feel free to save me some work! :))
I can't imagine editing the post above would in anyway cause you to lose the files I attached to the email, but if you didn't get all my versions of the XMP classes - let me know and I'll resend them properly.
I realized why XMPReader was choking on my written XMP - I was writing ascii nulls like stream.write('\0'); then XMPReader chokes on it because it isn't actually an ascii NULL. I need a lot of practice with IO :)
Anyway - it appears as though everything works (writes correctly and produces a readable TIFF) when I use stream.write(' '); for the padding.
I am hoping this is correct, but if not - we can easily fix it.
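The difference between the two pad characters is easy to reproduce with a plain XML parser. NUL is not a legal XML character, while whitespace between the root element and the closing xpacket processing instruction is fine. The sketch below wraps a fragment in the standard xpacket header and trailer and checks whether the JDK's DOM parser accepts it; the class and method names are made up for illustration, and this does not go through XMPReader itself.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

final class XmpPadding {
    static final String HEADER = "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>";
    static final String TRAILER = "<?xpacket end=\"w\"?>";

    /** Wraps the serialized XMP in an xpacket, with the given number of pad characters before the trailer. */
    static byte[] wrap(String xmp, int padding, char padChar) {
        StringBuilder packet = new StringBuilder(HEADER).append(xmp);
        for (int i = 0; i < padding; i++) {
            packet.append(padChar);
        }
        return packet.append(TRAILER).toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Returns true if a namespace-aware DOM parser accepts the packet. */
    static boolean parses(byte[] packet) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrows fatal errors instead of printing them
            builder.parse(new ByteArrayInputStream(packet));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

With a minimal x:xmpmeta element as input, the space-padded packet parses and the NUL-padded one is rejected.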
Works great - and is correctly read in Photoshop when using a Transformer and not XMLSerializer. I'm sure XMLSerializer can be improved, but like you said - I already have code that works for now. I will continue to test/review/improve the XMPWriter until I think it is production ready.
Awesome - I haven't taken the time to review the IPTC stuff, but it looks like it is being updated. When you get a chance - can you tell me what extra information I would need to pass in addition to running a ColorConvertOp to write a CMYK TIFF? The source TIFF would be in RGB. The resulting TIFF must have CMYK profile attached and pixels must be CMYK - so when the image is opened in Photoshop - the "Mode" will automatically be detected as CMYK.
No Hurry. Thanks.
Also - it must use LZW Compression. And I am a little worried that 12 Monkeys only supports this when writing with JPEG Compression - is it possible to write LZW Compressed TIFF in CMYK format with 12 Monkeys?
Still no hurry - I just want to arm you with information. I understand how this isn't a very popular thing to do, but I need to produce them none the less.
I was sent an example TIFF of what I need to be able to write.
Really - it seems to be working pretty good for a test run. You'll notice I added a ton of XMP fields to this TIFF. It was written with your metadata library as hacked by me and the new XMPWriter class.
It looks good - No?
Also I keep forgetting to mention that the spec actually says that a reader should ignore or remove any tiff:NativeDigest or exif:NativeDigest elements found. The spec still says that the application should make sure the fields match across metadata, but it is now left up to the implementation how to sync the data itself - a hash is no longer recommended.
"Existing TIFF and Exif digests found in the XMP (tiff:NativeDigest and exif:NativeDigest) should be ignored when reading and removed when updating."
So I guess this means if no client XMP is passed then we should rewrite the Element, but if the XMP is edited, then we should remove it altogether. But in either case - we are no longer required to use it to check if the EXIF is in sync.
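Dropping the digests on update is a small DOM operation. The sketch below is one way to do it with the standard org.w3c.dom API, assuming the XMP has been parsed namespace-aware; it handles both the element form and the shorthand attribute form on rdf:Description. The class and method names are hypothetical. The namespace URIs are the published ones for the tiff: and exif: schemas.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NativeDigestCleaner {
    static final String NS_TIFF = "http://ns.adobe.com/tiff/1.0/";
    static final String NS_EXIF = "http://ns.adobe.com/exif/1.0/";

    /** Convenience parser for the example; namespace awareness is required for the NS lookups below. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Removes tiff:NativeDigest and exif:NativeDigest in element or attribute form; returns how many were removed. */
    static int removeNativeDigests(Document doc) {
        int removed = 0;
        for (String ns : new String[] {NS_TIFF, NS_EXIF}) {
            // Element form. The NodeList is live, so keep taking item 0 until it is empty.
            NodeList elements = doc.getElementsByTagNameNS(ns, "NativeDigest");
            while (elements.getLength() > 0) {
                Node node = elements.item(0);
                node.getParentNode().removeChild(node);
                removed++;
            }
            // Attribute (shorthand) form, on any element.
            NodeList all = doc.getElementsByTagNameNS("*", "*");
            for (int i = 0; i < all.getLength(); i++) {
                Element element = (Element) all.item(i);
                if (element.hasAttributeNS(ns, "NativeDigest")) {
                    element.removeAttributeNS(ns, "NativeDigest");
                    removed++;
                }
            }
        }
        return removed;
    }
}
```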
Hi Jason,
Great stuff! I haven't had much time in front of the computer last week, so not been able to follow up as I have wanted to, but it seems you've been busy and done a whole lot! I'll try to test it later today (I still have the classes in the email, so that's okay).
Wasn't aware that the native digest fields should no longer be used. But that is actually a good thing, as it only complicates things. :-)
Regarding the padding, I believe a space (ASCII 32/0x20) is the correct pad character. ASCII null (0x0) is usually used as a string terminator, so some software might just stop parsing there. But if the spec says (or Photoshop does) otherwise, we could change that.
CMYK TIFFs: I haven't tested this especially, so there might be bugs... I suggest you file a new issue for this, if it doesn't work, and attach the code you used. AFAIK: What needs to be set to be a valid CMYK TIFF (see TIFF spec, pages 69+), is SamplesPerPixel == 4, BitsPerSample == [8,8,8,8] and PhotometricInterpretation == 5. Anything else is optional, like InkSet (which I read as default to CMYK or 1), NumberOfInks and InkNames etc.
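As a quick sanity check, the three required fields can be verified against a parsed IFD. The sketch below assumes the tags have already been read into a plain map of tag number to int values; that map and the class name are illustrative only, not how the TwelveMonkeys metadata classes expose entries. The tag numbers (258, 262, 277) and the value 5 for "Separated" are from the TIFF 6.0 spec.

```java
import java.util.Arrays;
import java.util.Map;

final class CmykTiffCheck {
    static final int BITS_PER_SAMPLE = 258;
    static final int PHOTOMETRIC_INTERPRETATION = 262;
    static final int SAMPLES_PER_PIXEL = 277;
    static final int PHOTOMETRIC_SEPARATED = 5; // "Separated", CMYK unless InkSet says otherwise

    /** Checks the minimal set of fields listed above for a plain 8-bit CMYK image without extra samples. */
    static boolean looksLikeCmyk(Map<Integer, int[]> tags) {
        int[] photometric = tags.get(PHOTOMETRIC_INTERPRETATION);
        int[] samples = tags.get(SAMPLES_PER_PIXEL);
        int[] bits = tags.get(BITS_PER_SAMPLE);
        return photometric != null && photometric.length == 1 && photometric[0] == PHOTOMETRIC_SEPARATED
                && samples != null && samples.length == 1 && samples[0] == 4
                && Arrays.equals(bits, new int[] {8, 8, 8, 8});
    }
}
```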
LZW compression: A single image in a TIFF file can only have one compression (i.e. JPEG and LZW are mutually exclusive). So I don't really understand what you mean here... Writing LZW should work though. If you can't make it work, please open a separate issue, and attach the code you used.
Regards,
Harald K
PS: I just tested writing a CMYK TIFF with LZW compression, and well... The file opens using the TwelveMonkeys TIFFImageReader, however there's a bug. It does not open in Preview, and I guess also not in Photoshop. This is due to the ExtraSamples tag, that is incorrectly added. See #146, feel free to update the issue if you have more details.
If you change the line:
if (numComponents > 3) {
to:
if (numComponents > colorModel.getNumColorComponents()) {
It works in Preview at least.
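Why the changed condition matters can be shown with two color models: a 4-component CMYK model has no extra samples, while RGBA has one, yet both have more than 3 components. The sketch below uses a deliberately naive CMYK ColorSpace subclass so that no ICC profile file is needed; that class and the helper names are assumptions for the demo, and its color math is not accurate.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;

final class ExtraSamplesSketch {
    /** Naive 4-component CMYK stand-in; good enough for counting components, not for real conversion. */
    static final class NaiveCmyk extends ColorSpace {
        NaiveCmyk() {
            super(ColorSpace.TYPE_CMYK, 4);
        }

        @Override
        public float[] toRGB(float[] c) {
            float k = 1 - c[3];
            return new float[] {(1 - c[0]) * k, (1 - c[1]) * k, (1 - c[2]) * k};
        }

        @Override
        public float[] fromRGB(float[] rgb) {
            return new float[] {1 - rgb[0], 1 - rgb[1], 1 - rgb[2], 0};
        }

        @Override
        public float[] toCIEXYZ(float[] c) {
            return ColorSpace.getInstance(ColorSpace.CS_sRGB).toCIEXYZ(toRGB(c));
        }

        @Override
        public float[] fromCIEXYZ(float[] xyz) {
            return fromRGB(ColorSpace.getInstance(ColorSpace.CS_sRGB).fromCIEXYZ(xyz));
        }
    }

    static ColorModel cmykModel() {
        return new ComponentColorModel(new NaiveCmyk(), false, false,
                Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
    }

    static ColorModel rgbaModel() {
        return new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_sRGB), true, false,
                Transparency.TRANSLUCENT, DataBuffer.TYPE_BYTE);
    }

    /** Number of samples beyond the color components (alpha, typically). */
    static int extraSamples(ColorModel colorModel) {
        return colorModel.getNumComponents() - colorModel.getNumColorComponents();
    }
}
```

Here both models report 4 components, so the old `numComponents > 3` test treats them the same, while the difference above is 0 for CMYK and 1 for RGBA.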
Harald K
Harald - I am missing something simple here - because I couldn't even get that far. Starting with a BufferedImage in sRGB what steps do you take?
ColorConvertOp then write it, or did you have to give the Writer more information? Because every time I convert the ColorSpace and then write it - the image passed to the Writer is not interpreted as being in CMYK Colorspace and the Writer always writes an sRGB TIFF.
What am I missing?
I am thinking my error is in the way I instantiate the BufferedImage for the ColorConvertOp. After I changed the numComponents - I am getting a CMYK image in Photoshop, but it is all White, same in Preview. I am pretty sure my problem is that I am using the wrong Constructor for the BufferedImage that I pass into the ColorConvertOp (width,height,type) - pretty sure this is the wrong one to use when you have a CMYK Colorspace. Still trying...
XMPWriter is writing really well for me, but I am interested in how your testing goes - I have a feeling you have a little more practice breaking code than I do. :)
No worries about your response time - I appreciate your help. I can't get answers like you can give me anywhere else. If patience is the only price - it's well worth it.
Thanks.
It's probably the color conversion then, I used a CMYK TIFF as a starting point. But show me the code, and I'll tell you for sure.
Harald K
Sent from my iPhone
Type is wrong I think.
BufferedImage output = new BufferedImage(input.getWidth(), input.getHeight(), BufferedImage.TYPE_4BYTE_ABGR);
ICC_ColorSpace cmykSpace = new ICC_ColorSpace(ICC_Profile.getInstance("/u/WebCoatedFOGRA28.icc"));
ColorConvertOp cco = new ColorConvertOp(input.getColorModel().getColorSpace(), cmykSpace, null);
output = cco.filter(input, output);
I experimented using several different types - but this is my problem isn't it? The type for CMYK is really BufferedImage.TYPE_CUSTOM, but I can't use that with this constructor and like an idiot I just keep trying to use other arbitrary types, somehow hoping the ColorConvertOp will figure it out.
It's working. Perfect.
It is read in Photoshop and renders appropriately as CMYK LZW TIFF.
static BufferedImage createCMYKBufferedImage(double l_width, double l_height, ColorSpace cmykColorSpace) {
    // No alpha, not premultiplied, opaque, 8 bits per sample
    ComponentColorModel l_ccm = new ComponentColorModel(cmykColorSpace, false, false,
            Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
    int[] l_bandoff = {0, 1, 2, 3}; // Index for each color (C is index 0, etc.)
    PixelInterleavedSampleModel l_sm = new PixelInterleavedSampleModel(
            DataBuffer.TYPE_BYTE,
            (int) l_width, (int) l_height,
            4, (int) l_width * 4, l_bandoff);
    WritableRaster l_raster = Raster.createWritableRaster(l_sm,
            new Point(0, 0));
    return new BufferedImage(l_ccm, l_raster, false, null);
}
I passed the above BufferedImage to ColorConvertOp and, along with your earlier change, it wrote the TIFF that I was looking for.
That is killer. (So how long before you fully implement Tiff 6.0? - joking, that might take a while.)
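For anyone following along, the write step with LZW could look roughly like the sketch below. It uses only the standard javax.imageio API and assumes some TIFF writer is registered (the TwelveMonkeys plugin, or the one bundled with newer JDKs). The class and method names are made up for this example, and like ImageIO.write the method returns false when no suitable writer is found.

```java
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

final class LzwTiffWriteSketch {

    /** Writes the image as an LZW compressed TIFF; false if no TIFF writer is installed. */
    static boolean writeLzwTiff(RenderedImage image, File target) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("TIFF");
        if (!writers.hasNext()) {
            return false; // no TIFF plugin on the class path
        }

        ImageWriter writer = writers.next();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(target)) {
            if (out == null) {
                throw new IOException("Cannot create output stream for " + target);
            }
            writer.setOutput(out);

            // Compression has to be switched to explicit mode before a type can be set
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionType("LZW");

            // No stream metadata, no image metadata, no thumbnails
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
        return true;
    }
}
```

The image passed in would be the CMYK BufferedImage after the ColorConvertOp step; the sketch itself does not care about the color space.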
Re ColorConvertOp, yes, the problem is the type. If you do a color conversion to a type that is RGB, it will still be RGB afterwards (you can't change the type or ColorModel/ColorSpace of a BufferedImage after creation). I think you can just pass null as the second argument to the filter method, and an appropriate BufferedImage will be created for you (based on the output color space). :-)
But your method above works too! And has the benefit of you having full control over data type, number of components etc.
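A minimal sketch of the null destination variant, in case it is useful. The class and method names are made up, and the usage below uses the built-in gray color space as a stand-in target only because the JDK ships no CMYK profile; with a real CMYK ICC_ColorSpace the call is the same.

```java
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;

final class NullDestinationSketch {

    /** Converts input to the target color space, letting ColorConvertOp create the destination. */
    static BufferedImage convert(BufferedImage input, ColorSpace target) {
        ColorSpace source = input.getColorModel().getColorSpace();
        ColorConvertOp op = new ColorConvertOp(source, target, null);
        // null destination: the op allocates an image compatible with the target color space
        return op.filter(input, null);
    }

    public static void main(String[] args) {
        BufferedImage rgb = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage gray = convert(rgb, ColorSpace.getInstance(ColorSpace.CS_GRAY));
        System.out.println("Components after conversion: " + gray.getColorModel().getNumComponents());
    }
}
```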
What features of TIFF 6.0 are you missing? ;-)
Completely implementing TIFF 6.0 isn't possible I guess, as the spec is open ended. But I think I have most of the baseline features covered, and also the most common extensions.
(sorry, accidentally hit the "Comment" button, before being done...)
Regards,
Harald K
Realizing that I was using the wrong type and constructor was bad enough.
But physically passing a null destination image and watching ColorConvertOp figure it out - is a bit hard to take. I guess you live and you learn. I won't soon forget this lesson.
And mentioning the 6.0 TIFF spec was me trying to say that it is amazing how much of it 12Monkeys has already implemented (and also knowing that a full implementation is next to, if not - impossible). At the moment - there is absolutely nothing I need to do with a TIFF that can't be done with the metadata library and your TIFF plugin. Good stuff.
Haha..
I guess we all have had our moments, I can remember several times I've just completed and checked in this great piece of code that I was really satisfied with, and a colleague comes over and says "Hey, this is great stuff, but why didn't you just use API X or the foo method..?" It's tough, but I guess it's good for learning... ;-)
Re. TIFF: Thanks then! :-) I'll continue adding more stuff when needed. The most important parts right now are CCITT FAX en-/decoding and the TIFF metadata (to which you have contributed a great deal!).
Harald K
CCITT FAX looks like it's right up your alley. You must be happy about Java 8 Stream API and Lambdas. But I guess it also presents problems for 12 Monkeys and making Java 8 a requirement.
I have written 2 functions that I have been using with XMPWriter to do 2 different things.
1. Overrides XMP elements (Entry) from the first Directory with XMP elements found in the second Directory.
static Directory override(Directory readDir, Directory overDir)
2. Syncs (and adds defaults) based on the metadata found in the TIFF tags. You always get back a Directory with XMP metadata that matches both the Exif and the IPTC data found in the topLevelTiffTags.
static Directory digestSyncs(List<Entry> topLevelTiffTags, Directory xmp)
Maybe an XMPMetadata utility class or something? It's not finished, but it is working.
You tell me if you want to take a look at it - you might find it useful and worth including.
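To make the intended semantics of the first function concrete, here is a rough sketch. It is not the actual code: the real functions work on the TwelveMonkeys Directory and Entry types, while this stand-in models each directory as a plain Map from entry identifier to value, and the class name and sample keys are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class XmpOverrideSketch {

    /**
     * Returns a new map with every entry of readDir, where entries that also
     * appear in overDir take the value from overDir, and entries that only
     * appear in overDir are appended.
     */
    static Map<String, Object> override(Map<String, Object> readDir, Map<String, Object> overDir) {
        Map<String, Object> merged = new LinkedHashMap<>(readDir); // keep original order
        merged.putAll(overDir); // the second directory wins on conflicts
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> fromFile = new LinkedHashMap<>();
        fromFile.put("dc:title", "Old title");
        fromFile.put("dc:creator", "Unknown");

        Map<String, Object> overrides = new LinkedHashMap<>();
        overrides.put("dc:title", "New title");
        overrides.put("xmp:Rating", 5);

        System.out.println(override(fromFile, overrides));
    }
}
```

Neither input is modified, so the directory read from the file stays available for a before and after comparison.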
Yes, please, send me the code if you allow me to use it in the library. I guess it makes sense to include it as part of the metadata packages.
Java 8 is great! However, as a library should be useful to as many people as possible, I need to be a little conservative about adopting new features. Hey, some people went screaming after I upgraded the requirement from Java 5 to Java 7 (after Java 7 was already in EOL...). ;-)
Harald K
Hi Jason!
Just checked in some last-minute changes before I went on holiday. There's now read-only TIFF metadata from the TIFFImageReader. I will implement the format for the writer as well, and make the metadata read/write. The internal representation will be the EXIFDirectory/IFD.
Hopefully it will integrate nicely with your changes, and be useful. :-)
Regards,
Harald K
Just now checked in writable TIFFImageMetadata along with some support in the TIFFImageWriter on #139. Not really there yet, but we're getting there. :-)
Again, thanks for your contributions (and patience...)!
Harald K
So if you open a TIFF image in Photoshop and then open the File Info dialog, you get a panel displaying the File Info associated with the TIFF image. Supposedly, all of this information is derived from the IPTC-NAA Photoshop IRB. If you click Raw Data you see the XMP XML document. I am assuming this document is what makes up the IPTC-NAA Photoshop IRB.
Am I correct?
The goal for this request is to be able to read in a TIF source file and then supply some added metadata to be embedded within the IPTC-NAA block and then write a new TIF image with the added metadata.
Again, I would attach a TIF to open in Photoshop, but it still looks like GitHub only supports PNG, GIF, or JPG - crazy huh?