dragon66 / icafe

Java library for reading, writing, converting and manipulating images and metadata
Eclipse Public License 1.0

Can't delete anything from IPTC metadata and preserve Clipping Path. #40

Closed sinedsem closed 7 years ago

sinedsem commented 7 years ago

Some people embed outlines, known as "clipping paths", in their JPEG images. There are a few articles about this, e.g. http://designus.sk/the-secret-path-of-jpg-images/

Here is an example image with a clipping path: https://github.com/sinedsem/test/blob/master/clipping_path.jpg

In addition to the clipping path, this image has one IPTC Keywords tag with the value "keyword". Let's say I want to remove this tag from the image but keep the clipping path info. Unfortunately, I can't do this.

When I call insertIPTC with update = true, the clipping path is kept, but I can't delete tags, only add them. When I call it with update = false, I see no way to preserve the clipping path.

That's how I use it:

File source = new File("clipping_path.jpg");
File destination = new File("result.jpg");
FileInputStream is = new FileInputStream(source);

Map<MetadataType, Metadata> metadataMap = Metadata.readMetadata(is);

ArrayList<IPTCDataSet> iptcDataSets = new ArrayList<>();

// the following loop is needed to preserve all metadata but Keywords. Unfortunately, I can't retrieve clipping path from here.
if (metadataMap != null && metadataMap.get(MetadataType.IPTC) != null) {
    IPTC iptc = (IPTC) metadataMap.get(MetadataType.IPTC);
    for (Map.Entry<String, List<IPTCDataSet>> entry : iptc.getDataSets().entrySet()) {
        if (!entry.getKey().equals(IPTCApplicationTag.KEY_WORDS.getName())) {
            System.out.println(entry.getKey());
            iptcDataSets.addAll(entry.getValue());
        }
    }
}

is.close(); // close the stream used for reading metadata before reopening the source
is = new FileInputStream(source);
FileOutputStream os = new FileOutputStream(destination);

Metadata.insertIPTC(is, os, iptcDataSets, true); // this can be true or false

is.close();
os.close();

sinedsem commented 7 years ago

Looking at the file in a text editor, I can see that a significant part of the metadata is cut off when calling insertIPTC(.., .., .., false), and that part is definitely not IPTC.

sinedsem commented 7 years ago

Very similar issue on SO for Objective-C, with no provided solution =( http://stackoverflow.com/questions/32160528/read-non-standard-properties-from-an-image-in-objective-c

sinedsem commented 7 years ago

This article https://www.graphicsmill.com/docs/gm/clipping-path.htm says that

Clipping paths are stored in Adobe image resource blocks in TIFF, JPEG, and PSD image formats.

so I don't think that insertIPTC should remove them.
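
For context, here is a minimal sketch of how a sequence of Adobe image resource blocks can be split into individual "8BIM" blocks, which is where both the clipping path and the IPTC data live. It follows the layout described in Adobe's Photoshop file format documentation as I understand it (signature, 2-byte ID, padded Pascal-string name, 4-byte size, padded payload). The class and method names (IrbSketch, Block, parse, block, concat) are hypothetical and are not part of icafe, and the 0x0401 working-path ID is an assumption taken from that documentation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these names come from icafe.
final class IrbSketch {
    static final int WORKING_PATH = 0x0401; // assumed ID of the working path resource
    static final int IPTC_NAA = 0x0404;     // ID of the IPTC-NAA record resource

    /** One parsed resource block: numeric ID, optional name, raw payload. */
    static final class Block {
        final int id;
        final String name;
        final byte[] data;

        Block(int id, String name, byte[] data) {
            this.id = id;
            this.name = name;
            this.data = data;
        }
    }

    /** Splits a raw IRB byte sequence into its 8BIM blocks. */
    static List<Block> parse(byte[] irb) {
        List<Block> blocks = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(irb); // ByteBuffer is big-endian by default
        while (buf.remaining() >= 12) {        // smallest block: 4 + 2 + 2 + 4 bytes
            byte[] signature = new byte[4];
            buf.get(signature);
            if (!"8BIM".equals(new String(signature, StandardCharsets.US_ASCII))) {
                throw new IllegalArgumentException("Expected 8BIM signature");
            }
            int id = buf.getShort() & 0xFFFF;
            int nameLength = buf.get() & 0xFF;           // Pascal string: length byte first
            int namePad = (nameLength % 2 == 0) ? 1 : 0; // length byte + name must be even
            if (nameLength + namePad + 4 > buf.remaining()) {
                throw new IllegalArgumentException("Truncated block header");
            }
            byte[] name = new byte[nameLength];
            buf.get(name);
            buf.position(buf.position() + namePad);
            int size = buf.getInt();
            if (size < 0 || size > buf.remaining()) {
                throw new IllegalArgumentException("Block size exceeds remaining data");
            }
            byte[] data = new byte[size];
            buf.get(data);
            if (size % 2 != 0 && buf.hasRemaining()) {
                buf.get(); // payloads are padded to an even length
            }
            blocks.add(new Block(id, new String(name, StandardCharsets.US_ASCII), data));
        }
        return blocks;
    }

    /** Builds one 8BIM block so the sketch can be exercised without a real image. */
    static byte[] block(int id, String name, byte[] data) {
        byte[] n = name.getBytes(StandardCharsets.US_ASCII);
        int nameField = 1 + n.length + ((n.length % 2 == 0) ? 1 : 0);
        int dataField = data.length + (data.length % 2);
        ByteBuffer buf = ByteBuffer.allocate(4 + 2 + nameField + 4 + dataField);
        buf.put("8BIM".getBytes(StandardCharsets.US_ASCII)).putShort((short) id);
        buf.put((byte) n.length).put(n);
        if (n.length % 2 == 0) {
            buf.put((byte) 0);
        }
        buf.putInt(data.length).put(data); // any trailing pad byte is already zero
        return buf.array();
    }

    static byte[] concat(byte[] first, byte[] second) {
        byte[] joined = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, joined, first.length, second.length);
        return joined;
    }

    public static void main(String[] args) {
        byte[] irb = concat(block(WORKING_PATH, "", new byte[] {1, 2, 3}),
                            block(IPTC_NAA, "", new byte[] {9, 9}));
        for (Block b : parse(irb)) {
            System.out.println(String.format("0x%04X: %d bytes", b.id, b.data.length));
        }
    }
}
```

The point of the sketch is that the path data and the IPTC record are sibling blocks in the same container, so replacing one does not in principle require touching the other.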

sinedsem commented 7 years ago

I tried digging a little deeper, and this is what I found:

File source = new File("clipping_path.jpg");
File destination = new File("result.jpg");
FileInputStream is = new FileInputStream(source);

Map<MetadataType, Metadata> metadataMap = Metadata.readMetadata(is);
System.out.println(metadataMap.get(MetadataType.PHOTOSHOP_IRB)); // here we have PHOTOSHOP_IRB, where, as I hope, clipping path is stored.

ArrayList<IPTCDataSet> iptcDataSets = new ArrayList<>();
// loop from previous listing doesn't matter, Exception doesn't depend on it

is.close(); // close the stream used for reading metadata before reopening the source
is = new FileInputStream(source);
FileOutputStream os = new FileOutputStream(destination);

Metadata.insertIPTC(is, os, iptcDataSets, false); // false, just insert empty IPTC

is.close();
os.close();

metadataMap = Metadata.readMetadata(destination); // IllegalArgumentException !!!
System.out.println(metadataMap.get(MetadataType.PHOTOSHOP_IRB));

Stack Trace:

java.lang.IllegalArgumentException: Copy range out of array bounds
at com.icafe4j.util.ArrayUtils.subArray(Unknown Source)
at com.icafe4j.image.meta.adobe.IRB.read(Unknown Source)
at com.icafe4j.image.meta.Metadata.ensureDataRead(Unknown Source)
at com.icafe4j.image.meta.adobe.IRB.get8BIM(Unknown Source)
at com.icafe4j.image.jpeg.JPEGTweaker.readMetadata(Unknown Source)
at com.icafe4j.image.meta.Metadata.readMetadata(Unknown Source)
at com.icafe4j.image.meta.Metadata.readMetadata(Unknown Source)

@dragon66, could you have a look at this issue? I would be very grateful if you could solve it. BTW, can I buy you a beer? You have already fixed 3 issues that I've reported.

dragon66 commented 7 years ago

@sinedsem When you call insertIPTC with update = true, you are telling icafe to keep the existing contents of the IPTC 8BIM block in the Photoshop IRB and add the new IPTC data to it. With update = false, it replaces the whole IPTC 8BIM block with only the IPTC data you insert. That is what you have seen in your test, and that is exactly what icafe is supposed to do.

I am not sure what you mean by "I can't delete tags, only add." What tags are you trying to delete? Did you try to delete IPTC tags and fail?

Also if you want to remove IPTC or other metadata, you can call

Metadata.removeMetadata(fin, fout, MetadataType.IPTC)

which takes a variable number of arguments. More information can be found in TestMetadata.java. I used this call to remove the IPTC data from the original image and it works fine.

sinedsem commented 7 years ago

@dragon66 Yes! Thank you very much for the answer!

What I am trying to do is delete all IPTCApplicationTag.KEY_WORDS entries from my image and then insert other IPTCApplicationTag.KEY_WORDS entries, while preserving all the other information that was in the image.

The way you've described works perfectly: first call removeMetadata(.., .., MetadataType.IPTC) to get rid of all previous IPTC data, then call Metadata.insertIPTC(.., .., .., true) to insert the new IPTC data.

Still, a few points are unclear to me. Did I understand correctly that IPTC is a part of IRB? That would explain your first sentence. It would also explain why I can't delete the IPTCApplicationTag.KEY_WORDS entries that are already in the image just by calling Metadata.insertIPTC(.., .., Collections.emptyList(), true) without previously calling removeMetadata(.., .., MetadataType.IPTC).

I have another question for you. My application allows editing of IPTC and XMP metadata. Is there a way to achieve this without copying the whole image 3 times in a row? Right now I need to:

File stepOneTempFile = File.createTempFile(TEMP_FILE_PREFIX, ".jpg");
File stepTwoTempFile = File.createTempFile(TEMP_FILE_PREFIX, ".jpg");
File stepThreeTempFile = File.createTempFile(TEMP_FILE_PREFIX, ".jpg");

FileInputStream is = new FileInputStream(sourceFile);
FileOutputStream os = new FileOutputStream(stepOneTempFile);
Metadata.removeMetadata(is, os, MetadataType.IPTC);
is.close();
os.close();

// creating iptcDataSets
is = new FileInputStream(stepOneTempFile);
os = new FileOutputStream(stepTwoTempFile);
Metadata.insertIPTC(is, os, iptcDataSets, true);
is.close();
os.close();

// creating xmp
is = new FileInputStream(stepTwoTempFile);
os = new FileOutputStream(stepThreeTempFile);
Metadata.insertXMP(is, os, xmp);
is.close();
os.close();

FileUtils.copyFile(stepThreeTempFile, sourceFile);

stepOneTempFile.deleteOnExit();
stepTwoTempFile.deleteOnExit();
stepThreeTempFile.deleteOnExit();

dragon66 commented 7 years ago

@sinedsem To answer your questions:

  1. Yes, IPTC is one of the 8BIM blocks. All the 8BIM blocks together make up the Photoshop IRB (Image Resource Block) metadata. The IPTC block is actually called IPTC_NAA (you can find its entry in ImageResourceID.java: IPTC_NAA("IPTC-NAA record.", (short)0x0404)). In the test image you provided, the Working Path 8BIM block is right before the IPTC_NAA 8BIM block.

  2. The reason you can't delete that keyword by inserting a new IPTC keyword is that IPTC allows multiple keywords. Because of that, in my implementation, icafe does not overwrite the keyword tag value. Instead, it creates a list of all the keyword values, including the old one. In the IPTCApplicationTag.java enum you will find a method called allowMultiple(), which controls whether or not the corresponding tag allows multiple values.

  3. This is a bit complicated. Manipulating metadata is not as easy as removing metadata, which can be done in a single pass without messing up the whole structure. It is doable, but it would be memory consuming because we would have to load the whole image structure tree into memory. Another challenge is that we have to maintain a certain sequence for the different metadata types relative to the other parts of the image.

For the reasons above, there is currently no easy way for icafe to manipulate more than one metadata type at a time. However, you actually only need two steps instead of three to add both IPTC and XMP. See my later comments.
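
The behavior described in point 2 can be modeled roughly as follows. This is only a sketch of the merge rule as explained in this thread, not icafe's actual code: the class IptcMergeSketch, the merge method, and the MULTI_VALUE_TAGS set (standing in for tags whose allowMultiple() returns true) are all hypothetical, and the assumption that single-valued tags get overwritten is mine.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the merge rule; not taken from icafe's source.
final class IptcMergeSketch {
    // Stand-in for the tags whose allowMultiple() would return true.
    static final Set<String> MULTI_VALUE_TAGS = Collections.singleton("Keywords");

    /**
     * Combines existing tag values with newly inserted ones.
     * With update == true the existing values are kept and multi-value tags
     * accumulate; with update == false only the inserted values survive.
     */
    static Map<String, List<String>> merge(Map<String, List<String>> existing,
                                           Map<String, List<String>> inserted,
                                           boolean update) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        if (update) {
            for (Map.Entry<String, List<String>> e : existing.entrySet()) {
                result.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        for (Map.Entry<String, List<String>> e : inserted.entrySet()) {
            if (MULTI_VALUE_TAGS.contains(e.getKey())) {
                // Multi-value tag: append to whatever is already there.
                result.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
            } else {
                // Single-value tag (assumed behavior): the new value wins.
                result.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> existing = new LinkedHashMap<>();
        existing.put("Keywords", Collections.singletonList("keyword"));
        Map<String, List<String>> inserted = new LinkedHashMap<>();
        inserted.put("Keywords", Collections.singletonList("replacement"));

        System.out.println("update=true:  " + merge(existing, inserted, true));
        System.out.println("update=false: " + merge(existing, inserted, false));
        // Inserting nothing with update=true leaves the old keyword in place.
        System.out.println("empty insert: " + merge(existing, new LinkedHashMap<>(), true));
    }
}
```

Under this model, an update can only ever add keyword values, which matches the observation that inserting an empty list with update = true deletes nothing.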

sinedsem commented 7 years ago

Thanks, I am completely fine with your answers.

dragon66 commented 7 years ago

@sinedsem By the way, thanks for offering a beer! That's exactly what I need sometimes to keep icafe rolling given the workload I have day to day.

sinedsem commented 7 years ago

@dragon66 I insist that you answer the email I sent today.

dragon66 commented 7 years ago

@sinedsem I actually found that you can use two steps instead of three to add both IPTC and XMP. Use the following single call to replace the two steps of removing and then inserting IPTC:

Metadata.insertIRB(fin, fout, createPhotoshopIPTC(), true);

Here is createPhotoshopIPTC():

private static List<_8BIM> createPhotoshopIPTC() {
    IPTC_NAA iptc = new IPTC_NAA();
    iptc.addDataSet(new IPTCDataSet(IPTCApplicationTag.COPYRIGHT_NOTICE, "Copyright 2014-2016, yuwen_66@yahoo.com"));
    iptc.addDataSet(new IPTCDataSet(IPTCApplicationTag.KEY_WORDS, "Welcome 'icafe' user!"));
    iptc.addDataSet(new IPTCDataSet(IPTCApplicationTag.CATEGORY, "ICAFE"));

    return new ArrayList<_8BIM>(Arrays.asList(iptc));
}

dragon66 commented 7 years ago

@sinedsem, I found a bug in insertIPTC() with update=false: it actually removes all the other IRB data, including the Working Path. I have checked in a fix for this issue: "Fix bug of insertIPTC() with update=false not keeping other _8BIMs". Now you should be able to use insertIPTC with update = false to remove all the original IPTC data and insert new IPTC data at the same time. So only two steps are needed if you want to insert both IPTC and XMP.

sinedsem commented 7 years ago

@dragon66 Thanks a lot, I did as you suggested. In addition, I no longer save to a file between the two steps (IPTC and XMP). I write to a byte array instead (using ByteArrayInputStream and ByteArrayOutputStream), which makes everything even faster!
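
The in-memory chaining described above could look something like the sketch below. The InMemoryPipeline class and its Step interface are hypothetical helpers, not icafe API. The demo steps are placeholders that just append a byte; in real use each step would presumably be a lambda wrapping a call such as Metadata.insertIPTC(in, out, iptcDataSets, false) or Metadata.insertXMP(in, out, xmp).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for chaining stream-to-stream rewrites without temp files.
final class InMemoryPipeline {
    /** One rewrite pass: reads the whole image from in and writes the result to out. */
    interface Step {
        void apply(InputStream in, OutputStream out) throws IOException;
    }

    /** Runs the steps in order, handing each step the previous step's output. */
    static byte[] run(byte[] image, Step... steps) {
        byte[] current = image;
        for (Step step : steps) {
            ByteArrayOutputStream out = new ByteArrayOutputStream(current.length + 1024);
            try {
                step.apply(new ByteArrayInputStream(current), out);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            current = out.toByteArray();
        }
        return current;
    }

    /** Copies everything from in to out; used by the placeholder steps below. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) {
        // Placeholder steps standing in for the IPTC and XMP insertion calls.
        Step first = (in, out) -> { copy(in, out); out.write('A'); };
        Step second = (in, out) -> { copy(in, out); out.write('B'); };

        byte[] result = run("image".getBytes(StandardCharsets.US_ASCII), first, second);
        System.out.println(new String(result, StandardCharsets.US_ASCII));
    }
}
```

The trade-off is memory: each step holds a full copy of the image in a byte array, which is fine for typical JPEGs but worth keeping in mind for very large files.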

dragon66 commented 7 years ago

@sinedsem I added support for inserting metadata into a JPEG image in one step. You can check TestMetadata.java for more details.