drewnoakes / metadata-extractor

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Apache License 2.0
2.57k stars 481 forks source link

Cast exception in processing a tag #116

Closed bsmith42 closed 8 years ago

bsmith42 commented 9 years ago

A long[] is used to hold the unsigned int then seObjectArray() is called which in all the cases it handles the array must be an int[].

java.lang.ClassCastException: long[] cannot be cast to int[]
    at com.drew.metadata.exif.makernotes.CanonMakernoteDirectory.setObjectArray(CanonMakernoteDirectory.java:692)
    at com.drew.metadata.tiff.DirectoryTiffHandler.setInt32uArray(DirectoryTiffHandler.java:190)
    at com.drew.imaging.tiff.TiffReader.a(TiffReader.java:348)
    at com.drew.imaging.tiff.TiffReader.processIfd(TiffReader.java:205)
drewnoakes commented 9 years ago

Thanks for the report. Do you have an image you can share that reproduces this?

bsmith42 commented 9 years ago

I don't unfortunately since it is a auto-reported crash. It looks like it is incorrectly determining the maker note shot info which is a set of either int16u or int16s types. I will try to get some more info on the photo being processed.

However, the code path from int32u to setObjectArray() doesn't support this type of call. This would at least need some checks to validate the array before use.

drewnoakes commented 9 years ago

The code should definitely be more defensive and avoid the exception. Would be great to have an image to repro this.

niklasha commented 8 years ago

I can supply an image causing similar behaviour on at least version 2.8.1 of metadata-extractor.

huge base64 encoded string removed

niklasha commented 8 years ago

Hmm that did not come out very pretty, but I guess you can make use of it anyhow :-)

drewnoakes commented 8 years ago

Thanks for the image. Here is the decoded form.

niklasha commented 8 years ago

Perhaps I should ask, are you looking into this issue? I can dive into it myself but would not want to waste doubled efforts. It is however important enough for me to work on if noone else does.

drewnoakes commented 8 years ago

By all means please do take a look. I'm swamped at the moment. A PR would be great, out even just a patch on this issue. I suspect the problem is a gap in the coercion of types in Directory, but I may well be wrong.

On Tue, 8 Mar 2016 10:40 Niklas Hallqvist, notifications@github.com wrote:

Perhaps I should ask, are you looking into this issue? I can dive into it myself but would not want to waste doubled efforts. It is however important enough for me to work on if noone else does.

— Reply to this email directly or view it on GitHub https://github.com/drewnoakes/metadata-extractor/issues/116#issuecomment-193716613 .

bsmith42 commented 8 years ago

I did a change to this but wanted to investigate more if it was the right solution. Basically, some Canon cameras have a different type so I added a second type to the processing.

--- a/Source/com/drew/metadata/exif/makernotes/CanonMakernoteDirectory.java
+++ b/Source/com/drew/metadata/exif/makernotes/CanonMakernoteDirectory.java
@@ -677,9 +677,15 @@ public class CanonMakernoteDirectory extends Directory
         // Otherwise just add as usual.
         switch (tagType) {
             case TAG_CAMERA_SETTINGS_ARRAY: {
-                int[] ints = (int[])array;
-                for (int i = 0; i < ints.length; i++)
-                    setInt(CameraSettings.OFFSET + i, ints[i]);
+                if (array instanceof short[]) {
+                    short[] shorts = (short[])array;
+                    for (int i = 0; i < shorts.length; i++)
+                        setInt(CameraSettings.OFFSET + i, (int)shorts[i]);
+                } else {
+                    int[] ints = (int[])array;
+                    for (int i = 0; i < ints.length; i++)
+                        setInt(CameraSettings.OFFSET + i, ints[i]);
+                }
                 break;
             }
             case TAG_FOCAL_LENGTH_ARRAY: {
@@ -689,9 +695,15 @@ public class CanonMakernoteDirectory extends Directory
                 break;
             }
             case TAG_SHOT_INFO_ARRAY: {
-                int[] ints = (int[])array;
-                for (int i = 0; i < ints.length; i++)
-                    setInt(ShotInfo.OFFSET + i, ints[i]);
+                if (array instanceof long[]) {
+                    long[] longs = (long[])array;
+                    for (int i = 0; i < longs.length; i++)
+                        setLong(ShotInfo.OFFSET + i, longs[i]);
+                } else {
+                    int[] ints = (int[])array;
+                    for (int i = 0; i < ints.length; i++)
+                        setInt(ShotInfo.OFFSET + i, ints[i]);
+                }
                 break;
             }
bsmith42 commented 8 years ago

If you think it is the right way to do it, I can submit it.

niklasha commented 8 years ago

In my case I think it's a byte array though. But I will look into it today. BTW, your patch got badly indented when pasted here.

kwhopper commented 8 years ago

Testing with other imaging programs (Irfanview, ExifTool, etc) seems to indicate this tag is always considered int16u (ushort) and these programs don't try to process the notes. It likely means these tags or the makernote in general is written incorrectly. Maybe a better way is to patch it by simple type checking and not try to process it any further. Something like:

case TAG_FOCAL_LENGTH_ARRAY: {
    if (array instanceof short[]) {
                short[] shorts = (short[])array;
                for (int i = 0; i < shorts.length; i++)
                    setInt(FocalLength.OFFSET + i, (int)shorts[i]);
    }
    break;

... applied to any of the tags that fail to cast. "short" is probably being automagically cast to "int" anyway, so I wonder if using "int" is appropriate in this makernote directory.

sqlboy commented 8 years ago

here is another image with the issue. https://farm8.staticflickr.com/7084/13819169694_a14574242d_o.jpg

drewnoakes commented 8 years ago

@kwhopper was right. The process of dividing the array up into 'fake' tags only makes sense if we receive int16[]. Otherwise we should just call the base implementation. As this applies to many case labels, I put the guard at the beginning of the function.

tecywiz121 commented 8 years ago

Assuming there isn't one yet, what are the chances of getting a release with this fix in?