drewnoakes / metadata-extractor-dotnet

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Other
939 stars 167 forks source link

Changes in the behavior of TryGetInt32 #279

Closed oleksant closed 3 years ago

oleksant commented 3 years ago

Hello,

We have the following piece of code:

if (directory.TryGetInt32(tag.Type, out var intValue))
{
     return intValue;
}

Starting from version 2.0.0 its behavior has been changed. Having the "GPS Latitude Ref" (Tag Type = 1) tag we were getting false while trying to get the value. Now we are getting true and we cannot use our generic approach in this case.

Can you give a piece of advice, if there any better way to understand if this is actually int or not?

drewnoakes commented 3 years ago

Can you provide more context? What value for that tag causes the issue?

oleksant commented 3 years ago

@drewnoakes, thanks much for such a quick response So basically we used to go through all directories one by one with the following process:

private static object GetTagValue(Directory directory, Tag tag)
        { 
            if (directory is GpsDirectory)
            {
                var dataArray = directory.GetRationalArray(tag.Type);
                if (dataArray != null && dataArray.Length == 3)
                {
                    if (tag.Type == GpsDirectory.TagTimeStamp)
                    {
                        return new TimeSpan(dataArray[0].ToInt32(), dataArray[1].ToInt32(), dataArray[2].ToInt32());
                    }
                    return GeoLocation.DegreesMinutesSecondsToDecimal(dataArray[0], dataArray[1], dataArray[2], false);
                }
            }

            if (string.IsNullOrWhiteSpace(tag.Description))
                return null;

            if (directory.TryGetRational(tag.Type, out var rationalValue))
            {
                return rationalValue.IsInteger ? rationalValue.ToInt32() : rationalValue.ToDouble();
            }

            if (directory.TryGetInt32(tag.Type, out var intValue))
            {
                return intValue;
            }

            if (directory.TryGetDateTime(tag.Type, out var dateTime))
            {
                return dateTime;
            }
            var data = directory.GetByteArray(tag.Type);
            if (data != null && Encoding.ASCII.GetString(data).StartsWith("UNICODE"))
            {
                return Encoding.BigEndianUnicode.GetString(data, SizeOfChar, data.Length - SizeOfChar);
            }

            return tag.Description;
        }

While debugging I can see that the value is "N" and it is stored as an array with the first and the only element being "N". As the result, we have 78, so the N is simply converted to num. image

Edited: Previously we were skipping TryGetInt32, getting false and the tag.Description was the output from method

drewnoakes commented 3 years ago

The Try methods will attempt to coerce between internal representations. This coercion logic has apparently changed between library versions, and may change again in future.

It seems to me that you want logic more like this (untested and may not even compile but should give you the idea):

private static object GetTagValue(Directory directory, Tag tag)
{ 
    if (directory is GpsDirectory)
    {
        var dataArray = directory.GetRationalArray(tag.Type);
        if (dataArray != null && dataArray.Length == 3)
        {
            if (tag.Type == GpsDirectory.TagTimeStamp)
            {
                return new TimeSpan(dataArray[0].ToInt32(), dataArray[1].ToInt32(), dataArray[2].ToInt32());
            }
            return GeoLocation.DegreesMinutesSecondsToDecimal(dataArray[0], dataArray[1], dataArray[2], false);
        }
    }

    if (string.IsNullOrWhiteSpace(tag.Description))
        return null;

    object o = directory.GetObject(tag.Type);

    if (o is Rational rationalValue)
    {
        return rationalValue.IsInteger ? rationalValue.ToInt32() : rationalValue.ToDouble();
    }

    if (o is int intValue)
    {
        return intValue;
    }

    if (o is DateTime dateTime)
    {
        return dateTime;
    }

    if (o is byte[] data && Encoding.ASCII.GetString(data).StartsWith("UNICODE"))
    {
        return Encoding.BigEndianUnicode.GetString(data, SizeOfChar, data.Length - SizeOfChar);
    }

    return tag.Description;
}
oleksant commented 3 years ago

Thanks much, it gave me the direction. So it looks like there are plenty of cases when we skip

 if (o is int intValue)
    {
        return intValue;
    }

because the value is UInt, byte, etc.

Also, we almost never enter the following code

 if (o is DateTime dateTime)
    {
        return dateTime;
    }

So, in my case, I managed to use the directory.TryGetDateTime(tag.Type, out var DateTime). Since it catches only FormatException I had to make the code skipping any error from this method and not failing on convertation errors.

drewnoakes commented 3 years ago

Makes sense. The library tries to preserve the underlying type.

I'm curious why you don't just use the descriptor's output for each tag. Can you explain your scenario a bit?

oleksant commented 3 years ago

Sorry for such a long response, missing all the notifications

So for our need, we are trying to get as much useful information. We are not looking for any specific data but collecting everything to be able to search through.

While reading data we also want to normalize some of the data, like DateTime or rational values. Basically, that is the whole scenario.

drewnoakes commented 3 years ago

I suggest you use the library's own tag descriptors. Many of the values you are currently reading out will have no context. For example, the number 3 might actually decode to the full name and model of a camera lens. With your approach, you'll just display 3. Other values will not have units, eg. focal length (50mm) or aperture (f-2.8).

If there's a tag for which the descriptor is inadequate, either special-case just that tag, or submit a PR to this library to get it improved.

oleksant commented 3 years ago

So the tag.Description should contains the most useful information?

drewnoakes commented 3 years ago

Yes, that is the point of it. It applies tag-specific logic to produce a string that's useful for presentation in a UI.

oleksant commented 3 years ago

Okay, thanks much for your support!!