stark-t / PAI

Pollination_Artificial_Intelligence


Image orientation confusion #24

Closed valentinitnelav closed 2 years ago

valentinitnelav commented 2 years ago

I thought I solved this issue, but to my surprise, I still got trapped in the confusion of image orientation. There is also an open issue for YOLO here: Exif Image Orientation. Currently, the function exif_size was moved to utils/dataloaders.py.

I am worried that part of the IoU errors could be due to image width and height being swapped. YOLO attempts to swap them, as indicated above, when the orientation is 90 or 270 degrees. I also tried to do the same when I built the training dataset with bounding boxes. Today, however, when I ran some predictions, some images with 90 or 270 degree orientation were fine, but one had a flipped box, and it is unclear why, despite my attempts to take care of the orientation.

So far, what I know is that the annotation tool that we use, VIA VGG, can read the orientation of the image and displays it according to the orientation information from the metadata - see example here. Also, it exports the image statistics as it displays the image on the screen. That means that the exported image statistics should match how YOLO processes them by flipping them. Edit: VIA VGG "relies on the browser to load the image, according to HTML specification" - see example & discussion here. This will not always correspond to how PIL reads image width and height - see below in this thread.

I think during our evaluation process, we should also extract the image statistics (width, height, and orientation info) and compare them with the metadata that VIA VGG gave when we did the annotation. I already have that, but I didn't check whether it matches how YOLO reads it. I will try to look into this issue.

valentinitnelav commented 2 years ago

EDIT

This is not a bug, I was not aware of the global variables - see https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484

This is obsolete now:

@stark-t , I try to debug and understand the behavior of exif_size and I run into a problem and I need your help to double-check this.

The function exif_size is defined like this in utils/dataloaders.py:

def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    try:
        rotation = dict(img._getexif().items())[orientation]
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    except Exception:
        pass

    return s

It is used in verify_image_label() function in the same script like this:

im = Image.open(im_file)
im.verify()  # PIL verify
shape = exif_size(im)  # image size

But when I test it line by line with an example, the part rotation = dict(img._getexif().items())[orientation] fails with the error NameError: name 'orientation' is not defined. Since that part is wrapped in a try block, the function never actually uses the orientation: the lookup always raises an error and the except clause silently skips it. Or am I missing something obvious? This seems to me like a bug on YOLO's side.

Here is what I mean:

First, download this image to your computer. Then run exif_size and, separately, get the image size: both give the same result even though the image is indeed rotated - the EXIF metadata contains "Rotate 270 CW" as posted here.

import urllib.request
import PIL.Image  # a bare "import PIL" does not load the Image submodule

urllib.request.urlretrieve("https://observation.org/photos/2075125.jpg", "Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg")

img_path = 'Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg'
img = PIL.Image.open(img_path) # you have to open an image like this
exif_size(img) # first load the function from above; this gives (448, 336) as a (width, height) tuple
img.size # this also gives (448, 336), so no swap has happened despite the image having the orientation "Rotate 270 CW"

Further, the dict(img._getexif().items())[orientation] part of exif_size(img) gives an error if run directly like this:

dict(img._getexif().items())[orientation]
# NameError: name 'orientation' is not defined

One has to get the EXIF metadata as a dictionary maybe like suggested here:

import PIL.ExifTags
exif = {
    PIL.ExifTags.TAGS[k]: v
    for k, v in img._getexif().items()
    if k in PIL.ExifTags.TAGS
}
exif['Orientation'] # only now we can index by the key 'Orientation'
# 8

This looks like a bug in exif_size, because it doesn't access the EXIF metadata properly. But before reporting this on the YOLO repo, could you double check it as well?

valentinitnelav commented 2 years ago

I just checked also how PIL returns the size of an image.

The .size attribute documentation says that "Image size, in pixels. The size is given as a 2-tuple (width, height)."

For the case above it returns (448, 336), which seems already flipped based on EXIF metadata, so that adds further to my confusion. The lesson here is that each program might do its own flipping when it reports image statistics, and I need to take extra care... this is getting crazy.
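To make the difference between the stored size and the displayed size concrete, here is a minimal sketch. It assumes a reasonably recent Pillow (for Image.getexif() and ImageOps.exif_transpose()) and builds a synthetic in-memory JPEG instead of using the real photo; the helper names make_jpeg_with_orientation and describe_sizes are made up for illustration. As far as I understand, .size reports the pixel grid as stored in the file and does not apply the EXIF orientation, whereas exif_transpose() returns the image as a viewer would display it.

```python
import io

from PIL import Image, ImageOps

ORIENTATION_TAG = 0x0112  # EXIF Orientation tag, decimal 274


def make_jpeg_with_orientation(size, orientation):
    """Build an in-memory JPEG with the EXIF Orientation tag set (test fixture)."""
    exif = Image.Exif()
    exif[ORIENTATION_TAG] = orientation
    buffer = io.BytesIO()
    Image.new("RGB", size, "white").save(buffer, format="JPEG", exif=exif)
    buffer.seek(0)
    return buffer


def describe_sizes(fp):
    """Return (stored_size, orientation, display_size) for an image file."""
    with Image.open(fp) as img:
        stored_size = img.size  # (width, height) as stored; EXIF is not applied
        orientation = img.getexif().get(ORIENTATION_TAG)
        display_size = ImageOps.exif_transpose(img).size  # EXIF applied
    return stored_size, orientation, display_size


stored, orientation, display = describe_sizes(
    make_jpeg_with_orientation((448, 336), 8)
)
print(stored, orientation, display)
```

For the synthetic 448 x 336 image with orientation 8, the stored size is (448, 336) and the display size is (336, 448), which is the same pattern as PIL versus VIA in this thread.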

valentinitnelav commented 2 years ago

And an extra confusion - in this comment the author of YOLOv5, Glenn Jocher, says that "train.py and test.py attempt to extract an exif-corrected size during label caching." Yet we saw that the proposed exif_size function actually doesn't do any flipping of image width and height.

Then he adds that "detect.py dataloader does not consider exif." This means that when exif_size was implemented, it was not targeting detect.py, but only train.py and test.py? And why not also for detect.py, so that at least there is consistency?

Anyways, right now I will check all the data for P1 to see how much of it was affected by this and hope to get a more clear understanding of how to proceed.

valentinitnelav commented 2 years ago

So, I checked each image file in the P1 dataset and found only 23 files that are potentially affected by the swap of image statistics (width and height) done automatically by the VIA annotator. This is a drop in the ocean, which is reassuring for the P1 dataset. I need to check the field images dataset as well.

In these 23 cases, I read the image statistics using PIL and compared the values with those that the VIA annotation tool exported when I prepared the YOLO annotation files. If YOLO indeed has a bug and doesn't flip the width and height when the images have orientation code 6 (Rotate 90 CW) or 8 (Rotate 270 CW), then these 23 images have wrong bounding box coordinates, because VIA flipped the image statistics and YOLO did not. EDIT: There is no bug on YOLO's side as initially thought - see https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484

Here are the 23 cases:

                                                                     FileName ImageWidth_via ImageHeight_via ImageWidth_PIL ImageHeight_PIL Orientation_exiftool Orientation_PIL_id
 1:          Araneae_Linyphiidae_Tenuiphantes_flavipes_1960849752_2109077.jpg            459             700            700             459        Rotate 270 CW                  8
 2:                        Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg            336             448            448             336        Rotate 270 CW                  8
 3:                           Diptera_Anthomyiidae_Hylemya_vagans_3259993.jpg            454             605            605             454        Rotate 270 CW                  8
 4:                          Diptera_Anthomyiidae_Hylemya_variata_3549164.jpg            454             606            606             454        Rotate 270 CW                  8
 5:                        Diptera_Calliphoridae_Lucilia_silvarum_4219901.jpg            450             600            600             450        Rotate 270 CW                  8
 6:                      Diptera_Calliphoridae_Pollenia_amentaria_4323790.jpg            450             600            600             450        Rotate 270 CW                  8
 7:                  Diptera_Calliphoridae_Protocalliphora_azurea_2347646.jpg            336             448            448             336        Rotate 270 CW                  8
 8:                  Diptera_Calliphoridae_Protocalliphora_azurea_2989785.jpg            451             601            601             451        Rotate 270 CW                  8
 9:                              Diptera_Empididae_Empis_acinerea_4297830.jpg            452             602            602             452        Rotate 270 CW                  8
10:                              Diptera_Empididae_Empis_acinerea_4297835.jpg            453             605            605             453        Rotate 270 CW                  8
11:                              Diptera_Empididae_Empis_acinerea_4297837.jpg            452             602            602             452        Rotate 270 CW                  8
12:                              Diptera_Fanniidae_Fannia_fuscula_3549203.jpg            452             603            603             452        Rotate 270 CW                  8
13:                              Diptera_Fanniidae_Fannia_fuscula_3549204.jpg            451             601            601             451        Rotate 270 CW                  8
14:                              Diptera_Fanniidae_Fannia_fuscula_3549205.jpg            451             601            601             451        Rotate 270 CW                  8
15:                              Diptera_Fanniidae_Fannia_fuscula_3549206.jpg            452             602            602             452        Rotate 270 CW                  8
16:                             Diptera_Muscidae_Muscina_prolapsa_2987979.jpg            444             592            592             444        Rotate 270 CW                  8
17:                     Diptera_Sarcophagidae_Miltogramma_germari_2413985.jpg            336             448            448             336        Rotate 270 CW                  8
18:                Diptera_Scathophagidae_Scathophaga_stercoraria_3113147.jpg            530             706            706             530        Rotate 270 CW                  8
19:                           Diptera_Tabanidae_Tabanus_sudeticus_2344115.jpg            336             448            448             336        Rotate 270 CW                  8
20:                     Diptera_Tachinidae_Epicampocera_succincta_2580387.jpg            336             448            448             336         Rotate 90 CW                  6
21: Hemiptera_Rhyparochromidae_Scolopostethus_puberulus_1960406292_976888.jpg            480             640            640             480         Rotate 90 CW                  6
22:           Hymenoptera_Formicidae_Myrmica_ruginodis_1958528150_2689976.jpg            450             599            599             450        Rotate 270 CW                  8
23:     Lepidoptera_Yponomeutidae_Yponomeuta_plumbella_1963143821_1880879.jpg            541             722            722             541        Rotate 270 CW                  8
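The check behind this table can be sketched in plain Python. This is only an illustration: the names display_size and classify are hypothetical, and the set of orientation codes that swap width and height (5 to 8) comes from the EXIF specification, not from the YOLOv5 code, which only tests for 6 and 8.

```python
# EXIF Orientation values whose displayed form swaps width and height.
# 6 and 8 are the ones in the table; 5 and 7 (mirrored variants) also swap.
SWAPPING_ORIENTATIONS = {5, 6, 7, 8}


def display_size(stored_size, orientation):
    """Size after applying the EXIF orientation to a stored (width, height)."""
    width, height = stored_size
    if orientation in SWAPPING_ORIENTATIONS:
        return (height, width)
    return (width, height)


def classify(via_size, pil_size, orientation):
    """Label how the size reported by VIA relates to the size stored in the file."""
    if via_size == pil_size:
        return "same"
    if via_size == display_size(pil_size, orientation):
        return "swapped_by_exif"
    return "unexplained"


# Three rows copied from the table: (via_size, pil_size, orientation id).
rows = {
    "Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg": ((336, 448), (448, 336), 8),
    "Diptera_Tachinidae_Epicampocera_succincta_2580387.jpg": ((336, 448), (448, 336), 6),
    "Diptera_Scathophagidae_Scathophaga_stercoraria_3113147.jpg": ((530, 706), (706, 530), 8),
}
labels = {name: classify(*values) for name, values in rows.items()}
print(labels)
```

All three sample rows come out as "swapped_by_exif", meaning the VIA size is exactly the PIL size with the EXIF orientation applied. An "unexplained" label would point to a different problem, such as a resized file.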

valentinitnelav commented 2 years ago

EDIT

This is obsolete, there is no bug! See https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484

If the intention was indeed to flip image width with height in YOLOv5, then I think a possible fix for exif_size in utils/dataloaders.py could be something like this:

def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    try:
        # rotation = dict(img._getexif().items())[orientation] # this doesn't seem to return orientation
        exif_dict = img._getexif() # get a dictionary of exif attributes from the image
        rotation = exif_dict[274] # tag id 274 stands for Orientation
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    except Exception:
        pass

    return s

Or, using PIL.ExifTags based on this SO answer:


import PIL.ExifTags

def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    try:
        exif = { PIL.ExifTags.TAGS[k]: v
                 for k, v in img._getexif().items()
                 if k in PIL.ExifTags.TAGS }
        rotation = exif['Orientation']
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    except Exception:
        pass

    return s

valentinitnelav commented 2 years ago

EDIT: There is no bug, see https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484 YOLOv5 & detectron2 just have different ways to read the EXIF orientation.

FYI: YOLOv7 also didn't do anything new about the bug in the line rotation = dict(img._getexif().items())[orientation] in exif_size function. They have the function in the script utils/datasets.py

However, the developers of detectron2 might have dedicated more consideration to this. I didn't test it, but it looks like they use the proper index for the exif orientation tag (274 as per https://www.exiv2.org/tags.html). They deal with this in the script detectron2/detectron2/data/detection_utils.py

valentinitnelav commented 2 years ago

Hi @stark-t , here is what I found out.

It looks like YOLOv5 uses both PIL and cv2 (OpenCV). What is confusing is that these two libraries can differ when they report image width and height. However, the .shape of the array returned by cv2.imread() (ordered as height, width, channels) aligns well with what VGG VIA reports. See an example here

So, if YOLOv5 doesn't transpose the images with exif_transpose because exif_size doesn't catch the orientation (both functions are in utils/dataloaders.py), AND it then loads the images for training using cv2.imread(), then I think we are lucky and do not have to adjust the coordinates of the boxes because, in that scenario, they align with VGG VIA coordinates. EDIT: YOLOv5 reads the EXIF orientation and will flip width with height accordingly. See https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484

I am just not 100% sure about what is actually happening and I got a bit tired from it :D

If you have some time, could you confirm that YOLOv5 indeed loads the images for training using cv2.imread() and that no transposing operation takes place?

valentinitnelav commented 2 years ago

I also tried to remove the orientation attribute from the EXIF metadata and see if that might help/solve the issue, but sadly, this adds further mismatches & confusion. I detailed my trials in this notebook

valentinitnelav commented 2 years ago

I revisited the code in YOLOv5 and realized that I had missed a really important piece of information. I looked again in dataloaders.py. If I understand it correctly now, orientation is a global variable, because the following is done before the function in the script (at the top, after the imports and the # Parameters section):

# Get orientation exif tag
for orientation in ExifTags.TAGS.keys():
    if ExifTags.TAGS[orientation] == 'Orientation':
        break

Then YOLOv5 uses the orientation global variable in the exif_size function:

def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    with contextlib.suppress(Exception):
        rotation = dict(img._getexif().items())[orientation]
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    return s

I was just trying to "debug" without being aware of the bigger picture (that is, global variables), sorry :/
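For anyone who falls into the same trap, here is a small self-contained sketch of what happened. It does not use PIL: TAGS is a hypothetical stand-in for PIL.ExifTags.TAGS with a few real tag ids, and both functions are simplified versions of exif_size that take a plain (width, height) tuple and a dict.

```python
# Hypothetical stand-in for PIL.ExifTags.TAGS (real tag ids, tiny subset).
TAGS = {271: "Make", 272: "Model", 274: "Orientation", 282: "XResolution"}

# Same idiom as in dataloaders.py: after `break`, the loop variable stays
# bound at module level and holds the numeric id of the Orientation tag.
for orientation in TAGS.keys():
    if TAGS[orientation] == "Orientation":
        break


def exif_size(size, exif):
    """Simplified exif_size: swap width and height for orientation 6 or 8."""
    try:
        rotation = exif[orientation]  # reads the module-level name
        if rotation in [6, 8]:
            size = (size[1], size[0])
    except Exception:
        pass
    return size


def exif_size_copied_alone(size, exif):
    """What I effectively ran: the function without the loop above it."""
    try:
        rotation = exif[name_that_was_never_defined]  # raises NameError
        if rotation in [6, 8]:
            size = (size[1], size[0])
    except Exception:  # NameError is an Exception, so it is silently swallowed
        pass
    return size


print(orientation)
print(exif_size((448, 336), {274: 8}))
print(exif_size_copied_alone((448, 336), {274: 8}))
```

The first function swaps the size, the second silently returns it unchanged. The broad except Exception is what hid the NameError from me. One more thing to keep in mind: if the tag name were missing from the dictionary, the loop would finish without break and leave orientation bound to the last key.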

However, there still remains a source of confusion when we create the annotation files with the VGG VIA annotator. When I ran some tests, it seemed that VGG VIA reports image width and height in accordance with what cv2 reports, but not with how PIL reports them. I do not know for sure if this is a problem or not.

The pipeline that I had in place was to get the image width and height from VGG VIA (with its COCO format export), and together with the bounding boxes created in the same "coordinate system" offered by VGG VIA, I computed the relative coordinates needed for YOLO. I guess that if YOLO does any kind of image rotation or flipping of the width and height, it also rotates/flips the bounding box information associated with that image, and then everything should be ok.
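The conversion step in that pipeline can be sketched as follows. This is a minimal illustration, not the project's actual script: coco_to_yolo is a hypothetical name, the box values are made up, and it assumes the COCO convention of [x_min, y_min, width, height] in pixels and the YOLO convention of normalized (x_center, y_center, width, height).

```python
def coco_to_yolo(bbox, image_size):
    """Convert a COCO-style pixel box to YOLO's normalized center format.

    bbox is [x_min, y_min, width, height]; image_size is (width, height).
    """
    x_min, y_min, box_w, box_h = bbox
    img_w, img_h = image_size
    return (
        (x_min + box_w / 2) / img_w,  # x_center, relative to image width
        (y_min + box_h / 2) / img_h,  # y_center, relative to image height
        box_w / img_w,
        box_h / img_h,
    )


# Made-up box on an image that the annotator displays as 336 x 448 (portrait).
box = [100, 200, 50, 80]
as_displayed = coco_to_yolo(box, (336, 448))
with_swapped_size = coco_to_yolo(box, (448, 336))
print(as_displayed)
print(with_swapped_size)
```

The two results differ in every component, which shows why the width and height used for normalization must come from the same "coordinate system" as the box itself. As long as both come from VIA, the relative coordinates are consistent with the image as displayed.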

With that hope in mind, I will close this issue for now. Fingers crossed that it works well for the field images as well.