Closed valentinitnelav closed 2 years ago
This is not a bug, I was not aware of the global variables - see https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484
@stark-t, I am trying to debug and understand the behavior of exif_size, and I ran into a problem where I need your help to double-check this.
The function exif_size is defined like this in utils/dataloaders.py:
def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    try:
        rotation = dict(img._getexif().items())[orientation]
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    except Exception:
        pass
    return s
It is used in the verify_image_label() function in the same script like this:
im = Image.open(im_file)
im.verify() # PIL verify
shape = exif_size(im) # image size
But when I test it line by line with an example, the part rotation = dict(img._getexif().items())[orientation] fails with the error NameError: name 'orientation' is not defined. Since that part is wrapped inside a try block, the function never actually gets past it: it always raises an error and the except clause silently skips it. Or am I missing something obvious? This seems to me like a bug on YOLO's side.
Here is what I mean:
First, download this image to your computer, then run exif_size and separately get the image size. You get the same result from both, despite the fact that the image is indeed rotated: the EXIF metadata contains "Rotate 270 CW", as posted here.
import urllib.request
import PIL.Image  # "import PIL" alone does not reliably expose PIL.Image

img_path = 'Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg'
urllib.request.urlretrieve("https://observation.org/photos/2075125.jpg", img_path)
img = PIL.Image.open(img_path)  # you have to open an image like this
exif_size(img)  # first load the function from above; this gives (448, 336), i.e. (width, height) as a tuple
img.size  # this also gives (448, 336), so no swap has happened despite the image having the orientation "Rotate 270 CW"
Further, the dict(img._getexif().items())[orientation] part of exif_size(img) gives an error if run directly like this:
dict(img._getexif().items())[orientation]
# NameError: name 'orientation' is not defined
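To illustrate why this failure stays invisible inside the function, here is a minimal sketch that needs no image. The names masked_lookup, unmasked_lookup and undefined_key are made up for this illustration; only the try/except shape mirrors exif_size. A NameError is an ordinary Exception subclass, so a blanket except Exception hides it in the same way it hides a missing EXIF tag.

```python
def masked_lookup(size, exif):
    """Same control flow as exif_size, but with a key name that is never defined."""
    s = size  # (width, height)
    try:
        rotation = exif[undefined_key]  # raises NameError when the function is called
        if rotation in [6, 8]:
            s = (s[1], s[0])
    except Exception:
        pass  # the NameError is swallowed here, just like a missing tag would be
    return s


def unmasked_lookup(size, exif):
    """Variant that only tolerates a missing tag, so other errors surface."""
    s = size
    try:
        rotation = exif[undefined_key]
    except KeyError:
        return s
    if rotation in [6, 8]:
        s = (s[1], s[0])
    return s


print(masked_lookup((448, 336), {274: 8}))  # the size comes back unchanged

try:
    unmasked_lookup((448, 336), {274: 8})
except NameError as err:
    print("surfaced:", err)
```

Catching only the exceptions that are expected (for example KeyError for a missing tag) is one way to keep such a problem from going unnoticed while testing line by line.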
One has to get the EXIF metadata as a dictionary, maybe as suggested here:
import PIL.ExifTags
exif = {
    PIL.ExifTags.TAGS[k]: v
    for k, v in img._getexif().items()
    if k in PIL.ExifTags.TAGS
}
exif['Orientation']  # only now we can index by the key 'Orientation'
# 8
This looks like a bug in exif_size, because it doesn't access the EXIF metadata properly. But before reporting this on the YOLO repo, could you double-check it as well?
I also just checked how PIL returns the size of an image.
The .size attribute documentation says: "Image size, in pixels. The size is given as a 2-tuple (width, height)."
For the case above it returns (448, 336), which seems already flipped based on EXIF metadata, so that adds further to my confusion.
The lesson here is that each program might do its own flipping when it reports image statistics, and I need to take extra care... this is getting crazy.
And an extra confusion: in this comment the author of YOLOv5, Glenn Jocher, says that "train.py and test.py attempt to extract an exif-corrected size during label caching." And we saw that the proposed exif_size function actually doesn't do any flipping of image width and height.
Then he adds that "detect.py dataloader does not consider exif." Does this mean that when exif_size was implemented, it was not targeting detect.py, but only train.py and test.py? And why not also detect.py, so that at least there is consistency?
Anyways, right now I will check all the data for P1 to see how much of it was affected by this and hope to get a more clear understanding of how to proceed.
So, I checked each image file in the P1 dataset and found only 23 files that are potentially affected by the flip in image statistics (width and height) done automatically by the VIA annotator. This is a drop in the ocean and is reassuring for the P1 dataset. I need to check the field images dataset as well.
In these 23 cases, I read the image statistics using PIL and compared the values with those that the VIA annotation tool exported when I prepared the YOLO annotation files.
If YOLO indeed has a bug and doesn't flip the width and height when they have orientation code 6 (Rotate 90 CW) or 8 (Rotate 270 CW), then these 23 images have wrong bounding box coordinates, because VIA flipped the image statistics and YOLO did not. EDIT: There is no bug on YOLO's side as initially thought - see https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484
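The comparison described above can be sketched as a small check. This is only an illustration: the function name dims_swapped and the tuple layout are assumptions, and the two sample rows are copied from the table in this thread.

```python
def dims_swapped(via_wh, pil_wh):
    """True when VIA and PIL report the same two numbers in opposite order."""
    return via_wh != pil_wh and via_wh == (pil_wh[1], pil_wh[0])


# (file name, VIA (width, height), PIL (width, height), EXIF orientation id)
rows = [
    ("Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg", (336, 448), (448, 336), 8),
    ("Diptera_Tachinidae_Epicampocera_succincta_2580387.jpg", (336, 448), (448, 336), 6),
]

# Keep only files with a 90/270 degree orientation whose dimensions disagree
flagged = [
    name
    for name, via_wh, pil_wh, orientation in rows
    if orientation in (6, 8) and dims_swapped(via_wh, pil_wh)
]
print(flagged)
```

Square images would never be flagged by this check, since swapping width and height leaves them unchanged.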
Here are the 23 cases:
FileName ImageWidth_via ImageHeight_via ImageWidth_PIL ImageHeight_PIL Orientation_exiftool Orientation_PIL_id
1: Araneae_Linyphiidae_Tenuiphantes_flavipes_1960849752_2109077.jpg 459 700 700 459 Rotate 270 CW 8
2: Diptera_Anthomyiidae_Delia_lamelliseta_2075125.jpg 336 448 448 336 Rotate 270 CW 8
3: Diptera_Anthomyiidae_Hylemya_vagans_3259993.jpg 454 605 605 454 Rotate 270 CW 8
4: Diptera_Anthomyiidae_Hylemya_variata_3549164.jpg 454 606 606 454 Rotate 270 CW 8
5: Diptera_Calliphoridae_Lucilia_silvarum_4219901.jpg 450 600 600 450 Rotate 270 CW 8
6: Diptera_Calliphoridae_Pollenia_amentaria_4323790.jpg 450 600 600 450 Rotate 270 CW 8
7: Diptera_Calliphoridae_Protocalliphora_azurea_2347646.jpg 336 448 448 336 Rotate 270 CW 8
8: Diptera_Calliphoridae_Protocalliphora_azurea_2989785.jpg 451 601 601 451 Rotate 270 CW 8
9: Diptera_Empididae_Empis_acinerea_4297830.jpg 452 602 602 452 Rotate 270 CW 8
10: Diptera_Empididae_Empis_acinerea_4297835.jpg 453 605 605 453 Rotate 270 CW 8
11: Diptera_Empididae_Empis_acinerea_4297837.jpg 452 602 602 452 Rotate 270 CW 8
12: Diptera_Fanniidae_Fannia_fuscula_3549203.jpg 452 603 603 452 Rotate 270 CW 8
13: Diptera_Fanniidae_Fannia_fuscula_3549204.jpg 451 601 601 451 Rotate 270 CW 8
14: Diptera_Fanniidae_Fannia_fuscula_3549205.jpg 451 601 601 451 Rotate 270 CW 8
15: Diptera_Fanniidae_Fannia_fuscula_3549206.jpg 452 602 602 452 Rotate 270 CW 8
16: Diptera_Muscidae_Muscina_prolapsa_2987979.jpg 444 592 592 444 Rotate 270 CW 8
17: Diptera_Sarcophagidae_Miltogramma_germari_2413985.jpg 336 448 448 336 Rotate 270 CW 8
18: Diptera_Scathophagidae_Scathophaga_stercoraria_3113147.jpg 530 706 706 530 Rotate 270 CW 8
19: Diptera_Tabanidae_Tabanus_sudeticus_2344115.jpg 336 448 448 336 Rotate 270 CW 8
20: Diptera_Tachinidae_Epicampocera_succincta_2580387.jpg 336 448 448 336 Rotate 90 CW 6
21: Hemiptera_Rhyparochromidae_Scolopostethus_puberulus_1960406292_976888.jpg 480 640 640 480 Rotate 90 CW 6
22: Hymenoptera_Formicidae_Myrmica_ruginodis_1958528150_2689976.jpg 450 599 599 450 Rotate 270 CW 8
23: Lepidoptera_Yponomeutidae_Yponomeuta_plumbella_1963143821_1880879.jpg 541 722 722 541 Rotate 270 CW 8
This is obsolete, there is no bug! See https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484
If the intention was indeed to flip image width with height in YOLOv5, then I think a possible fix for exif_size in utils/dataloaders.py could be something like this:
def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    try:
        # rotation = dict(img._getexif().items())[orientation]  # this doesn't seem to return orientation
        exif_dict = img._getexif()  # get a dictionary of exif attributes from the image
        rotation = exif_dict[274]  # index 274 stands for Orientation
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    except Exception:
        pass
    return s
Or, using PIL.ExifTags, based on this SO answer:
import PIL.ExifTags

def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    try:
        exif = {
            PIL.ExifTags.TAGS[k]: v
            for k, v in img._getexif().items()
            if k in PIL.ExifTags.TAGS
        }
        rotation = exif['Orientation']
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    except Exception:
        pass
    return s
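As a side note, recent Pillow versions also offer a public Image.getexif() method, which avoids the private _getexif(). Below is a hedged sketch under that assumption; exif_size_public and the FakeImage stand-in are hypothetical names, used so the example runs without an image file. It is not the YOLOv5 implementation.

```python
ORIENTATION_TAG = 0x0112  # 274, the EXIF Orientation tag id


def exif_size_public(img):
    """Return (width, height), swapped for EXIF orientation 6 or 8.

    `img` is expected to behave like a PIL image: it needs a `.size`
    tuple and a `.getexif()` method returning a dict-like object.
    """
    width, height = img.size
    rotation = img.getexif().get(ORIENTATION_TAG)  # None if the tag is absent
    if rotation in (6, 8):
        return (height, width)
    return (width, height)


class FakeImage:
    """Stand-in for a PIL image, only for this demonstration."""

    def __init__(self, size, exif):
        self.size = size
        self._exif = exif

    def getexif(self):
        return self._exif


print(exif_size_public(FakeImage((448, 336), {ORIENTATION_TAG: 8})))  # (336, 448)
print(exif_size_public(FakeImage((448, 336), {})))  # (448, 336)
```

Using .get() instead of a blanket try/except means a missing tag is handled quietly, while a genuine programming error would still raise.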
EDIT: There is no bug, see https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484 YOLOv5 & detectron2 have just different ways to read the EXIF orientation.
FYI: YOLOv7 also didn't do anything new about the bug in the line rotation = dict(img._getexif().items())[orientation] in the exif_size function. They have the function in the script utils/datasets.py.
However, the developers of detectron2 might have dedicated more consideration to this. I didn't test it, but it looks like they use the proper index for the exif orientation tag (274 as per https://www.exiv2.org/tags.html). They deal with this in the script detectron2/detectron2/data/detection_utils.py
Hi @stark-t , here is what I found out.
It looks like YOLOv5 uses both PIL and cv2 (OpenCV). What is confusing is that these two libraries can differ when they report image width and height. However, cv2's Image.shape aligns well with what VGG VIA reports. See an example here.
EDIT: YOLOv5 reads the EXIF orientation and will flip width with height accordingly. See https://github.com/stark-t/PAI/issues/24#issuecomment-1261177484
So, if YOLOv5 doesn't transpose the images with exif_transpose because exif_size doesn't catch the orientation (both functions are in utils/general.py), AND it then loads the images for training using cv2.imread(), then I think we are lucky and do not have to adjust the coordinates of the boxes because, in that scenario, they align with the VGG VIA coordinates.
I am just not 100% sure about what is actually happening and I got a bit tired from it :D
If you have some time, could you confirm that YOLOv5 indeed loads the images for training using cv2.imread() and that no transposing operation takes place?
I also tried to remove the orientation attribute from the EXIF metadata and see if that might help/solve the issue, but sadly, this adds further mismatches & confusion. I detailed my trials in this notebook
I revisited the code in YOLOv5 and realized that I had missed a really important piece of information. So, I looked again at dataloaders.py.
So, if I understand correctly now, orientation is a global variable, because this is done before the function in the script (at the top, after the imports and the # Parameters section):
# Get orientation exif tag
for orientation in ExifTags.TAGS.keys():
    if ExifTags.TAGS[orientation] == 'Orientation':
        break
Then YOLOv5 uses the orientation global variable in the exif_size function:
def exif_size(img):
    # Returns exif-corrected PIL size
    s = img.size  # (width, height)
    with contextlib.suppress(Exception):
        rotation = dict(img._getexif().items())[orientation]
        if rotation in [6, 8]:  # rotation 270 or 90
            s = (s[1], s[0])
    return s
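The mechanism can be reproduced in isolation. In the sketch below, TAGS is a small hand-written stand-in for PIL.ExifTags.TAGS (the three entries are real EXIF tag ids, but the dict itself is an assumption made so that the example runs without Pillow), and exif_size_from_dict is a hypothetical name. After the loop breaks, the loop variable keeps its last value at module level, and functions defined in the same module can read it.

```python
# Stand-in for PIL.ExifTags.TAGS, reduced to three entries for illustration
TAGS = {271: "Make", 272: "Model", 274: "Orientation"}

# Same pattern as at the top of dataloaders.py: the loop variable survives the loop
for orientation in TAGS.keys():
    if TAGS[orientation] == "Orientation":
        break


def exif_size_from_dict(size, exif):
    """Swap width and height for orientation 6 or 8, reading the global `orientation`."""
    s = size  # (width, height)
    rotation = exif.get(orientation)
    if rotation in [6, 8]:
        s = (s[1], s[0])
    return s


print(orientation)  # 274
print(exif_size_from_dict((448, 336), {274: 8}))  # (336, 448)
```

This is also why copying only the function body into an interactive session ends in a NameError: the loop that defines orientation has not been run there.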
I was just trying to "debug" without being aware of the bigger picture (that is, global variables), sorry :/
However, there still remains a source of confusion when we create the annotation files with the VGG VIA annotation software. When I ran some tests, it seemed that VGG VIA reports image width and height in accordance with what cv2 reports, but not with how PIL reports them. I do not know for sure if this is a problem or not.
The pipeline that I had in place was to get the image width and height from VGG VIA (with its COCO format export), and together with the bounding boxes created in the same "coordinate system" offered by VGG VIA, I computed the relative coordinates needed for YOLO. I guess that if YOLO does any kind of image rotation or flipping of the width and height, it also rotates/flips the bounding box information associated with that image, and then everything should be ok.
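To make the dependency on image width and height concrete, here is a hedged sketch of the kind of conversion described above. It assumes the COCO convention of [x_min, y_min, box_width, box_height] in pixels and the YOLO convention of normalized (x_center, y_center, width, height). The function name and the sample box are made up for illustration and are not the actual pipeline code.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO pixel box [x_min, y_min, w, h] to normalized YOLO (xc, yc, w, h)."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


box = [84, 112, 168, 224]  # hypothetical box drawn on a 336 x 448 display

as_annotated = coco_to_yolo(box, img_w=336, img_h=448)
with_swapped_dims = coco_to_yolo(box, img_w=448, img_h=336)

print(as_annotated)  # (0.5, 0.5, 0.5, 0.5)
print(with_swapped_dims)  # a different box, because width and height were exchanged
```

The same pixel box yields different relative coordinates once width and height are exchanged, which is why the annotation tool and the training code need to agree on the orientation of each image.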
With that hope in mind, I will close this issue for now. Fingers crossed that it works well for the field images as well.
I thought I had solved this issue, but to my surprise, I am still trapped in the confusion around image orientation. There is also an open issue for YOLO here: Exif Image Orientation. Currently, the function exif_size has been moved to utils/dataloaders.py.
I am worried that part of the errors in IoU could be due to the swap of image width and height. YOLO attempts to swap them as indicated above, when the orientation is 90 or 270 degrees. I also tried to do the same when I built the training dataset with bounding boxes, but today I was surprised that, when I ran some predictions, some images with 90 or 270 degree orientation were OK, but one had a flipped box, and it was unclear why despite trying to take care of the orientation.
So far, what I know is that the annotation tool that we use, VIA VGG, can read the orientation of the image and displays it according to the orientation information from the metadata - see example here. Also, it exports the image statistics as it displays the image on the screen. That means that the exported image statistics should match how YOLO processes them by flipping them.
Edit: VIA VGG "relies on the browser to load the image, according to HTML specification" - see example & discussion here. This will not always correspond to how PIL reads image width and height - see below in this thread.
I think during our evaluation process, we should also extract the image statistics (width, height, and orientation info) and compare them with the metadata that VIA VGG gave when we did the annotation. I already have that, but I didn't check whether it matches how YOLO reads it. I will try to look into this issue.