Erotemic / shitspotter

An open source algorithm and dataset for finding poop in pictures. A work in progress.

COCO Annotations - `exif_ori` and `bbox` annotations may not be respected? #18

Closed: njho closed this issue 4 months ago

njho commented 4 months ago

Hey!

Quick question: I took a look at geowatch and was curious whether the `exif_ori` in data.kwcoco.json is being respected. I couldn't find anywhere that it is used.

I think this is problematic because any tool that respects EXIF orientation on load would see the image rotated relative to the stored annotations.

My intent is to:

  1. Verify that I'm not missing something here
  2. Make a PR that prepares the data, rotating the images alongside any other data prep required, so the dataset can be used in an off-the-shelf COCO pipeline

As well, a quick question while you're here: I see most standard COCO datasets are split into train2017 and val2017 in explicit folders. Was the train/val/test split explicitly prescribed by COCO? I guess this is not necessary in our case, since we can do the train/test split programmatically, with pointers to the appropriate file locations, rather than explicitly creating those directories?

njho commented 4 months ago

Nm I'm dumb.

Everything's perfectly fine! exif_ori is for information only, and all annotations are great!

Updated: Please see continued convo below

Erotemic commented 4 months ago

As well, a quick question while you're here: I see most standard COCO datasets are split into train2017 and val2017 in explicit folders. Was the train/val/test split explicitly prescribed by COCO?

That's because people write bad code and hard-code paths to train/val. I aim to avoid that. Instead, each split is represented by a single manifest file. These are created by logic in shitspotter/make_splits.py, and I may change details of the splits in the future (data from validation might move into train, but I will never move a train image to validation). Another reason to use manifests as the single source of truth for train/vali splits is that it makes it easier to support multiple different splits for cross-validation, although I don't currently have plans to do anything with that. Of course, you can always make your own custom splits; the logic in the aforementioned file should make that easier.
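
For example, a minimal sketch of a custom split using the kwcoco Python API (the 80/20 ratio and output names here are hypothetical; the real split logic lives in shitspotter/make_splits.py):

    import kwcoco

    dset = kwcoco.CocoDataset('data.kwcoco.json')
    gids = sorted(dset.imgs.keys())

    # Naive 80/20 split by image id (illustrative only)
    n_train = int(0.8 * len(gids))
    train_dset = dset.subset(gids[:n_train])
    vali_dset = dset.subset(gids[n_train:])

    # Write each split out as its own manifest
    train_dset.fpath = 'custom_train.kwcoco.json'
    vali_dset.fpath = 'custom_vali.kwcoco.json'
    train_dset.dump()
    vali_dset.dump()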

The latest dataset upload should contain a train.kwcoco.zip and vali.kwcoco.zip that point to the items belonging to the appropriate split. There is currently no held-out test set; training and validation are still a bit data-light, so I'm just evaluating on validation for now.

Something to note: the .kwcoco.zip files don't need to be decompressed, the kwcoco library will automatically handle that when you load them.

Tip: if you have run `pip install kwcoco`, then try the `kwcoco` command line tool:

# Report details on each split
kwcoco stats train.kwcoco.zip
kwcoco stats vali.kwcoco.zip

# Display annotated images (requires optional kwcoco dependencies like PyQt5: `pip install kwcoco[optional]`)
kwcoco show vali.kwcoco.zip
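
The same stats are also available from Python; a minimal sketch, assuming kwcoco is installed:

    import kwcoco

    # The zip manifests load transparently, no manual decompression needed
    train_dset = kwcoco.CocoDataset('train.kwcoco.zip')
    vali_dset = kwcoco.CocoDataset('vali.kwcoco.zip')
    print(train_dset.basic_stats())
    print(vali_dset.basic_stats())
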
njho commented 4 months ago

Something to note: the .kwcoco.zip files don't need to be decompressed, the kwcoco library will automatically handle that when you load them.

Very interesting note, thank you. Also thanks for taking the time to highlight the info about the hardcoded train/val splits.

Earlier last week I tried a hail-mary training run using YOLOX and it didn't converge 😸 (not really surprising on a first go). That said, I've been taking a look at the data and I do think there might be an issue.

Please bear with me as I explain, and afterwards would you be able to validate that I'm not imagining things?

Example: assets/poop-2020-12-28/PXL_20201113_124306491.jpg

# From the data.kwcoco.json
    {
      "id": 7,
      "file_name": "assets/poop-2020-12-28/PXL_20201113_124306491.jpg",
      "name": "PXL_20201113_124306491",
      "nbytes": 5734078,
      "datetime": "2020-11-13T07:43:06",
      "geos_point": {
        "type": "Point",
        "coordinates": [
          "-26555753/360000",
          "7678247/180000"
        ],
        "properties": {
          "crs": "CRS84"
        }
      },
      "width": 4032,
      "height": 3024
    },
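
As an aside, the geos_point coordinates above are stored as rational strings; a quick sketch of decoding them into decimal degrees (assuming the fraction convention shown, with CRS84 implying lon/lat order):

    from fractions import Fraction

    coords = ["-26555753/360000", "7678247/180000"]
    lon, lat = (float(Fraction(s)) for s in coords)
    print(lon, lat)  # roughly -73.766, 42.657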

Viewing the Data Using FiftyOne

Viewing the data using FiftyOne, it seems like the annotations matched up with the photo. I assume FiftyOne is generally correct, since it's widely recommended, and I also validated it against the COCO dog bounding boxes, etc.

[Image: FiftyOne Data Viewer]

Taking A Look with Python

So I took it over to Pillow to validate. The EXIF Orientation tag on the image is 1 (so no rotation required):

    from PIL import Image, ExifTags

    image = Image.open(f"{data_path}/{this_image['file_name']}")

    # Find the numeric EXIF tag id for "Orientation" (it is 274)
    for orientation in ExifTags.TAGS.keys():
        if ExifTags.TAGS[orientation] == "Orientation":
            break

    print("This is the exif data")
    exif = image._getexif()
    print(f"Orientation: {exif[orientation]}")
>> Orientation: 1
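
Note that newer Pillow versions expose this through a public API, which avoids the private _getexif method; a minimal sketch (using the same data_path / this_image variables as above):

    from PIL import Image, ImageOps

    image = Image.open(f"{data_path}/{this_image['file_name']}")

    # Public API: Orientation is EXIF tag 274; default to 1 when absent
    orientation = image.getexif().get(274, 1)
    print(f"Orientation: {orientation}")

    # Apply (and clear) any EXIF orientation, returning an upright copy
    upright = ImageOps.exif_transpose(image)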

Negating the `exif_ori` seen in data.kwcoco.json

I found that if I negate the rotation given in data.kwcoco.json, the bounding boxes match up. So where exif_ori = 6.0 would typically mean rotating CW 270, we instead rotate CW -270 (i.e. CCW 270), and then it appears correctly:

[Image: Pillow output after rotating CW -270]

Questions:

I'll be happy to make a PR with a script that accomplishes this if it's needed for non-geowatch workflows. Happy Saturday! 🎈 🎉

Erotemic commented 4 months ago

The problem is that FiftyOne is respecting the EXIF data, whereas my pipeline assumes the annotations are in the raw image space (not the rotated EXIF space). The main driver I use to read images is kwimage.imread, which is a frontend for things like cv2.imread or GDAL's image reader. These do not do any postprocessing based on EXIF, so that's the same space I keep my annotations in.

The labelme tool that I used to annotate does store annotations in EXIF space, so I needed to undo that when I gather them into a kwcoco file; see https://github.com/Erotemic/shitspotter/blob/main/shitspotter/gather.py#L226.

If you reran that labelme->kwcoco process but without the EXIF handling, then perhaps that would allow you to use the annotations as input to another data pipeline that respects EXIF on image reads.

This issue raises an interesting dilemma: is there a way to make it obvious or configurable which space the images should be loaded in, or which space the annotations are stored in? It's tricky: what if you are using an annotator that does the EXIF postprocessing, but your training pipeline doesn't? My training pipeline ignores EXIF, and I imagine many off-the-shelf tools that use cv2.imread will as well. There isn't really one good way to store the data, because it requires that tools interpret it correctly, and there are conflicting interpretations.

I'm wondering if a possible mitigation is to add an option on each annotation to denote the space it was annotated in, and then give the user a way to convert the kwcoco file itself so the annotations are stored in either EXIF space or raw space (or, more generally, any other space). Then it's just a matter of making sure you "conform" your kwcoco file to the tool you are using.

If you could write a proof-of-concept that reads in a kwcoco file and then writes out a variant where the annotations conform to EXIF space, that would be a helpful starting point.
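
For concreteness, here is a rough sketch of what such a conversion could look like, using only public kwcoco structures (it handles just the pure-rotation orientations 3, 6, and 8, assumes xywh boxes, and the output filename is hypothetical):

    import kwcoco

    dset = kwcoco.CocoDataset('data.kwcoco.json')

    for img in dset.dataset['images']:
        ori = int(img.get('exif_ori') or 1)
        if ori == 1:
            continue
        W, H = img['width'], img['height']
        for aid in dset.index.gid_to_aids[img['id']]:
            ann = dset.anns[aid]
            x, y, w, h = ann['bbox']
            if ori == 3:    # display = raw rotated 180 degrees
                ann['bbox'] = [W - (x + w), H - (y + h), w, h]
            elif ori == 6:  # display = raw rotated 90 degrees CW
                ann['bbox'] = [H - (y + h), x, h, w]
            elif ori == 8:  # display = raw rotated 90 degrees CCW
                ann['bbox'] = [y, W - (x + w), h, w]
        if ori in (6, 8):   # width/height swap under 90-degree rotations
            img['width'], img['height'] = H, W

    dset.fpath = 'data_exifspace.kwcoco.json'  # hypothetical output name
    dset.dump()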

njho commented 4 months ago

Yes! After some more investigation, it seems like saving the files in their native orientation and removing the EXIF makes them appear correctly in the tool.

The second question is whether my training pipeline respects the EXIF somehow...

If you could write a proof-of-concept that reads in a kwcoco file and then writes out a variant where the annotations conform to EXIF space, that would be a helpful starting point.

After evaluating the above I'll take a look at this comment as well. Thank you!

njho commented 4 months ago

So the training converged. I'm not sure if it's because I rotated the images or because of the library change. I did end up stripping the EXIF data, but I also changed training libraries at the same time, so I can't be completely sure what ultimately allowed it to converge.