open-AIMS / ozfish

Public dataset of Australian fish species for advancing machine learning research

Issues with bounding box annotations #5

Open BenMaslen opened 2 years ago

BenMaslen commented 2 years ago

Hi, I have found major problems with the bounding box annotations for this dataset. After inspecting only 21 frames, I found 115 false positives and 86 false negatives in the annotations, which makes this dataset unusable for our purposes without proper cleaning.

From my initial inspection, the majority of these discrepancies arise from 2 main issues:

  1. Many of the errors could have arisen if the annotators looked only at the reference frame, rather than at a video snippet around that frame, to confirm that the objects they drew bounding boxes around were indeed fish. By looking at video snippets for the above 21 frames, I was able to identify false positives and false negatives that the annotators did not pick up.

  2. The other issue is the scale at which an object should be considered a fish. Whether smaller or far-away fish are included in the annotations seems inconsistent across frames.
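
One rough way to screen for the second issue is to compare, frame by frame, how many annotated boxes are small. The sketch below is only an illustration and not part of the dataset tooling: the `(x_min, y_min, x_max, y_max)` pixel box format, the function names, and the `max_area` threshold are all assumptions.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of a box in square pixels (0 for degenerate boxes)."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def small_box_fraction(boxes: List[Box], max_area: float = 400.0) -> float:
    """Fraction of boxes whose area is at most max_area.

    max_area is a hypothetical threshold; choose it for your image size.
    """
    if not boxes:
        return 0.0
    small = sum(1 for b in boxes if box_area(b) <= max_area)
    return small / len(boxes)
```

A frame where this fraction is zero while visually similar frames have many small boxes may be worth a manual check. It is only a heuristic, since some scenes genuinely contain no small or distant fish.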

I am wondering whether these frames will be cleaned as a part of the BRII project and if this repository will be updated accordingly? I am happy to share more details of the frames I investigated if it is of use for data cleaning purposes.

The frames I have been using are found in FDFML/Labelled/Frames, and I was using the labels in FDFML/Labelled/speciesboxes. I note that the same issues exist for the labels in FDFML/Labelled/Manifests; however, there are differences in the labels for frames coming from videos of group E (I will make a separate post about this).

BenMaslen commented 2 years ago

Hi @threehundred, to give more context to the above issue, below is a link to a list of the frames that I investigated: https://docs.google.com/spreadsheets/d/1LFyp0ZaHAqiQDtCS5fpYY6pm_x5K-PaI56NmHyMHCYU/edit?usp=sharing

As an example of the first main issue I have discussed above, here are the annotations from Frame A000014_L.avi.9598.png: image

In the above annotations you will notice a large number of white, fish-shaped objects which, when you watch the video, you will realise are not actually fish. The cleaned annotations below remove 66 falsely identified 'fish' annotations: image

Below is a video snippet around frame A000014_L.avi.9598.png, where this becomes quite clear:

https://youtu.be/sARpFo_NVbY

BenMaslen commented 2 years ago

Below is an example of the second issue I pointed out in my original post: it is inconsistent across frames whether small or far-away fish have bounding boxes or not. In the above post for frame A000014_L.avi.9598.png, you will notice that the small and far-away fish are annotated.

However, when observing the bounding boxes for frame A000010_L.avi.43493.png you will notice that they are not: image

Upon cleaning this image, I found 3 false positives and 39 false negatives, resulting in the below cleaned annotations where small and far away fish do have bounding boxes: image
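
For anyone wanting to reproduce counts like these from two sets of labels, below is a minimal sketch of one possible approach: greedily match each original box to a cleaned box by intersection over union (IoU), then count the leftovers. This is not necessarily how the numbers above were obtained. The `(x_min, y_min, x_max, y_max)` box format, the function names, and the 0.5 IoU threshold are assumptions.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def count_errors(original: List[Box], cleaned: List[Box],
                 threshold: float = 0.5) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) of `original`,
    treating `cleaned` as the reference annotations."""
    unmatched = list(cleaned)
    false_positives = 0
    for box in original:
        # Greedy matching: take the best-overlapping unmatched cleaned box.
        best = max(unmatched, key=lambda c: iou(box, c), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)
        else:
            false_positives += 1  # original box has no cleaned counterpart
    # Cleaned boxes that were never matched are missing from the original.
    return false_positives, len(unmatched)


# Toy example with made-up coordinates (not taken from the dataset).
original = [(0, 0, 10, 10), (50, 50, 60, 60)]
cleaned = [(1, 1, 10, 10), (100, 100, 110, 110), (200, 200, 205, 205)]
fp, fn = count_errors(original, cleaned)
```

Greedy matching is simple but order-dependent. In crowded frames, an optimal assignment (for example, the Hungarian algorithm) may give slightly different counts.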