RaisnnowLawrence opened 5 months ago
Hi @RaisnnowLawrence --- thanks for raising this issue!
I don't have full context at the moment, since it's been a while since we collected these annotations. Can you provide a bit more detail about where these image sizes come from? Is it the case that we provided one image size in our annotations, while the raw images are a different size? If the aspect ratios are consistent, a scaling operation should suffice to align them.
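For concreteness, the scaling operation mentioned above could look something like this. This is just a sketch (not code from the repo), assuming the annotation's bbox fields (`left`, `top`, `width`, `height`) and matching aspect ratios between the annotated and on-disk sizes:

```python
# Hypothetical sketch: rescale bboxes from the size recorded in the
# annotation to the actual on-disk image size. Assumes both sizes are
# given as (width, height) and that the aspect ratios match.
def rescale_bboxes(bboxes, annotated_wh, actual_wh):
    sx = actual_wh[0] / annotated_wh[0]  # horizontal scale factor
    sy = actual_wh[1] / annotated_wh[1]  # vertical scale factor
    out = []
    for b in bboxes:
        out.append({
            "left": round(b["left"] * sx),
            "top": round(b["top"] * sy),
            "width": round(b["width"] * sx),
            "height": round(b["height"] * sy),
        })
    return out

# Example: annotation recorded at 854x480, actual image is 1280x720.
print(rescale_bboxes(
    [{"left": 533, "top": 176, "width": 124, "height": 107}],
    (854, 480), (1280, 720)))
```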
Hi @jmhessel -- thank you for your reply.
I looked at a typical annotation:
```json
{
  "inputs": {
    "bboxes": [
      {
        "height": 107,
        "left": 533,
        "top": 176,
        "width": 124
      },
      {
        "height": 134,
        "left": 930,
        "top": 182,
        "width": 70
      }
    ],
    "clue": "police gear hanging up.",
    "confidence": 2.0,
    "image": {
      "height": 480,
      "url": "http://s3-us-west-2.amazonaws.com/ai2-rowanz/vcr1images/movieclips_Eye_See_You/PzZHiGYjhss@23.jpg",
      "width": 854
    },
    "obs_idx": 2
  },
  "instance_id": "e4ac2e273254069f8b7ff952dd394bef",
  "targets": {
    "inference": "this room is in a police headquaters."
  }
}
```
It's from the Sherlock training corpus v1.1.
In the second bbox annotation (`"left": 930, "width": 70`), the box starts beyond the annotated image width (`"width": 854`).
And the actual image size on disk is (720, 1280, 3).
I don't know how to handle annotations like this, and there are quite a lot of them.
Looking forward to your reply.
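To make the problem concrete, here is a minimal check (my own sketch, using the field names from the annotation above) that flags boxes extending past the image dimensions recorded in the annotation itself:

```python
# Hypothetical sketch: return the indices of bboxes that extend past the
# image width/height recorded in the same annotation record.
def bbox_out_of_bounds(ann):
    img = ann["inputs"]["image"]
    bad = []
    for i, b in enumerate(ann["inputs"]["bboxes"]):
        if (b["left"] + b["width"] > img["width"]
                or b["top"] + b["height"] > img["height"]):
            bad.append(i)
    return bad

# The annotation quoted above, reduced to the relevant fields.
ann = {"inputs": {
    "image": {"width": 854, "height": 480},
    "bboxes": [{"left": 533, "top": 176, "width": 124, "height": 107},
               {"left": 930, "top": 182, "width": 70, "height": 134}]}}
print(bbox_out_of_bounds(ann))  # [1]: the second box exceeds width 854
```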
Hi @RaisnnowLawrence ,
Thanks for raising this issue! Getting back quickly --- here is the rendered example:
https://storage.googleapis.com/ai2-jack-public/sherlock_mturk/images_with_bboxes/e4ac2e273254069f8b7ff952dd394bef.jpg
This specific example seems to be off, and might have been the result of an annotator moving quickly through our interface. To check whether there is a systematic issue with the corpus, my co-first-author @jenahwang and I went through 200 random validation examples, spanning 100 VG images and 100 VCR images. We rated them as "good", "meh", or "bad": 97% (194/200) were "good", 2.5% (5/200) were "meh", and 0.5% (1/200) were "bad". (Had the sample you provided been in our set, it would have been rated "bad".) To summarize: the vast majority of annotations look fine, with a small fraction that are off.
Yes, for most examples the wrong size information has no practical impact. But when I use the dataset, some bboxes go out of range, which makes those annotations unusable. Looking forward to your solution. Thank you for your reply.
Hey!
To clarify my points from above:
Given those two responses, I'm not sure what else I can help you with. But do let me know and I'm happy to take a look!
I'm trying to crop annotated areas of an image, so these incorrect annotations are bothering me a bit. Another problem arises if you use the size of the image itself, as shown below.
```json
{
  "inputs": {
    "bboxes": [
      {
        "height": 697,
        "left": 17,
        "top": 215,
        "width": 1903
      }
    ],
    "clue": "train going by near building",
    "confidence": 3.0,
    "image": {
      "height": 1080,
      "url": "http://s3-us-west-2.amazonaws.com/ai2-rowanz/vcr1images/movieclips_Hostel/NVB5kj4k6O4@1.jpg",
      "width": 1920
    },
    "obs_idx": 0
  },
  "instance_id": "fba83958dca20b1f6a9f7b86a4b4a34d",
  "targets": {
    "inference": "this is a train station"
  }
}
```
Here the image size and the bbox are annotated against a 1080x1920 image, but the actual image on disk is 720x1280. Since the annotated bbox width is 1903, the bbox only makes sense relative to the annotated size, not the actual one. This creates a problem: for a given example, it's unclear whether to trust the actual size or the annotated size.
I went through all the data and found 2654 examples whose annotated size does not match the actual image size. As you said, the problematic data is only a small part, but I still hope this issue can be properly resolved. I have attached these statistics, hoping they will help in your subsequent work. Thank you for your reply, and wish you a happy life. height_and_width_wrong.txt
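For reference, a tally like the one above could be computed along these lines. This is only a sketch: `load_size` is a hypothetical stand-in for reading the real image dimensions (e.g. `PIL.Image.open(path).size`), and the stub loader below is purely for illustration:

```python
# Hypothetical sketch: collect instance_ids whose annotated image size
# disagrees with the actual on-disk size. `load_size` maps an image
# reference to its real (width, height).
def count_size_mismatches(annotations, load_size):
    mismatched = []
    for ann in annotations:
        img = ann["inputs"]["image"]
        actual_w, actual_h = load_size(img["url"])
        if (actual_w, actual_h) != (img["width"], img["height"]):
            mismatched.append(ann["instance_id"])
    return mismatched

# Stub loader for illustration: pretend every image is really 1280x720.
anns = [
    {"instance_id": "e4ac2e273254069f8b7ff952dd394bef",
     "inputs": {"image": {"width": 854, "height": 480, "url": "a.jpg"}}},
    {"instance_id": "some_consistent_instance",
     "inputs": {"image": {"width": 1280, "height": 720, "url": "b.jpg"}}},
]
print(count_size_mismatches(anns, lambda url: (1280, 720)))
```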
Thank you! This makes sense --- there seem to be ~1% of cases where the bounding box is potentially out of bounds, even when ignoring the height/width in the dataset files we distribute and using the original images instead. Let me run these cases by @jenahwang and get back to you.
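In the meantime, one possible workaround for that ~1% (my own sketch, not an official fix) is to clamp each box to the image bounds so cropping never fails, at the cost of slightly distorting the few bad boxes:

```python
# Hypothetical sketch: clamp a bbox to [0, img_w) x [0, img_h) so that a
# subsequent crop is always valid. Boxes entirely out of bounds collapse
# to a thin sliver at the edge, which callers may prefer to filter out.
def clamp_bbox(b, img_w, img_h):
    left = max(0, min(b["left"], img_w - 1))
    top = max(0, min(b["top"], img_h - 1))
    width = max(1, min(b["width"], img_w - left))
    height = max(1, min(b["height"], img_h - top))
    return {"left": left, "top": top, "width": width, "height": height}

# The out-of-range box from earlier in the thread, against an 854x480 image.
print(clamp_bbox({"left": 930, "top": 182, "width": 70, "height": 134}, 854, 480))
```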
Hello, I found that the annotated size of some images is inconsistent with the actual size, which causes serious problems when using the bboxes to process the images. Can this problem be solved? For example:
```
94e50aa85ac6bf99fe46206963c759be sherlock/Datasets/SherlockPack/vcr1images/movieclips_Eye_See_You/PzZHiGYjhss@23.jpg (720, 1280, 3) 480 854 3
6f87373795ef51c1efdeb193734c14a2 sherlock/Datasets/SherlockPack/vcr1images/movieclips_Eye_See_You/PzZHiGYjhss@23.jpg (720, 1280, 3) 480 854 3
e4ac2e273254069f8b7ff952dd394bef sherlock/Datasets/SherlockPack/vcr1images/movieclips_Eye_See_You/PzZHiGYjhss@23.jpg (720, 1280, 3) 480 854 3
```
Thank you.