Open · bkkm78 opened this issue 1 year ago
Here is a list of potential duplicate images I found in the dataset: https://gist.github.com/bkkm78/95fb4faf9ca8303005349a5c396af3c0
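For anyone who wants to reproduce this kind of check, a common approach for surfacing resized duplicates is a perceptual (average) hash. Below is a minimal, dependency-free sketch: the `average_hash` and `hamming` helpers and the toy 2D-list "images" are illustrative stand-ins, not part of the dataset tooling; a real pipeline would load pixels with an image library instead.

```python
# Hypothetical sketch: detecting resized duplicates with a simple average hash.
# Images are represented as 2D lists of grayscale values (stand-ins for real
# pixel data loaded from disk).

def average_hash(img, hash_size=8):
    """Box-average img down to hash_size x hash_size cells, then threshold
    each cell against the global mean to get a 64-bit fingerprint."""
    h, w = len(img), len(img[0])
    cells = []
    for i in range(hash_size):
        for j in range(hash_size):
            # Pixel block covered by cell (i, j).
            r0, r1 = i * h // hash_size, (i + 1) * h // hash_size
            c0, c1 = j * w // hash_size, (j + 1) * w // hash_size
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if v >= mean else 0 for v in cells]

def hamming(a, b):
    """Number of differing bits; small distances suggest near-duplicates."""
    return sum(x != y for x, y in zip(a, b))

# A toy 16x16 "image" with a bright square in the top-left corner,
# and a 2x-downscaled (8x8) copy of it.
big = [[255 if r < 8 and c < 8 else 0 for c in range(16)] for r in range(16)]
small = [[255 if r < 4 and c < 4 else 0 for c in range(8)] for r in range(8)]

print(hamming(average_hash(big), average_hash(small)))  # 0: the resized copy hashes identically
```

Because the hash is computed on a fixed-size downsampled grid, resizing leaves the fingerprint (nearly) unchanged, which is exactly the duplicate pattern reported here.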
@bkkm78 Thanks a lot for reporting this. The reported issues have been added to my schedule and I need some time to fix them. About the issues:
Thanks a lot for being helpful. Let me know if you'd like to make a further report or start a discussion.
Best, Yu
@Arthur151 Thank you for your reply! Here is a list of images that may contain incomplete annotations. https://gist.github.com/bkkm78/e38d089a0cd833bf793c4fb2da7102c1
This list may not be complete, but may be helpful as a starting point. (Being exhaustive is indeed difficult. :))
It may also be helpful if you could release the metadata for each image, such as the source dataset from which it was collected.
@bkkm78 Thanks for your efforts! The image list would be very helpful!
The image names of the different datasets are quite easy to tell apart. For example, CrowdPose image names are 6-digit numbers starting with 1, like 1xxxxx.jpg. The names of images we collected from InterNet are 7-digit numbers. OCHuman image names are 6-digit numbers starting with 0. Something like that.
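The naming scheme above can be turned into a small lookup helper. This is a sketch based only on the conventions described in this thread (6 digits starting with 1 for CrowdPose, 7 digits for InterNet, 6 digits starting with 0 for OCHuman); the `source_dataset` function is hypothetical, not an official part of the dataset.

```python
# Hypothetical helper inferring the source dataset of an image from its file
# name, using the naming scheme described in this thread (an assumption, not
# documented dataset metadata).

def source_dataset(name):
    stem = name.rsplit(".", 1)[0]
    if not stem.isdigit():
        return "unknown"
    if len(stem) == 7:
        return "InterNet"
    if len(stem) == 6 and stem[0] == "1":
        return "CrowdPose"
    if len(stem) == 6 and stem[0] == "0":
        return "OCHuman"
    return "unknown"

print(source_dataset("105520.jpg"))  # CrowdPose
print(source_dataset("000154.jpg"))  # OCHuman
```

If the scheme holds, this would let the reported per-image lists be grouped by source dataset automatically.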
Thanks for the effort to create this dataset!
When inspecting the annotations, I found some quality issues. For some images, the annotations do not seem to be exhaustive: clearly visible persons in the foreground are missing from the annotations, such as the person on the left in the following image (105520.jpg).

There are also cases where the full bounding-box annotation covers only part of the person, even though other parts are clearly visible, such as the old man riding a horse in this image (100134.jpg).

There also seems to be overlap between the training set and the eval/test set. For example, 109136.jpg in the validation set appears to be a resized version of 000154.jpg in the training set.

Would the authors mind looking into these issues? Thanks!