Reduced image resolution and partial dataset???

VisDrone / DroneVehicle

Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning

Reduced image resolution and partial dataset??? #32

Open mljack opened 2 years ago

mljack commented 2 years ago

The paper claims the dataset has 28,439 image pairs at 840x712 resolution. The GitHub release has only 19,459 image pairs, with a usable resolution of 740x612 because of the 100-pixel white border.

(19,459 / 28,439) × (740 × 612) / (840 × 712) = 51.8%

So the published dataset is only about half the size of the dataset claimed in the paper. To help the paper get cited more and have more impact, please publish the rest of the data so that your work is reproducible.
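The 51.8% figure above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the comment. The constant and function names are made up for illustration, and the counts and resolutions are taken as quoted, not re-measured from the dataset.

```python
# Counts and resolutions exactly as quoted in the comment above (not re-measured).
PAPER_PAIRS = 28_439
PAPER_SIZE = (840, 712)      # (width, height) claimed in the paper
RELEASED_PAIRS = 19_459
RELEASED_SIZE = (740, 612)   # (width, height) after the white border is discounted


def pixel_volume(pairs, size):
    """Total pixel count for `pairs` images of the given (width, height)."""
    width, height = size
    return pairs * width * height


def released_fraction():
    """Released pixel volume as a fraction of the volume claimed in the paper."""
    return pixel_volume(RELEASED_PAIRS, RELEASED_SIZE) / pixel_volume(PAPER_PAIRS, PAPER_SIZE)


if __name__ == "__main__":
    print(f"pair ratio:  {RELEASED_PAIRS / PAPER_PAIRS:.1%}")
    print(f"pixel ratio: {pixel_volume(1, RELEASED_SIZE) / pixel_volume(1, PAPER_SIZE):.1%}")
    print(f"combined:    {released_fraction():.1%}")
```

Splitting the result into a pair ratio and a per-image pixel ratio shows how much of the shortfall comes from missing images and how much from the smaller usable area.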

TheMadScientiist commented 2 years ago

@mljack were you able to get the dataset?

mljack commented 2 years ago

> @mljack were you able to get the dataset?

I got the files mentioned above from the BaiduYun link in README.md.

I suggest you have a look at this dataset. It contains more high-quality images from a wide range of viewing angles, locations, and altitudes: VSAI: A Multi-View Dataset for Vehicle Detection in Complex Scenarios Using Aerial Images https://www.kaggle.com/datasets/dronevision/vsaiv1

TheMadScientiist commented 2 years ago

@mljack Did you train on both?

mljack commented 2 years ago

Nope~ but I did review the photos and labels in these datasets to see whether they are suitable for training a vehicle detector for aerial images.

mljack commented 2 years ago

With the updated test set in README.md, the total number of image pairs is 28,439, which matches the number given in the paper. There are some other issues:

a. Images are sub/up-sampled to fit the resolution. However, the sub/up-sampling method introduces aliasing on edges. The image quality in this dataset is quite poor compared with my test images from DJI drones (at a similar resolution, using linear/cubic interpolation for up/sub-sampling).

b. The daytime/night distribution is quite unbalanced. My rough count of daytime images without fog is 4,082, which is about 14% of the whole dataset.
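To illustrate the aliasing mentioned in point a, here is a minimal pure-Python sketch on a synthetic row of pixels. The thread does not say which resampling method the dataset authors used, so this is not a reconstruction of their pipeline. It only contrasts plain decimation (no filtering) with a simple box average, which stands in here for a properly filtered resize. The function names are hypothetical.

```python
def decimate(signal, factor):
    """Downsample by keeping every `factor`-th sample, with no low-pass filter."""
    return signal[::factor]


def box_downsample(signal, factor):
    """Downsample by averaging each run of `factor` samples (a crude low-pass filter)."""
    usable = len(signal) - len(signal) % factor  # drop a trailing partial run
    return [sum(signal[i:i + factor]) / factor for i in range(0, usable, factor)]


# One synthetic row of pixels with a fine stripe pattern: 0, 255, 0, 255, ...
row = [0, 255] * 8

if __name__ == "__main__":
    print(decimate(row, 2))        # all 0: the stripes alias into a flat black row
    print(box_downsample(row, 2))  # all 127.5: the average grey of the stripes
```

Decimation turns the stripes into a solid black row, which misrepresents the original content. Averaging first preserves the mean brightness. On real images the same effect appears as jagged or shimmering edges.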