For Object Detection support the PASCAL VOC format

NVIDIA / DIGITS

Deep Learning GPU Training System

https://developer.nvidia.com/digits

BSD 3-Clause "New" or "Revised" License

For Object Detection support the PASCAL VOC format #879

Open Greendogo opened 8 years ago

Greendogo commented 8 years ago

Since Image-Net already provides so many of their synsets with bounding boxes, it would be pretty helpful if DIGITS would support the XML PASCAL VOC format, which Image-Net uses.

This would require ignoring any classes that the provided files refer to but that are not actually being used (a picture from the airplane synset might contain objects from other synsets, which the DIGITS user might not be including in their locally brewed dataset).

Greendogo commented 8 years ago

Alternatively, you could assume the user wishes to include any classes available in the bounding box files, but you run the risk of someone using a file that has only a small number of references to an incidental class. Perhaps use a threshold for the number of class references necessary for DIGITS to acknowledge the class as important.
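The threshold idea above can be sketched as follows. This is only an illustration, not DIGITS code: it assumes the standard PASCAL VOC layout in which each `<object>` element carries a `<name>` child, and the sample annotations, function names, and the `min_references` parameter are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical in-memory annotations; real ones would be read from .xml files.
SAMPLE_ANNOTATIONS = [
    "<annotation><object><name>airplane</name></object>"
    "<object><name>person</name></object></annotation>",
    "<annotation><object><name>airplane</name></object></annotation>",
    "<annotation><object><name>airplane</name></object>"
    "<object><name>airplane</name></object></annotation>",
]


def count_class_references(xml_documents):
    """Count how many <object> entries name each class."""
    counts = Counter()
    for document in xml_documents:
        root = ET.fromstring(document)
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name.strip()] += 1
    return counts


def select_classes(counts, min_references):
    """Keep only classes referenced at least min_references times."""
    return sorted(name for name, n in counts.items() if n >= min_references)


counts = count_class_references(SAMPLE_ANNOTATIONS)
kept = select_classes(counts, min_references=2)
print(kept)
```

With these samples, "person" is referenced once and is dropped as incidental, while "airplane" is kept.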

gheinrich commented 8 years ago

Hi, I think it's a good idea to support the XML PASCAL VOC format for object detection. The KITTI format may be confusing at times since a lot of the required parameters are not targeted at object detection (e.g. those that are used for 3D pose estimation). We could make it such that .txt files are interpreted as KITTI labels and .xml files are interpreted as PASCAL VOC labels.

Did you get a chance to review this doc? It explains how classes are dealt with. In the generated dataset, all object classes are included. By default, DetectNet only considers objects from class index 1, so no filtering is needed during dataset creation.
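The extension-based dispatch suggested above could look roughly like the sketch below. It is not how DIGITS implements anything: the function names and sample annotation are hypothetical. It assumes the usual PASCAL VOC `<object>/<bndbox>` layout and the 15-field KITTI label line (type, truncated, occluded, alpha, four bbox values, three dimensions, three location values, rotation_y), and it fills the fields that are not relevant to 2D detection with zeros as placeholders.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical annotation with one bounding box.
SAMPLE_VOC = (
    "<annotation><filename>example.jpg</filename>"
    "<object><name>airplane</name><bndbox>"
    "<xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>"
    "</bndbox></object></annotation>"
)


def label_format(path):
    """Guess the label format from the file extension."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".txt":
        return "kitti"
    if extension == ".xml":
        return "voc"
    raise ValueError("Unsupported label file: %s" % path)


def voc_to_kitti_lines(xml_text):
    """Convert each VOC <object> into one KITTI-style label line."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        # Class names containing spaces would need extra handling.
        name = obj.findtext("name").strip()
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Zeros stand in for truncated, occluded, alpha and the 3D fields.
        lines.append(
            f"{name} 0.0 0 0.0 {xmin:.2f} {ymin:.2f} {xmax:.2f} {ymax:.2f} "
            "0.0 0.0 0.0 0.0 0.0 0.0 0.0"
        )
    return lines


kitti_lines = voc_to_kitti_lines(SAMPLE_VOC)
print(kitti_lines[0])
```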

Greendogo commented 8 years ago

So my assumption is that at some point in the future DIGITS will be able to do multi-class/multi-label object detection, so my suggestion about filtering is about that point in time.

Is there an Issue yet for multi-class object detection?

gheinrich commented 8 years ago

Again, all classes are included in the dataset. The filtering, if any, takes place in the network. With DetectNet, this is done there. As of this writing, DetectNet supports only one class.

sulth commented 6 years ago

Hi, I have tried all the approaches proposed for multi-class detection, but DetectNet cannot be tuned even for the KITTI dataset. Does DetectNet really help with multi-class detection?

happygao commented 6 years ago

@Greendogo Hello, I currently want to work with data in the XML format. Are you doing anything with DIGITS to handle the XML format? I think we could talk about it.

ibrahim014 commented 6 years ago

Hi Greg.

I am facing this error while uploading the VOC dataset to DIGITS. ERROR: ValueError: Expect same number of images in feature and label folders (17125!=2913)

I downloaded the dataset from the link that you mention in your tutorial. However, there are only 2913 label images in the dataset. Can you please help me resolve this problem? Thank you.
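The counts in the error suggest that only some of the images have a matching label image. One possible workaround, sketched here as an assumption and not as anything DIGITS does itself, is to pair files by base name and keep only the images that have a label before building the dataset. The file names and the `pair_by_stem` helper are hypothetical.

```python
import os


def pair_by_stem(image_names, label_names):
    """Pair images with labels that share the same base name."""
    labels = {os.path.splitext(name)[0]: name for name in label_names}
    pairs = []
    unlabeled = []
    for image in sorted(image_names):
        stem = os.path.splitext(image)[0]
        if stem in labels:
            pairs.append((image, labels[stem]))
        else:
            unlabeled.append(image)
    return pairs, unlabeled


# Hypothetical file names; in practice these could come from os.listdir().
images = ["2007_000032.jpg", "2007_000033.jpg", "2007_000027.jpg"]
labels = ["2007_000032.png", "2007_000033.png"]
pairs, unlabeled = pair_by_stem(images, labels)
print(len(pairs), "paired,", len(unlabeled), "without a label")
```

Copying only the paired images into a separate feature folder would give both folders the same number of files.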