weiliu89 / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Training and Testing SSD via new dataset #331

Closed eonurk closed 5 years ago

eonurk commented 7 years ago

Hello, I am a newbie to SSD. I trained Faster R-CNN (on a GTX 1070) and tested it (on a GTX 650M) using the INRIA Person example. However, it was too slow on my GTX 650M to cope with real time. Are there step-by-step instructions like that for SSD, so that I can train and test it with my own dataset?

Any help would be greatly appreciated! :)

ThomasDelteil commented 7 years ago

I would be interested as well. I have a dataset consisting of .jpg images of different sizes, with boxes and annotations in .tsv format. I managed to install SSD and run it on the default dataset, but I would greatly appreciate some step-by-step instructions on how to format my own data properly, and which paths need to be modified (and where) so that the model picks everything up. Ideally I would like to start with a pre-trained model and fine-tune it on my classes rather than training from scratch.

eonurk commented 7 years ago

@weiliu89 can you help me please?

imserkan commented 7 years ago

@weiliu89 I also need help about the situation. Thanks :)

weiliu89 commented 7 years ago

@eonurk @ThomasDelteil @imserkan Please refer to the README.md for details on how I train an SSD model on the VOC dataset; you can essentially follow the same steps. You can also refer to data/coco and data/ILSVRC2016 for how I train SSD models on the COCO and ILSVRC DET datasets. The basic steps are similar to preparing data for training a classification model:

  1. Create a file list which contains the image_path and annotation_path, as illustrated here for VOC or here for COCO.

  2. Create a labelmap file, such as this one. You can refer to this code for how to create one for your own dataset.

  3. Create an lmdb file to store the images and annotations. You can refer to this script.

  4. After this step, I would encourage you to use this to make sure your dataset is created correctly. Make sure to change the model file and labelmap file to your own model and data.

  5. Then you can refer to the various scripts here for how to define the SSD model structure.

  6. The training procedure is similar to training a classification model; you can monitor the mAP (i.e. search for detection_eval in the .log file) as you brew the model.
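For step 6, a small helper to pull the detection_eval values out of a training log might look like the sketch below. The exact log-line prefix is an assumption based on typical Caffe solver output and may differ in your version:

```python
import re

def parse_detection_eval(log_text):
    """Extract detection_eval (mAP) values from Caffe training log text.

    Caffe's solver typically prints lines like:
        Test net output #0: detection_eval = 0.713
    (the exact prefix may vary with Caffe version and logging settings).
    """
    pattern = re.compile(r"detection_eval\s*=\s*([0-9.]+)")
    return [float(m.group(1)) for m in pattern.finditer(log_text)]

sample = """\
I0101 12:00:00.000000  1234 solver.cpp:433] Test net output #0: detection_eval = 0.613
I0101 13:00:00.000000  1234 solver.cpp:433] Test net output #0: detection_eval = 0.702
"""
print(parse_detection_eval(sample))  # -> [0.613, 0.702]
```

Plotting the returned list over training time gives a quick view of whether the mAP is still improving.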

The rest is up to you: changing the model architecture and parameters to suit your own dataset and problem.
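The file list in step 1 is easy to script. A minimal sketch, assuming a VOC-style layout with images in one directory and per-image xml annotations in another (the directory names and extensions here are placeholders for your own dataset):

```python
import os

def write_file_list(image_dir, anno_dir, out_path,
                    image_ext=".jpg", anno_ext=".xml"):
    """Write lines of "image_path annotation_path" for every image that has
    a matching annotation file, one pair per line, as the SSD data scripts
    expect for their input file lists."""
    with open(out_path, "w") as f:
        for name in sorted(os.listdir(image_dir)):
            stem, ext = os.path.splitext(name)
            if ext != image_ext:
                continue  # skip non-image files
            anno_path = os.path.join(anno_dir, stem + anno_ext)
            if os.path.exists(anno_path):  # only keep annotated images
                f.write("%s %s\n" % (os.path.join(image_dir, name), anno_path))

# Example usage (paths are placeholders):
# write_file_list("VOCdevkit/VOC2007/JPEGImages",
#                 "VOCdevkit/VOC2007/Annotations",
#                 "trainval.txt")
```

Depending on your setup you may want the paths relative to a data root rather than absolute; check the file lists shipped with data/VOC0712 for the exact convention.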

lucasjinreal commented 7 years ago

@weiliu89 But why does generating the list require the xml format? With xml, every dataset may have different nodes. Why not just use a single txt file containing the boxes, labels, and image names?

weiliu89 commented 7 years ago

@jinfagang Because the VOC annotations are saved in xml format. There is also a function which can read a txt file as you mentioned.

lucasjinreal commented 7 years ago

@weiliu89 Thanks, this code helped me figure out how SSD reads labels and bboxes, but I still have some questions about the AnnotatedDatum data structure:

  1. In src/proto/caffe.proto, SSD defines these data structures:

    message NormalizedBBox {
      optional float xmin = 1;
      optional float ymin = 2;
      optional float xmax = 3;
      optional float ymax = 4;
      optional int32 label = 5;
      optional bool difficult = 6;
      optional float score = 7;
      optional float size = 8;
    }

    // Annotation for each object instance.
    message Annotation {
      optional int32 instance_id = 1 [default = 0];
      optional NormalizedBBox bbox = 2;
    }

    // Group of annotations for a particular label.
    message AnnotationGroup {
      optional int32 group_label = 1;
      repeated Annotation annotation = 2;
    }

    // An extension of Datum which contains "rich" annotations.
    message AnnotatedDatum {
      enum AnnotationType {
        BBOX = 0;
      }
      optional Datum datum = 1;
      // If there are "rich" annotations, specify the type of annotation.
      // Currently it only supports bounding box.
      // If there are no "rich" annotations, use label in datum instead.
      optional AnnotationType type = 2;
      // Each group contains annotation for a particular class.
      repeated AnnotationGroup annotation_group = 3;
    }

When I read an lmdb in the original Caffe way, I just need to get each field from the Datum:

    message Datum {
      optional int32 channels = 1;
      optional int32 height = 2;
      optional int32 width = 3;
      // the actual image data, in bytes
      optional bytes data = 4;
      optional int32 label = 5;
      // Optionally, the datum could also hold float data.
      repeated float float_data = 6;
      // If true data contains an encoded image that need to be decoded
      optional bool encoded = 7 [default = false];
    }


That way I can get the image data, label, image width and height, etc. But in SSD, how can I get this information from an AnnotatedDatum? (Such as the image data, labels, and bboxes; I can see that these properties are nested in different structures. Should I access them like a JSON array?)

2. After using the code above to turn my txt annotations into AnnotatedDatum, how should I save it into an lmdb database?
Any advice would be greatly appreciated!
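On the nesting question: an AnnotatedDatum is just nested protobuf messages, so its repeated fields are accessed like Python lists of objects, not parsed like a JSON array. Below is a pure-Python sketch using plain classes that mimic the proto messages above (these are stand-ins for illustration, not the real caffe_pb2 bindings; with the real bindings you would call `ParseFromString` on each lmdb value and then use the same field access):

```python
from dataclasses import dataclass, field
from typing import List

# Plain-Python stand-ins for the messages in src/proto/caffe.proto.
@dataclass
class NormalizedBBox:
    # Coordinates are normalized to [0, 1] by the image width/height.
    xmin: float
    ymin: float
    xmax: float
    ymax: float

@dataclass
class Annotation:
    instance_id: int
    bbox: NormalizedBBox

@dataclass
class AnnotationGroup:
    group_label: int                     # the class label for this group
    annotation: List[Annotation] = field(default_factory=list)

@dataclass
class AnnotatedDatum:
    # The real message also holds a Datum with the encoded image bytes.
    annotation_group: List[AnnotationGroup] = field(default_factory=list)

def extract_boxes(anno_datum):
    """Flatten (label, xmin, ymin, xmax, ymax) tuples out of the nesting."""
    boxes = []
    for group in anno_datum.annotation_group:  # one group per class label
        for anno in group.annotation:          # one entry per object instance
            b = anno.bbox
            boxes.append((group.group_label, b.xmin, b.ymin, b.xmax, b.ymax))
    return boxes

ad = AnnotatedDatum(annotation_group=[
    AnnotationGroup(group_label=15, annotation=[
        Annotation(instance_id=0, bbox=NormalizedBBox(0.1, 0.2, 0.5, 0.8)),
    ]),
])
print(extract_boxes(ad))  # -> [(15, 0.1, 0.2, 0.5, 0.8)]
```

The image itself lives in the nested `datum` field (encoded bytes plus channels/height/width), just as in plain Caffe; only the annotations moved into the group/instance hierarchy.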