balancap / SSD-Tensorflow

Single Shot MultiBox Detector in TensorFlow

I modified the loss function to get good results. #366

Open hccho2 opened 4 years ago

hccho2 commented 4 years ago

Face Detection Data Set and Benchmark (FDDB): http://vis-www.cs.umass.edu/fddb/

Figure_1

  1. Remove the 'crop' part of 'preprocess_for_train()' in ssd_vgg_preprocessing.py. During cropping, all ground-truth boxes can be removed, which causes problems in the loss calculation:
         dst_image, labels, bboxes, distort_bbox = \
             distorted_bounding_box_crop(image, labels, bboxes,
                                         min_object_covered=MIN_OBJECT_COVERED,
                                         aspect_ratio_range=CROP_RATIO_RANGE)

or set MIN_OBJECT_COVERED from 0.25 to 1.0 (both options are sketched below).
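A minimal sketch of the two options, assuming the surrounding code of 'preprocess_for_train()' in ssd_vgg_preprocessing.py; everything except the commented-out original call and the constant change is illustrative, not part of the issue:

    # Option A: skip the random crop entirely, so no ground-truth box is dropped.
    # dst_image, labels, bboxes, distort_bbox = \
    #     distorted_bounding_box_crop(image, labels, bboxes,
    #                                 min_object_covered=MIN_OBJECT_COVERED,
    #                                 aspect_ratio_range=CROP_RATIO_RANGE)
    dst_image = image   # use the full image; labels and bboxes stay as they are

    # Option B: keep the crop, but require every object to be fully covered.
    MIN_OBJECT_COVERED = 1.0   # was 0.25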

  2. Modify the loss function:
def ssd_losses(logits, localisations,
               gclasses, glocalisations, gscores,
               match_threshold=0.5,
               negative_ratio=3.,
               alpha=1.,
               label_smoothing=0.,
               device='/cpu:0',
               scope=None):
    with tf.name_scope(scope, 'ssd_losses'):
        lshape = tfe.get_shape(logits[0], 5)
        num_classes = lshape[-1]
        batch_size = lshape[0]

        # Flatten out all vectors!
        flogits = []
        fgclasses = []
        fgscores = []
        flocalisations = []
        fglocalisations = []
        for i in range(len(logits)):
            flogits.append(tf.reshape(logits[i], [-1, num_classes]))
            fgclasses.append(tf.reshape(gclasses[i], [-1]))
            fgscores.append(tf.reshape(gscores[i], [-1]))
            flocalisations.append(tf.reshape(localisations[i], [-1, 4]))
            fglocalisations.append(tf.reshape(glocalisations[i], [-1, 4]))
        # And concat the crap!
        logits = tf.concat(flogits, axis=0)
        gclasses = tf.concat(fgclasses, axis=0)
        gscores = tf.concat(fgscores, axis=0)
        localisations = tf.concat(flocalisations, axis=0)
        glocalisations = tf.concat(fglocalisations, axis=0)
        dtype = logits.dtype

        # Compute positive matching mask...
        pmask = gscores > match_threshold
        fpmask = tf.cast(pmask, dtype)
        n_positives = tf.reduce_sum(fpmask)

        # Hard negative mining...
        no_classes = tf.cast(pmask, tf.int32)   # positives become 1 and everything else 0 ---> what we want, since the background label is 0
        predictions = slim.softmax(logits)
        nmask = tf.logical_and(tf.logical_not(pmask), gscores > -0.5)  # anchors that are not positive
        fnmask = tf.cast(nmask, dtype)

        # For negatives take the predicted background probability; everything else
        # is set to 1.0 so it can never be selected as a hard negative.
        nvalues = tf.where(nmask, predictions[:, 0], 1. - fnmask)

        nvalues_flat = tf.reshape(nvalues, [-1])
        # Number of negative entries to select.
        max_neg_entries = tf.cast(tf.reduce_sum(fnmask), tf.int32)
        n_neg = tf.cast(negative_ratio * n_positives, tf.int32) + 1
        n_neg = tf.minimum(n_neg, max_neg_entries)

        # Select the n_neg hardest negatives, i.e. those with the lowest predicted
        # background probability.
        val, idxes = tf.nn.top_k(-nvalues_flat, k=n_neg)
        max_hard_pred = -val[-1]
        # Final negative mask.
        nmask = tf.logical_and(nmask, nvalues <= max_hard_pred)
        fnmask = tf.cast(nmask, dtype)

        # Add cross-entropy loss.

        batch_size = tf.cast(batch_size, dtype)
        n_negative = tf.reduce_sum(fnmask)   # number of selected hard negatives
        n_positives = tf.maximum(n_positives, 1)
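        # Note: each of the three losses below is averaged over the number of
        # positive anchors (clipped to at least 1), not over the batch size.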

        with tf.name_scope('cross_entropy_pos'):
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=gclasses)
            loss = tf.div(tf.reduce_sum(loss * fpmask), n_positives, name='value')
            tf.losses.add_loss(loss)

        with tf.name_scope('cross_entropy_neg'):
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=no_classes)
            loss = tf.div(tf.reduce_sum(loss * fnmask), n_positives, name='value')
            tf.losses.add_loss(loss)

        # Add localization loss: smooth L1, L2, ...
        with tf.name_scope('localization'):
            # Weights Tensor: positive mask + random negative.
            weights = tf.expand_dims(alpha * fpmask, axis=-1)
            loss = custom_layers.abs_smooth(localisations - glocalisations)
            loss = tf.div(tf.reduce_sum(loss * weights), n_positives, name='value')
            tf.losses.add_loss(loss)

  3. Learning rate = 0.0001 (see the sketch below).
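A minimal sketch (not the issue's actual training script) of how the pieces above fit together: ssd_losses() registers each term with tf.losses.add_loss(), so the training script can collect them with tf.losses.get_total_loss(). The optimizer choice here is an assumption; only the learning-rate value (0.0001) comes from this issue.

    import tensorflow as tf

    slim = tf.contrib.slim

    # Collect the three losses registered above (plus any regularization losses).
    total_loss = tf.losses.get_total_loss()

    # Optimizer is illustrative; only the learning rate comes from this issue.
    optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
    train_op = slim.learning.create_train_op(total_loss, optimizer)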

VolleyballBird commented 4 years ago

What do I do to start training? Should I first run tf_convert_data.py, then run train_ssd_network.py with dataset_dir=/tfrecords? When I did this I got some files, but no ssd_300_vgg and no .ckpt; instead I got files like *.ckpt-num.index.

haobabuhaoba commented 4 years ago

I0313 15:10:55.695348 140095594268544 learning.py:507] global step 2290: loss = 24.0642 (4.784 sec/step)
INFO:tensorflow:Saving checkpoint to path ./train_model/model.ckpt
I0313 15:11:12.685011 140091937531648 supervisor.py:1117] Saving checkpoint to path ./train_model/model.ckpt
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.DataLossError'>, corrupted record at 12
[[{{node pascalvoc_2007_data_provider/parallel_read/ReaderReadV2}}]]
I0313 15:11:19.223580 140091971102464 coordinator.py:224] Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.DataLossError'>, corrupted record at 12
[[{{node pascalvoc_2007_data_provider/parallel_read/ReaderReadV2}}]]
INFO:tensorflow:Caught OutOfRangeError. Stopping Training.
2 root error(s) found.
(0) Out of range: FIFOQueue '_6_prefetch_queue/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[node fifo_queue_Dequeue (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Out of range: FIFOQueue '_6_prefetch_queue/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[node fifo_queue_Dequeue (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[fifo_queue_Dequeue/_321]]
0 successful operations. 0 derived errors ignored.