huangyangyu / SeqFace

SeqFace : Making full use of sequence information for face recognition
https://arxiv.org/pdf/1803.06524.pdf
MIT License

confusion about NoiseTolerantFRLayer<Dtype>::Backward_cpu #14

Open 994374821 opened 5 years ago

994374821 commented 5 years ago

I have some confusion about the backward pass of NoiseTolerantFRLayer: when `skip_` is true, why is `bottom_diff` multiplied by zero? Why not just return, as in the `iter_ < start_iter_` case?

```cpp
template <typename Dtype>
void NoiseTolerantFRLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[0]) {
    const Dtype* label_data = bottom[2]->cpu_data();
    const Dtype* top_diff = top[0]->cpu_diff();
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    const Dtype* weight_data = weights_.cpu_data();

    int count = bottom[0]->count();
    int num = bottom[0]->num();
    int dim = count / num;

    if (top[0] != bottom[0]) caffe_copy(count, top_diff, bottom_diff);

    if (this->phase_ != TRAIN) return;

    if (iter_ < start_iter_) return;

    // backward
    for (int i = 0; i < num; i++)
    {
        int gt = static_cast<int>(label_data[i]);
        if (gt < 0) continue;
        for (int j = 0; j < dim; j++)
        {
            bottom_diff[i * dim + j] *= skip_ ? Dtype(0.0) : weight_data[i];
        }
    }
  }
}
```

huangyangyu commented 5 years ago

@994374821 Thank you for your good question. `skip_ == true` means some unexpected situation has occurred, which rarely happens. If it does happen, we choose to discard the diff of the current batch instead of propagating it backward. Because the training dataset is noisy, we don't want to learn from the noisy data directly.