TimoSaemann / caffe-segnet-cudnn5

This repository was a fork of BVLC/caffe and includes the upsample, bn, dense_image_data and softmax_with_loss (with class weighting) layers of caffe-segnet (https://github.com/alexgkendall/caffe-segnet) to run SegNet with cuDNN version 5.

weighted softmax loss doesn't work in this version #7

Closed zhiqiangdon closed 7 years ago

zhiqiangdon commented 7 years ago

Hi,

I noticed that the softmax_loss layer in this fork does not implement per-class weights, while the older version does. Could you please add it? Thanks!

TimoSaemann commented 7 years ago

Hi, it already contains class_weighting.

zhiqiangdon commented 7 years ago

Hi,

Here is part of the code from the current source file "https://github.com/TimoSaemann/caffe-segnet-cudnn5/blob/master/src/caffe/layers/softmax_loss_layer.cu":

```cpp
template <typename Dtype>
__global__ void SoftmaxLossForwardGPU(const int nthreads,
          const Dtype* prob_data, const Dtype* label, Dtype* loss,
          const int num, const int dim, const int spatial_dim,
          const bool has_ignore_label_, const int ignore_label_,
          Dtype* counts) {
  CUDA_KERNEL_LOOP(index, nthreads) {
    const int n = index / spatial_dim;
    const int s = index % spatial_dim;
    const int label_value = static_cast<int>(label[n * spatial_dim + s]);
    if (has_ignore_label_ && label_value == ignore_label_) {
      loss[index] = 0;
      counts[index] = 0;
    } else {
      loss[index] = -log(max(prob_data[n * dim + label_value * spatial_dim + s],
                             Dtype(FLT_MIN)));
      counts[index] = 1;
    }
  }
}
```

Here is the corresponding code from the former SegNet source file "https://github.com/alexgkendall/caffe-segnet/blob/segnet-cleaned/src/caffe/layers/softmax_loss_layer.cu":

```cpp
template <typename Dtype>
__global__ void SoftmaxLossForwardGPU(const int nthreads,
          const Dtype* prob_data, const Dtype* label,
          const bool weight_by_label_freqs, const float* label_counts,
          Dtype* loss, const int num, const int dim, const int spatial_dim,
          const bool has_ignore_label_, const int ignore_label_,
          Dtype* counts) {
  CUDA_KERNEL_LOOP(index, nthreads) {
    const int n = index / spatial_dim;
    const int s = index % spatial_dim;
    const int label_value = static_cast<int>(label[n * spatial_dim + s]);
    if (has_ignore_label_ && label_value == ignore_label_) {
      loss[index] = 0;
      counts[index] = 0;
    } else {
      loss[index] = -log(max(prob_data[n * dim + label_value * spatial_dim + s],
                             Dtype(FLT_MIN)));
      if (weight_by_label_freqs) {
        loss[index] *= static_cast<Dtype>(label_counts[label_value]);
      }
      counts[index] = 1;
    }
  }
}
```

You can see that the weighting is missing from the current source code. I am not sure whether my observation is correct. Thanks!
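The difference between the two kernels boils down to one optional multiply per pixel. A minimal CPU sketch of that per-pixel forward step (a hypothetical helper, not code from either repository) makes the missing branch explicit:

```cpp
#include <algorithm>
#include <cfloat>
#include <cmath>
#include <vector>

// Hypothetical CPU sketch of the per-pixel forward pass: each pixel
// contributes -log(p[label]), optionally scaled by a per-class weight
// when weight_by_label_freqs is enabled. The `if` block is exactly the
// part absent from the cuDNN5 port's kernel.
float pixel_loss(const std::vector<float>& prob,  // class probabilities for one pixel
                 int label,
                 bool weight_by_label_freqs,
                 const std::vector<float>& label_counts) {
  float loss = -std::log(std::max(prob[label], FLT_MIN));
  if (weight_by_label_freqs) {
    loss *= label_counts[label];  // per-class weight, indexed by the label
  }
  return loss;
}
```

With the branch removed, every class contributes equally per pixel, so frequent classes dominate the gradient.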

galchinsky commented 7 years ago

Thanks @zhiqiangdon, I thought I had gone crazy, because the same model was converging on my local machine but stopped working on AWS. When I rolled back to the cuDNN 2 version, everything was fine.

zhiqiangdon commented 7 years ago

@galchinsky, I encountered the same problem. The same model doesn't converge with the new version of the code, while it converges with the former version. When running the new code, I get nearly 100% accuracy on the class with the most pixels, while the other classes have almost zero accuracy. That's why I suspect the weights don't work in the softmax loss layer. I tried to modify "softmax_loss_layer.cu" by adding the weights myself, but the model still doesn't converge. I am new to Caffe, so maybe my modification is incomplete. @TimoSaemann could you give some help or suggestions?

TimoSaemann commented 7 years ago

I have looked at it again, and I am surprised that BVLC/Caffe (as of Dec. 2016) still doesn't support class weighting for the softmax loss layer. I will add that feature to caffe-segnet-cudnn5 soon and will let you know when it's done (hopefully within this week).

JackieLeeTHU11 commented 7 years ago

@TimoSaemann Hi, I am looking forward to the version that contains class weighting. Thank you for the meaningful work!

TimoSaemann commented 7 years ago

I have now added class_weighting to the softmax_with_loss layer.
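For anyone arriving here later: a SoftmaxWithLoss layer definition using this feature might look like the following sketch. The blob names and weight values are illustrative only; real values would come from your own dataset's class frequencies, with one class_weighting entry per class in class-index order.

```protobuf
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "decoder_output"   # hypothetical blob name from the decoder
  bottom: "label"
  top: "loss"
  softmax_param { engine: CAFFE }
  loss_param {
    weight_by_label_freqs: true
    class_weighting: 2.0     # illustrative weights, one per class
    class_weighting: 1.0
    class_weighting: 0.5
  }
}
```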

zhiqiangdon commented 7 years ago

Thanks! It works! @TimoSaemann

JackieLeeTHU11 commented 7 years ago

@TimoSaemann @zhiqiangdon Hi, can this Caffe code be used for FCN? I mean, if I build this Caffe, can I train FCN with it, since I want to use the SoftmaxWithLoss layer to add class weights for the class imbalance in FCN-8s?

TimoSaemann commented 7 years ago

@JackieLeeTHU11 Yes, I think so.