zjchuyp closed this issue 8 years ago
Line 63 of `get_training_examples_multilabel.m` has:

```matlab
% add self pairs
pos_pairs(insert_idx : insert_idx + this_class_num_images-1, :) = ...
    repmat(image_ids', 1, 2);
pos_class(insert_idx : insert_idx + this_class_num_images-1, :) = class_id;
```
This means xi and xj can be the same sample, so `dist_pos` in `lifted_struct_similarity_softmax_layer.cpp` will be zero!
Gradient: please read the paper again; the implementation is correct.
Batch preparation: even though I write the same image twice as a positive pair, each copy undergoes the random crop operation in Caffe. Thus, their distance is non-zero.
I read the source code of `lifted_struct_similarity_softmax_layer.cpp`; line 146 has:

```cpp
scaler = Dtype(2.0) * this_loss / dist_pos;
// update x_i
caffe_axpy(K, scaler * Dtype(1.0), blob_pos_diff.cpu_data(), bout + i*K);
```
I read the paper; I derived the gradient as dJ/df(x_i) = (1/|P|) * J_{i,j} * 2(f(x_i) - f(x_j)), so why does the code divide by D_{i,j}? Thanks a lot!
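For what it's worth, one way to reconcile the code with the paper is the chain rule: the paper's J_{i,j} contains the *non-squared* distance D_{i,j}, and differentiating that distance itself is what produces the 1/D_{i,j} factor. A sketch, using the paper's notation (P the set of positive pairs, J̃_{i,j} the hinged per-pair term):

```latex
% Loss over positive pairs (squared hinge):
J = \frac{1}{2|P|} \sum_{(i,j) \in P} \max\bigl(0, \tilde{J}_{i,j}\bigr)^2
\qquad\Rightarrow\qquad
\frac{\partial J}{\partial f(x_i)}
  = \frac{1}{|P|}\, \tilde{J}_{i,j}\,
    \frac{\partial \tilde{J}_{i,j}}{\partial f(x_i)}

% The distance term inside \tilde{J}_{i,j} is not squared, and
\frac{\partial D_{i,j}}{\partial f(x_i)}
  = \frac{f(x_i) - f(x_j)}{D_{i,j}},
\qquad
D_{i,j} = \lVert f(x_i) - f(x_j) \rVert_2
```

Under this reading, the square in the hinge contributes the `2.0 * this_loss` factor and the derivative of D_{i,j} contributes `(f(x_i) - f(x_j)) / dist_pos`, which is exactly the `scaler * pos_diff` update in the code; the factor 2(f(x_i) - f(x_j)) with no division would arise only if the loss used the squared distance D_{i,j}^2.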