weight_ids covers the value in last epoch and shuffle problem

skynbe / pseudo-attributes

Official implementation of BPA (CVPR 2022)

12 stars 2 forks source link

weight_ids covers the value in last epoch and shuffle problem #4

Open duyuwen-duen opened 6 months ago

duyuwen-duen commented 6 months ago

def get_cluster_weights(self, ids): weights_ids = super().get_cluster_weights(ids) self.weights[ids] = weights_ids return weights_ids FixedCentroids denotes the function like this. However, AvgFixedCentroids will use self.weights[ids] to compute the true weight:weightsids = self.weights[ids] torch.exp(self.exp_stepweights_ids.data).cuda(),which is already be assigned to the same value of weights_ids.Is there some problem? By the way,the dataloader set shuffle=True,so the data will be shuffled every batch,which can not use the index in cluster to locate.Is that right?

skynbe commented 5 months ago

1) self.weights[ids] denote the previous weights and (weights_ids) indicate the current weights. We adopt exponential moving average (similar to moving average) for robust weight updates.

2) We use the unique ids for each sample so that the ids are not affected by data shuffling. (Refer to here)

duyuwen-duen commented 5 months ago

Thank you for your answer! But I am confused a little. 1.In the class FixedCentroids, within the function get_cluster_weights, there is a concern that the original self.weights[ids] may be overwritten by weights_ids when the assignment self.weight[ids] = weights_ids is performed. However, when utilizing this function in the class AvgFixedCentroids, it is observed that self.weights[ids] retains its equality to weights_ids, suggesting that the exponential moving average is useless. 2.I haven't found where the ids are fixed within the code. The index is dependent on the loading order of the DataLoader, as the get_sample_index function simply returns the index directly. I'm loooking forward to your reply！Thank you very much!

skynbe commented 5 months ago

In L250, self.weight[ids] = weights_ids is assigned after calculating exponential moving average in L240. This assignment is for the calculation at next iteration.
Data shuffling changes only the order of indices, while each index is mapped to the same example consistently.

duyuwen-duen commented 5 months ago

Oh,I know the second question!Thank you very much. However, in line 237 of the function get_cluster_weights(ids), self.weight[ids] is modified. In line 183, self.weight[ids] is assigned to weights_ids before the exponential moving average is calculated at line 240.

skynbe commented 5 months ago

Thanks for the correction and seems like there are some mistakes during the cleansing process. As you mentioned, weights_ids should not be assigned before calculation. We will fix it when possible.

duyuwen-duen commented 5 months ago

Thank you for your patient and informative response!