Closed: raulsoutelo closed this issue 6 years ago
Sorry, I have just realized that you have no labels in an active learning setting. Would you expect this approach to work better in a data summarization scenario?
@raulsoutelo Yes, there are no labels, and our theory directly addresses that. There is a technical lemma in the appendix which enables this by considering the distance between distributions.
About data summarization: the problem is called core-set, and there are a bunch of papers showing that these ideas do work for data summarization. We have some references in the related work section of our paper.
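For readers unfamiliar with core-set selection, here is a minimal sketch of greedy k-center selection (farthest-first traversal), a standard construction used in the core-set literature; the function name and the toy data are illustrative, not from the paper:

```python
import numpy as np

def greedy_k_center(points, k, seed=0):
    """Farthest-first traversal: a classic 2-approximation for k-center,
    commonly used to build coresets. `points` is an (n, d) array;
    returns the indices of the k chosen centers and the covering radius."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    centers = [int(rng.integers(n))]              # start from a random point
    # distance from every point to its nearest chosen center so far
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    while len(centers) < k:
        nxt = int(np.argmax(dist))                # farthest point joins the coreset
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers, float(dist.max())             # dist.max() is the radius delta

# usage: cover 200 random 2-D points with a 10-point coreset
pts = np.random.default_rng(1).standard_normal((200, 2))
idx, delta = greedy_k_center(pts, 10)
print(len(idx), delta)
```

After selection, every point lies within `delta` of some chosen center, which is exactly the covering property the bound below relies on.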
Hello,
If I understood correctly, a point B is assigned to a coreset point A if their distance is at most a fixed δ. Since the neural network is Lipschitz continuous, its output cannot change much within a ball of radius δ. Therefore, if the error at point A (contained in the coreset) is assumed to be zero, the error at B will be small (bounded by a term proportional to δ).
However, if points A and B are closer than δ but have different targets, the error could be arbitrarily large, right? Would it be sensible to cover only points that share the same target?
Thanks in advance!
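The concern above can be made concrete with a toy one-dimensional example (a sketch with illustrative numbers, not from the paper). By the triangle inequality, if the model g is L-Lipschitz and has zero error at the coreset point a, then the error at a covered point b satisfies |g(b) − y_b| ≤ L·δ + |y_a − y_b|; the label gap |y_a − y_b| is exactly the uncontrolled term when nearby points have different targets:

```python
import numpy as np

L = 2.0
g = lambda x: L * np.sin(x)            # an L-Lipschitz model (|g'| <= L)

a, b = 0.5, 0.6                        # coreset point a covers b: |b - a| <= delta
delta = abs(b - a)
y_a = g(a)                             # assume zero error at the coreset point

# Triangle inequality: |g(b) - y_b| <= |g(b) - g(a)| + |y_a - y_b|
#                                   <=  L * delta    + |y_a - y_b|
for y_b in (y_a + 0.05, y_a + 5.0):    # similar target vs. very different target
    err = abs(g(b) - y_b)
    bound = L * delta + abs(y_a - y_b)
    print(f"err={err:.3f}  bound={bound:.3f}  holds={err <= bound + 1e-9}")
```

In the first case the error is controlled by L·δ plus a small label gap; in the second case the bound still holds but is dominated by the label gap of 5.0, so covering distance alone no longer guarantees a small error.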