Questions regarding Double Argsorting. #22

LeoXinhaoLee commented 2 years ago

Hi, thank you for the awesome code release! However, I'm a little confused about the usage of argsort() in the EpisodicDataset class.

When sampling a task in __getitem__(self, idx), you perform the following operations:

    ordered_argindices = np.argsort(indices)
    ordered_indices = np.sort(indices)
    _images = self.sample_images(ordered_indices)
    images = torch.stack([self.transforms(_images[i]) for i in np.argsort(ordered_argindices)])
    targets = np.zeros([nclasses * k], dtype=int)
    targets[ordered_argindices] = self.labels[ordered_indices, ...].ravel()

It seems to me that you essentially use indices[indices.argsort()][indices.argsort().argsort()] for indexing images and labels. However, I think this sequence is identical to indices, which is already in the correct order for one task. So I'm wondering what the reason for this operation is.
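A quick way to check the claim above is the sketch below. It assumes NumPy and a made-up indices array (the values and the names order, ranks, and roundtrip are only for illustration, not from the repository):

```python
import numpy as np

# Hypothetical sampled indices for one task (values are made up).
indices = np.array([42, 7, 19, 3, 88, 51])

order = np.argsort(indices)      # positions that would sort `indices`
ranks = np.argsort(order)        # inverse permutation: rank of each original element
sorted_indices = indices[order]  # same values as np.sort(indices)

# Indexing the sorted array by the ranks restores the original order.
roundtrip = sorted_indices[ranks]
print(np.array_equal(roundtrip, indices))
```

If the arrays compare equal, the sort followed by the double argsort is a round trip, so the final order is the same as that of indices.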

Thank you very much for your time and help!

prlz77 commented 2 years ago

Hi! Yes, I sort, read, and then unsort. This dates from a time when I was reading from HDF5, which only accepted reads in sorted order. Right now it has virtually no effect, and it might slow down the dataloader slightly. Would you mind fixing it and opening a pull request? Otherwise I'll do it in the coming weeks.
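For reference, the sort, read, unsort pattern can be sketched roughly as follows. This is not the repository's code: read_in_order and store are hypothetical stand-ins for a backend that rejects out-of-order indices, and the data is made up.

```python
import numpy as np

def read_in_order(store, sorted_idx):
    """Hypothetical reader that refuses indices that are not in increasing order."""
    if np.any(np.diff(sorted_idx) < 0):
        raise ValueError("indices must be in increasing order")
    return store[sorted_idx]

store = np.arange(100) * 10                 # stand-in data: store[i] == 10 * i
indices = np.array([42, 7, 19, 3, 88, 51])  # made-up, unsorted sample

order = np.argsort(indices)                 # sort
sorted_read = read_in_order(store, indices[order])  # read in sorted order

# Unsort, variant 1: gather with the inverse permutation (double argsort).
gathered = sorted_read[np.argsort(order)]

# Unsort, variant 2: scatter back to the original positions.
scattered = np.empty_like(sorted_read)
scattered[order] = sorted_read

# With a backend that accepts arbitrary order, a direct read gives the same result.
direct = store[indices]
```

Both unsort variants match the direct read, which is why the extra sorting only matters when the storage layer enforces ordered access.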

Thanks for noticing by the way :)

LeoXinhaoLee commented 2 years ago

Thanks for dispelling my doubt! Sure, I will do it.

ServiceNow / embedding-propagation
