pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

`unique()` is not intuitively understandable at all. #130217

Open hyperkai opened 2 months ago

hyperkai commented 2 months ago

📚 The doc issue

The documentation for unique() says that return_inverse=True returns the indices for where elements in the original input ended up in the returned unique list, as shown below:

  • input (Tensor) – the input tensor

  • sorted (bool) – Whether to sort the unique elements in ascending order before returning as output.

  • return_inverse (bool) – Whether to also return the indices for where elements in the original input ended up in the returned unique list.

  • return_counts (bool) – Whether to also return the counts for each unique element.

  • dim (int, optional) – the dimension to operate upon. If None, the unique of the flattened input is returned. Otherwise, each of the tensors indexed by the given dimension is treated as one of the elements to apply the unique operation upon. See examples for more details. Default: None

But the indices do not appear to be returned, whereas return_counts=True properly returns the counts for each unique element, as shown below:

import torch

my_tensor = torch.tensor([[[2, 2, 0], [0, 1, 1]],
                          [[1, 3, 0], [0, 0, 2]]])
torch.unique(input=my_tensor, return_inverse=True, return_counts=True)
# (tensor([0, 1, 2, 3]),
#  tensor([[[2, 2, 0],    ←
#           [0, 1, 1]],   ←
#          [[1, 3, 0],    ←
#           [0, 0, 2]]]), ←
#  tensor([5, 3, 3, 1]))

Suggest a potential alternative/fix

torch.__version__ # 2.3.0+cu121
albanD commented 2 months ago

What do you mean by "the indices are not returned"? The second Tensor looks like indices into the returned 1D Tensor.
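
One way to check this reading is sketched below, reusing the tensor from the report. Because the unique values there happen to be [0, 1, 2, 3], each value equals its own position in the unique list, so the inverse indices look identical to the input. The `shifted` tensor (values multiplied by 10 and offset by 5) is an assumption introduced only for illustration, so that values and indices no longer coincide.

```python
import torch

my_tensor = torch.tensor([[[2, 2, 0], [0, 1, 1]],
                          [[1, 3, 0], [0, 0, 2]]])

values, inverse, counts = torch.unique(
    my_tensor, return_inverse=True, return_counts=True)

# Indexing the unique values with the inverse indices rebuilds the input.
assert torch.equal(values[inverse], my_tensor)

# Hypothetical variant: same layout, but the values are 5, 15, 25, 35,
# so a value can no longer be mistaken for its index.
shifted = my_tensor * 10 + 5
values2, inverse2 = torch.unique(shifted, return_inverse=True)

# The inverse indices are unchanged, and they still rebuild the input.
assert torch.equal(inverse2, inverse)
assert torch.equal(values2[inverse2], shifted)
```

If these assertions hold, the second returned tensor is an index tensor with the same shape as the input, and its resemblance to the input in the original example is a coincidence of the chosen values.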