teddykoker / torchsort

Fast, differentiable sorting and ranking in PyTorch
https://pypi.org/project/torchsort/
Apache License 2.0

ranking returns only 1s for all samples #64

Closed: simpsus closed this 1 year ago

simpsus commented 1 year ago

Hello,

I am having the above-mentioned problem, but while debugging I noticed that I do not really understand why the input tensor has to be 2D. My target (in the pandas world) is 1D, as are the predictions. My current PyTorch implementation uses [len(X), 1] as the shape for both.

When I run the code:

torchsort.soft_rank(
    torch.rand(len(pred), 1, device=device, dtype=pred.dtype),
    regularization="l2",
    regularization_strength=1,
)

the resulting tensor is

tensor([[1.],
        [1.],
        [1.],
        ...,
        [1.],
        [1.],
        [1.]], device='cuda:0')

which is logical in the sense that each row holds a single sample, so its rank is 1 every time. I want to rank the column, though. Of course, calling torchsort on tensor.squeeze() violates the 2D requirement, and when I run a toy example on a "true" 2D tensor

torchsort.soft_rank(
    torch.rand(len(pred), 2, device=device, dtype=pred.dtype),
    regularization="l2",
    regularization_strength=1,
)

I would expect rows like [1, 2] and [2, 1], which I do not get; instead it is

tensor([[1.3748, 1.6252],
        [1.7719, 1.2281],
        [1.7958, 1.2042],
        ...,
        [1.1949, 1.8051],
        [1.5149, 1.4851],
        [1.2594, 1.7406]], device='cuda:0')

which is probably because these are the gradients and not the real values.

So I guess my real question is: how do I soft_rank a 1D tensor without getting all 1s back? That output makes my loss NaN.

teddykoker commented 1 year ago

Hi @simpsus, the reasoning behind using 2D tensors is that operations can then be performed over batches, which is common in deep learning. The batch dimension should be the first dimension, so in your case:

torchsort.soft_rank(torch.rand(1, len(pred)))

If you have a 1D tensor pred, you can add the batch dimension with .unsqueeze(0) and then remove it with .squeeze(0), e.g.:

preds = torch.randn(10)
torchsort.soft_rank(preds.unsqueeze(0)).squeeze(0)
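
If you do this in many places, a small convenience wrapper keeps the call sites tidy (a minimal sketch; soft_rank_1d is a hypothetical helper, not part of the torchsort API):

import torch
import torchsort

def soft_rank_1d(values, **kwargs):
    # Hypothetical helper: treat the 1D tensor as a batch of one row,
    # rank it, then drop the batch dimension again.
    return torchsort.soft_rank(values.unsqueeze(0), **kwargs).squeeze(0)

preds = torch.randn(10)
ranks = soft_rank_1d(preds, regularization="l2", regularization_strength=0.1)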

It is important to note that soft_rank will not always return discrete values (e.g. [1, 2]) because the regularization "softens" the boundaries between discrete rankings. Decreasing regularization_strength will give you values closer to the true rankings, but I would recommend tuning this value for your use case.
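
For example (a quick sketch to illustrate the effect; the exact numbers will vary):

import torch
import torchsort

x = torch.tensor([[0.3, 0.1, 0.2]])
for strength in [1.0, 0.1, 0.01]:
    # Smaller regularization_strength pushes the output toward the hard ranks [3, 1, 2].
    print(strength, torchsort.soft_rank(x, regularization_strength=strength))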

simpsus commented 1 year ago

Thank you very much!