Closed: VeritasYin closed this issue 2 years ago
Hi! Interesting. The issue does not happen on my end.
Python 3.8.3 (default, Jul 2 2020, 11:26:31)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from ogb.linkproppred import Evaluator
>>> import numpy as np
>>> import torch
>>> dataset = 'ogbl-citation2'
>>> evaluator = Evaluator(name=dataset)
>>> yp = np.random.randint(2, size=(100,))
>>> yn = np.random.randint(2, size=(100, 1000))
>>> print(evaluator.eval({"y_pred_pos": yp, "y_pred_neg": yn})['mrr_list'].mean())
0.48075244
>>> print(evaluator.eval({"y_pred_pos": torch.from_numpy(yp), "y_pred_neg": torch.from_numpy(yn)})['mrr_list'].mean())
tensor(0.4808)
Hello, I tested again on two servers using the same code; the results still seem off.
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ogb.linkproppred import Evaluator
Using backend: pytorch
RDFLib Version: 5.0.0
>>> import numpy as np
>>> import torch
>>> dataset = 'ogbl-citation2'
>>> evaluator = Evaluator(name=dataset)
>>> yp = np.random.randint(2, size=(100,))
>>> yn = np.random.randint(2, size=(100, 1000))
>>> print(evaluator.eval({"y_pred_pos": yp, "y_pred_neg": yn})['mrr_list'].mean())
0.5306459
>>> print(evaluator.eval({"y_pred_pos": torch.from_numpy(yp), "y_pred_neg": torch.from_numpy(yn)})['mrr_list'].mean())
tensor(0.0051)
>>> torch.__version__
'1.8.0'
>>> np.__version__
'1.21.2'
Python 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ogb.linkproppred import Evaluator
WARNING:root:The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.3.
>>> import numpy as np
>>> import torch
>>> dataset = 'ogbl-citation2'
>>> evaluator = Evaluator(name=dataset)
>>> yp = np.random.randint(2, size=(100,))
>>> yn = np.random.randint(2, size=(100, 1000))
>>> print(evaluator.eval({"y_pred_pos": yp, "y_pred_neg": yn})['mrr_list'].mean())
0.4807206
>>> print(evaluator.eval({"y_pred_pos": torch.from_numpy(yp), "y_pred_neg": torch.from_numpy(yn)})['mrr_list'].mean())
tensor(0.0038)
>>> torch.__version__
'1.8.2'
>>> np.__version__
'1.21.2'
I checked my torch version. Can you try updating it?
Python 3.8.8 (default, Apr 13 2021, 19:58:26)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import torch
>>> torch.__version__
'1.10.1+cu102'
>>> np.__version__
'1.19.5'
I updated the torch package, but nothing changed.
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from ogb.linkproppred import Evaluator
Using backend: pytorch
RDFLib Version: 5.0.0
>>> import numpy as np
>>> import torch
>>> dataset = 'ogbl-citation2'
>>> evaluator = Evaluator(name=dataset)
>>> yp = np.random.randint(2, size=(100,))
>>> yn = np.random.randint(2, size=(100, 1000))
>>> print(evaluator.eval({"y_pred_pos": yp, "y_pred_neg": yn})['mrr_list'].mean())
0.46074632
>>> print(evaluator.eval({"y_pred_pos": torch.from_numpy(yp), "y_pred_neg": torch.from_numpy(yn)})['mrr_list'].mean())
tensor(0.0038)
>>> torch.__version__
'1.10.1'
>>> np.__version__
'1.21.2'
I cannot reproduce this either with torch==1.10.2 and numpy==1.21.2. Any chance you can debug the code on your own to find out where the change in behavior occurs?
Will close this for now. Let us know if the problem still persists.
@weihua916 @rusty1s
Hello, I found the difference between passing a torch tensor and a numpy array.
For the MRR metric, ogb/linkproppred/evaluate.py(252) _eval_mrr() calls a different argsort depending on whether the input is a numpy array or a torch tensor. However, the behavior of torch.argsort and numpy.argsort is not consistent when the inputs are integers, as shown below (a minimal reproduction sketch follows the dumps):
y_pred (y_pred_pos, y_pred_neg)
array([[0, 0, 1, ..., 0, 1, 1],
[1, 0, 0, ..., 0, 1, 1],
[1, 1, 1, ..., 0, 0, 1],
...,
[1, 1, 0, ..., 0, 0, 1],
[0, 0, 1, ..., 0, 1, 0],
[1, 0, 1, ..., 1, 0, 0]])
argsort_numpy
array([[ 500, 546, 547, ..., 594, 206, 0],
[ 0, 531, 532, ..., 643, 743, 330],
[ 0, 569, 570, ..., 596, 217, 500],
...,
[ 0, 549, 550, ..., 663, 433, 348],
[ 330, 345, 344, ..., 448, 422, 1000],
[ 0, 394, 803, ..., 480, 462, 1000]])
argsort_torch
tensor([[562, 545, 546, ..., 594, 205, 596],
[558, 529, 531, ..., 590, 591, 593],
[580, 564, 565, ..., 436, 651, 650],
...,
[561, 542, 543, ..., 664, 663, 662],
[409, 792, 791, ..., 449, 451, 452],
[388, 812, 380, ..., 477, 480, 481]])
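Here is a minimal sketch of the cause, assuming the evaluator delegates to np.argsort for arrays and torch.argsort for tensors: binary integer scores produce many ties, and the two libraries break ties in different, implementation-defined orders, so the rank of the positive candidate (and therefore the MRR) can diverge.

import numpy as np
import torch

# Binary integer scores contain many ties; the order of equal values after
# argsort is implementation-defined, so numpy and torch can disagree.
scores = np.random.randint(2, size=(1000,))
order_np = np.argsort(-scores)                        # numpy's tie-breaking
order_pt = torch.argsort(-torch.from_numpy(scores))   # torch's tie-breaking
print((order_np == order_pt.numpy()).all())           # typically False

# With (almost surely) distinct float scores, the two orderings coincide.
scores_f = np.random.rand(1000)
print((np.argsort(-scores_f) ==
       torch.argsort(-torch.from_numpy(scores_f)).numpy()).all())  # True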
I see, interesting. We generally do not advise assigning the same score to different candidate entities...
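In that spirit, a minimal sketch of a workaround, assuming the scores can be real-valued (e.g., raw model logits or probabilities) rather than binary integers; with essentially no ties, the numpy and torch code paths rank candidates identically:

import numpy as np
import torch
from ogb.linkproppred import Evaluator

evaluator = Evaluator(name='ogbl-citation2')

# Real-valued scores: ties are (almost surely) absent, so both backends agree.
yp = np.random.rand(100).astype(np.float32)
yn = np.random.rand(100, 1000).astype(np.float32)

print(evaluator.eval({"y_pred_pos": yp, "y_pred_neg": yn})['mrr_list'].mean())
print(evaluator.eval({"y_pred_pos": torch.from_numpy(yp),
                      "y_pred_neg": torch.from_numpy(yn)})['mrr_list'].mean())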
Hello,
I was using the OGB evaluator on the Citation2 dataset (ogbl-citation2) under the MRR metric. However, I found that the evaluator gives inconsistent results for numpy arrays versus torch tensors when the data type is integer instead of float. I tested the following code on versions 1.3.0 and 1.3.3; both show the same behavior as below.
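(The snippet below is a reconstruction of the one referenced here; it mirrors the code used throughout this thread, and the exact printed numbers depend on the random draw.)

import numpy as np
import torch
from ogb.linkproppred import Evaluator

evaluator = Evaluator(name='ogbl-citation2')

# Integer (0/1) scores: this is where numpy and torch inputs diverge.
yp = np.random.randint(2, size=(100,))
yn = np.random.randint(2, size=(100, 1000))

print(evaluator.eval({"y_pred_pos": yp, "y_pred_neg": yn})['mrr_list'].mean())
print(evaluator.eval({"y_pred_pos": torch.from_numpy(yp),
                      "y_pred_neg": torch.from_numpy(yn)})['mrr_list'].mean())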
The corresponding outputs are 0.52068675 and tensor(0.0049), respectively.