Closed: supfisher closed this issue 4 years ago
Hi @supfisher,
Thank you for your attention. For unsupervised network embedding, some previous methods assume that the number of true labels in the test data is given when computing the F1 score (see references [27, 29, 37] in our paper). This does not cause information leakage, since the information is not used for model training. By the way, the other two metrics, i.e., ROC-AUC and PR-AUC, don't rely on a threshold.
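For reference, here is a minimal sketch of what that evaluation looks like, using scikit-learn and made-up score/label arrays (this is not the repository's actual evaluation code):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, f1_score

# Hypothetical link-prediction scores and ground-truth labels for the test set.
labels = np.array([1, 0, 1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])

# ROC-AUC and PR-AUC are computed directly from the scores: no threshold is needed.
roc_auc = roc_auc_score(labels, scores)
pr_auc = average_precision_score(labels, scores)

# For F1, the number of true labels (k) in the test set is used only at
# evaluation time: the k highest-scoring pairs are predicted as positive.
k = int(labels.sum())
threshold = np.sort(scores)[-k]          # k-th largest score
preds = (scores >= threshold).astype(int)
f1 = f1_score(labels, preds)

print(roc_auc, pr_auc, f1)
```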
Hi @cenyk1230, so why do they need to know the number of true labels? If the test set consists entirely of true cases, the evaluation will always return a perfect score, which makes it meaningless.
Hi @v587su,
In my view, this strategy provides a relatively fair comparison between different methods, because every method reports the F1-score at the point where precision, recall, and F1-score are equal, based on its own predictions. If you set a fixed threshold (e.g., 0.5) for all methods, the distributions of their prediction scores may affect the results and lead to an unfair comparison. If you don't trust this strategy for the F1-score, you can rely only on the two AUC metrics.
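To illustrate the point with a toy sketch (made-up scores, not the repository's code): under the top-k rule the number of predicted positives equals the number of true positives, so the counts of false positives and false negatives are equal and precision, recall, and F1 coincide. A fixed 0.5 threshold, by contrast, penalizes a method whose scores are poorly calibrated even when its ranking is identical.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

labels = np.array([1, 1, 0, 1, 0, 0])
# Two hypothetical methods with the same ranking but different score scales.
scores_a = np.array([0.9, 0.8, 0.6, 0.7, 0.3, 0.1])   # scores spread around 0.5
scores_b = scores_a / 3.0                              # same ranking, all below 0.5

def top_k_preds(scores, k):
    """Predict the k highest-scoring pairs as positive (k = number of true labels)."""
    threshold = np.sort(scores)[-k]
    return (scores >= threshold).astype(int)

k = int(labels.sum())
for scores in (scores_a, scores_b):
    preds = top_k_preds(scores, k)
    # Precision, recall, and F1 coincide: #predicted positives == #true positives.
    print(precision_score(labels, preds),
          recall_score(labels, preds),
          f1_score(labels, preds))
    # A fixed 0.5 threshold gives different F1 for the two methods,
    # even though their rankings are identical.
    print(f1_score(labels, (scores >= 0.5).astype(int), zero_division=0))
```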
Hi. In the evaluation function, you set the threshold to the value of the true_num-th entry in the sorted list of predicted scores. However, can we know the number of true edges in advance, before making predictions? Isn't this information leakage during the training process? The evaluation process is confusing. Please have a look and give an explanation. Thanks.