dawnranger opened this issue 5 years ago
For the wiki dataset given by the author, the labels are single-label, so what I got is micro = samples = acc. Or do you have more complete data for wiki?
Here is the documentation of the average parameter of sklearn.metrics.f1_score:
average : string, [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']. This parameter is required for multiclass/multilabel targets.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
So I think these averages can give different results in the multiclass case; a quick toy check is below.
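For a concrete illustration (toy data of my own, not from the repo's classify.py): when single-label targets are one-hot binarized into indicator matrices, micro-F1 and samples-F1 both reduce to plain accuracy, while macro and weighted F1 can differ.

```python
# Toy data (hypothetical, not the author's code): single-label targets,
# one-hot binarized so that average='samples' is defined.
import numpy as np
from sklearn.metrics import f1_score, accuracy_score
from sklearn.preprocessing import label_binarize

y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 2, 2, 2, 1, 1])

Y_true = label_binarize(y_true, classes=[0, 1, 2])
Y_pred = label_binarize(y_pred, classes=[0, 1, 2])

for average in ["micro", "macro", "samples", "weighted"]:
    print(average, f1_score(Y_true, Y_pred, average=average))
print("acc", accuracy_score(y_true, y_pred))
# micro == samples == acc here (0.667); macro and weighted come out lower.
```

On a truly multi-label dataset (e.g. BlogCatalog) samples-F1 no longer coincides with accuracy, which is where the distinction between these averages actually matters.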
@dawnranger That's good. I think you could open a pull request with the results on these datasets and the code to reproduce them, in a new folder.
@dawnranger 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score). Wiki is multiclass rather than multilabel, isn't it? Why is there a difference between samples and acc? In addition, for the flight data in your results, micro = samples = acc.
I think you are right. I used shenweichen's code:
from sklearn.metrics import f1_score, accuracy_score

# Y, Y_ are the binarized true and predicted label matrices
averages = ["micro", "macro", "samples", "weighted"]
results = {}
for average in averages:
    results[average] = f1_score(Y, Y_, average=average)
results['acc'] = accuracy_score(Y, Y_)
and I got a warning with the wiki dataset:
python3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
As discussed on Stack Overflow, a poor train/test split (some labels end up with no predicted samples) is likely to blame for this warning.
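For what it's worth, one way to reduce the chance of classes with no predicted samples is a stratified split, so every class appears in both train and test. A minimal sketch with made-up stand-ins for the embeddings and labels (X and y below are not from the repo's code):

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical stand-ins for the embedding matrix and node labels.
rng = np.random.RandomState(0)
X = rng.randn(100, 16)            # 100 nodes, 16-dim embeddings
y = rng.randint(0, 5, size=100)   # 5 classes, one label per node

# 50/50 split that preserves the class proportions in both halves.
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
train_idx, test_idx = next(sss.split(X, y))
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
```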
@dawnranger Yes. I found that classify.py is similar to scoring.py in deepwalk, which is provided by the author: https://github.com/phanein/deepwalk/blob/master/example_graphs/scoring.py. What confused me is that the author did not provide the results or the origin of the wiki dataset. In addition, I tried the BlogCatalog data (multi-label) as the node2vec paper mentioned, and I set the parameters as the paper did (d=128, r=10, l=80, k=10, training percent=50%, p=q=0.25), but I got 0.12 (Macro-F1), far from the result the author reported (0.2581). So depressing...
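If I remember correctly, the core idea of deepwalk's scoring.py (and the similar classify.py) is a one-vs-rest classifier that, for each node, keeps the top-k scoring labels, where k is that node's true label count. A condensed sketch of that idea, not the verbatim code (the method name predict_topk and the usage lines are mine):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

class TopKRanker(OneVsRestClassifier):
    """For each sample, mark its k highest-probability labels as predicted."""
    def predict_topk(self, X, top_k_list):
        probs = np.asarray(self.predict_proba(X))
        predictions = np.zeros_like(probs, dtype=int)
        for i, k in enumerate(top_k_list):
            predictions[i, probs[i].argsort()[-k:]] = 1
        return predictions

# Hypothetical usage with binarized labels Y (n_nodes x n_labels):
# clf = TopKRanker(LogisticRegression())
# clf.fit(X_train, Y_train)
# Y_pred = clf.predict_topk(X_test, top_k_list=Y_test.sum(axis=1))
```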
Hello, from these results the accuracy does not seem to be high. What is the cause? Is it a data problem?
Results of node2vec, deepwalk, line, sdne and struc2vec on all datasets. Hope this will help anyone who is interested in this project.

[Results for the wiki, brazil, europe and usa datasets were attached here.]