Open rafguns opened 9 years ago
Looking at this again. The main issue is that linkpred uses its own data structure (Scoresheet) to track prediction scores. This has at least two advantages:
Especially 2 is fundamentally different from scikit-learn.
The way forward is probably to replace Scoresheet with a pandas Series whose index holds all node pairs and whose values are the scores. The index could be built prior to evaluation:
idx = pd.MultiIndex.from_tuples(itertools.combinations(G.nodes(), 2))
and shared across evaluations. The underlying numpy array could then be passed to scikit-learn metrics.
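A rough sketch of how that could look, not a worked-out design. The node list, the `predicted` dict and the `test_links` set are made-up toy data standing in for what linkpred would take from the graph and a predictor. Treating unscored pairs as 0.0 is an assumption as well.

```python
import itertools

import pandas as pd
from sklearn.metrics import roc_auc_score

# Toy stand-in for G.nodes(); hypothetical data for illustration only.
nodes = ["a", "b", "c", "d"]

# Shared index over all unordered node pairs, built once before evaluation.
idx = pd.MultiIndex.from_tuples(list(itertools.combinations(nodes, 2)))

# Hypothetical predictor output: scores for some pairs only.
# Keys must use the same pair orientation as the index.
predicted = {("a", "b"): 0.9, ("a", "c"): 0.2, ("c", "d"): 0.7}

# Scores as a Series over the full index; unscored pairs get 0.0 (assumption).
scores = pd.Series([predicted.get(pair, 0.0) for pair in idx], index=idx)

# Hypothetical ground truth: links that actually appear in the test network.
test_links = {("a", "b"), ("c", "d")}
y_true = pd.Series([pair in test_links for pair in idx], index=idx)

# The underlying NumPy arrays go straight into a scikit-learn metric.
auc = roc_auc_score(y_true.to_numpy(), scores.to_numpy())
```

Because `scores` and `y_true` share `idx`, they stay aligned without any per-pair lookup at evaluation time.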
I am not yet sure how best to deal with 1 and/or to what extent it constitutes a problem.
sklearn.metrics is in a way much simpler, using plain functions. Can we do something analogous or even depend on scikit-learn for stuff like ROC, recall-precision etc.?
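For reference, depending on scikit-learn for the curves could be as small as the following. This is only a sketch: the label and score arrays are invented, and stand in for the values of an aligned ground-truth Series and score Series.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, roc_curve

# Invented example data: 1 marks a node pair that is a true link.
y_true = np.array([1, 0, 0, 0, 0, 1])
y_score = np.array([0.9, 0.2, 0.0, 0.0, 0.0, 0.7])

# Plain functions: arrays in, arrays out, no evaluation object to keep around.
fpr, tpr, roc_thresholds = roc_curve(y_true, y_score)
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_score)
```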