sknetwork-team / scikit-network

Graph Algorithms
Other
606 stars 66 forks source link

erdos_renyi wrong probabilities #496

Closed Blupper closed 2 years ago

Blupper commented 2 years ago

Description

When runing the comand erdos_renyi(5, 1.0) I would expect to receive a graph with exactly 10 edges. However, runing it 1'000 times I get on average 7.44 edges. With other graph libraries this works fine. For example networkx.generators.random_graphs.erdos_renyi_graph(5, 1.0) produces the desired result. The issue does not depend on this exact combination of numbers. Further I do not expect erdos_renyi(5, 1.5) to produce any sensible result (since p > 1), but not only does this work the result differs (on average) from what you get from erdos_renyi(5, 1.0). This indicates that a probability larger than 1 makes any sense and is different than a probability of exactly 1.

What I Did

I run this comparison between networx and scikit-network. However the networkx parts are not needed to reproduce the wrong results.

from networkx.generators.random_graphs import erdos_renyi_graph
import networkx as nx
import numpy as np
from sknetwork.data import erdos_renyi

def experiment_1(n = 5, p = 1.0, i = 1000):
    nx_sum = 0
    scikit_sum = 0
    for i in range(i):
        nx_edges = np.count_nonzero(nx.to_numpy_array(erdos_renyi_graph(n, p)))//2
        scikit_edges = np.count_nonzero(erdos_renyi(n,p).toarray())//2

        nx_sum += nx_edges
        scikit_sum += scikit_edges

    print(f"nx avg = {nx_sum/i:9.4f}")
    print(f"scikit avg = {scikit_sum/i:9.4f}")

    print(f"expected avg = {(n*(n-1))//2*min(p,1)}")
QLutz commented 2 years ago

Hello,

Thanks for raising this issue. This is now fixed on the develop branch (both the density issue and raising errors when p>1) and should be available in the next release.

Blupper commented 2 years ago

Thanks a lot!