GiulioRossetti / ndlib

Network Diffusion Library - (for NetworkX and iGraph)
http://ndlib.readthedocs.io/
BSD 2-Clause "Simplified" License

Using Multiple processors to speed up simulation #121

Open codepujan opened 5 years ago

codepujan commented 5 years ago

Is there any way to speed up a diffusion simulation by using the multiple cores on my machine? I want to speed up a single experiment on a relatively large network. Is there an existing optimization for this? Regards,

GiulioRossetti commented 5 years ago

Unfortunately, for the moment it is not possible to parallelize a single experiment, only pools. However, we will consider this request for a future release!
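To illustrate what parallelizing a pool (as opposed to a single experiment) means, here is a minimal, library-independent sketch. It does not use ndlib's API: `ring_graph`, `run_sir` and `run_pool` are hypothetical names, and the SIR dynamics are a simplified toy on an adjacency dict. The only point is that each run is independent, so each one can be sent to a separate process.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def ring_graph(n):
    """Hypothetical toy network: node i is linked to i-1 and i+1 (mod n)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def run_sir(args):
    """Run one toy SIR experiment and return the final number of removed nodes."""
    adj, beta, gamma, steps, seed = args
    rng = random.Random(seed)  # per-run RNG keeps runs independent and reproducible
    status = {v: "S" for v in adj}
    status[0] = "I"  # a single initially infected node
    for _ in range(steps):
        new_status = dict(status)  # synchronous update: read old, write new
        for v, s in status.items():
            if s == "S":
                # each infected neighbour gets one chance to infect v
                if any(status[u] == "I" and rng.random() < beta for u in adj[v]):
                    new_status[v] = "I"
            elif s == "I" and rng.random() < gamma:
                new_status[v] = "R"
        status = new_status
    return sum(1 for s in status.values() if s == "R")


def run_pool(adj, beta, gamma, steps, n_runs, workers=4):
    """Execute n_runs independent experiments, one task per run."""
    tasks = [(adj, beta, gamma, steps, seed) for seed in range(n_runs)]
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(run_sir, tasks))
```

Calling `run_pool(ring_graph(200), 0.5, 0.2, 50, n_runs=8)` from under an `if __name__ == "__main__":` guard returns one result per run. The steps inside a single run depend on each other, which is why this pattern speeds up a batch of runs but not one run.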

codepujan commented 5 years ago

Thank you, Sir. That would be a very powerful feature. Until then, what do you suggest for working with relatively large networks? I am working with a network that has 3862396 nodes and 4953798 edges. Models like SIR and SIS finish in a tolerable time, but SEIR and SWIR take much longer. Do you have any tips for such large networks?

GiulioRossetti commented 5 years ago

One thing that might help, if you are storing your graph in a networkx object, is switching to igraph: since the latest release, ndlib natively supports both graph libraries.

codepujan commented 5 years ago

Would working on the same network with igraph give me a speedup over NetworkX?

GiulioRossetti commented 5 years ago

igraph certainly handles graphs of that size better than networkx (especially regarding memory consumption). The networkx implementation is pure Python while igraph is C-based: maybe this will help a little bit.

codepujan commented 5 years ago

Thank you. Are there any updated docs on how to run the model on igraph?

GiulioRossetti commented 5 years ago

Actually, there are no docs for the moment (we're working on it). However, you only have to load the graph in igraph instead of networkx: ndlib (since v5.0) will work as usual without requiring any adjustment to your code.

codepujan commented 5 years ago

@GiulioRossetti Sir, I tried doing it the simplest way:

from igraph import *
g = Graph.Erdos_Renyi(1000, 0.1)
import ndlib.models.epidemics.SIRModel as sir
model = sir.SIRModel(g)

But I get an error: [image: error]

GiulioRossetti commented 5 years ago

Sorry, the examples in the docs are a little out of sync w.r.t. the latest version. Change your import to

import ndlib.models.epidemics as ep

then instantiate the model on the graph using

model = ep.SIRModel(g)

codepujan commented 5 years ago

I tried doing exactly that, but I am getting this error: [image: Error_Igraph]

My ndlib version is: 5.0.0
My python-igraph version is: 0.7.1.post6

GiulioRossetti commented 5 years ago

I'll try to replicate your issue asap.

However, there's something strange in your screenshot: it seems you are running everything under python 2.7, while the ndlib package has only been released for python 3. Can you change your environment?

codepujan commented 5 years ago

I tried it in a python3 environment as well and I am facing the same issue! [image: python3]

GiulioRossetti commented 5 years ago

Ok, I think I figured it out. When reading networks from file, igraph saves node ids into a 'name' attribute so as to be able to remap node ids while compacting the adjacency matrix. Conversely, when generating synthetic graphs (such as ER) it doesn't store such information.

To address your issue on the ER graph just do the following:

g = Graph.Erdos_Renyi(1000, 0.1)
g.vs["name"] = list(range(g.vcount()))

Then, everything will work fine.

codepujan commented 5 years ago

In reply to "it seems you are running everything under python 2.7": I had previously been using the library under Python 2 (with NetworkX).

koujiangheng commented 2 years ago

from concurrent import futures

import networkx as nx
import numpy as np
from tqdm import tqdm

import ndlib.models.ModelConfig as mc
import ndlib.models.epidemics as ep


def all_nodes_influence_sim(g, p, sim_time=50):
    # One task per node: each task averages sim_time simulations seeded at that node.
    result_list = []
    process_list = []
    with futures.ProcessPoolExecutor(max_workers=10) as executor:
        for node in g.nodes:
            process_list.append(executor.submit(
                __mean_single_node_infection, g, node, p, sim_time))
        for process in tqdm(process_list, desc="compute single node ic value"):
            result_list.append(process.result())
    return result_list


def __mean_single_node_infection(g, node, p, sim_time):
    # This already runs inside a worker process, so the repetitions are executed
    # sequentially here instead of opening a second, nested process pool per node.
    result_list = []
    for i in range(sim_time):
        result_list.append(__single_node_lt_sim(g, node, p))
    return np.mean(result_list)


def __single_node_lt_sim(g: nx.Graph, node: int, p: float):
    if node not in g.nodes:
        raise ValueError("node is not in this graph")
    model = ep.ThresholdModel(g)
    config = mc.Configuration()
    # Start the diffusion from the single seed node.
    config.add_model_initial_configuration('Infected', [node])
    # ThresholdModel reads "threshold" as a node parameter; the edge-level
    # "threshold" is the parameter of the independent cascades model.
    for n in g.nodes():
        config.add_node_configuration("threshold", n, p)

    model.set_initial_status(config)
    iterations = model.iteration_bunch(100)
    # Status code 1 is "Infected": count the infected nodes after the last iteration.
    infected = iterations[-1]["node_count"][1]
    return infected

This is how I wrote it with multiprocessing. In my experiment, I need an influence value for every node.