Is it possible to create a tutorial of how to use the TiedNets?

jenish-cj commented 6 years ago

I am interested in using this repository, but I don't know where to start. For example, which file should I look at to create coupled networks? And which ones simulate a uniform cascading failure?

AgostinoSturaro commented 6 years ago

Yes, I still have to create a proper tutorial. This is the WIP branch, it's the most up to date https://github.com/TiedNets/TiedNets/tree/machine-learning-fixed

netw_creator is the one that creates a coupled network of any (supported) type. cascades_sim is the one that performs a single simulation of any (supported) model.

They take an ini file with the configuration options and save some output files. Most of their options are documented on this page. https://github.com/TiedNets/TiedNets/wiki/User-Guide

Check the examples and let me know if you need more.

szx321 commented 5 years ago

Hello, I saw this experiment in the paper (A Realistic Model for Failure Propagation in Interdependent Cyber-Physical Systems). I am very interested in this experiment. I want to reproduce it, but when debugging I found that the GeoJSON file is missing. Could you provide the GeoJSON file? Thank you.

AgostinoSturaro commented 5 years ago

There are 2 geojson files to start with, one is for the communication network and the other is for the power network. They are only needed for simulations on real networks.

The geojson for the power network is made starting from GIS data that I cannot share publicly. I can only explain to you where to get the original data and how to make the necessary corrections. Please note that the original GIS data is meant for visualization (looking at lines over a map), and requires plenty of cleaning to create a useful graph for programmatic processing. Otherwise you get spurious connections and missing ones.

Besides, you don't need the geojson if you want to reproduce the simulations on synthetic networks. I suggest starting from that. Look at the configurations to create those synthetic networks.

BTW I'm in the process of cleaning up the batch network generator interface to make it easier to use.

szx321 commented 5 years ago

Thank you for your answer. Do you have examples of synthetic networks? Can you teach me? Thank you.

AgostinoSturaro commented 5 years ago

Sure. I'm adding some tests to the network creation part. I should be done by the start of next week.

szx321 commented 5 years ago

Thank you. I will keep following your updates.

AgostinoSturaro commented 5 years ago

I have to test my new code a bit more, but you can see it here (it's a new branch) https://github.com/TiedNets/TiedNets/tree/create_nets_conf

The first step is looking at the configurations in the configs/create_nets folder. They are taken as input by the script batch_netw_creator.py that calls netw_creator.py to create the networks and calculate the centrality metrics (if requested to do so). batch_netw_creator.py reads a configuration file in json format, while netw_creator.py reads a configuration file in ini format.

What happens is that the batch json configuration is used to create multiple ini files, each of which is the configuration for the creation of a single interdependent network instance, that is represented using 3 graphs. You only need to change the batch json configuration and pass it to batch_netw_creator.py.
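As a rough illustration of the batch-to-ini expansion described above, here is a minimal sketch. It is not the code of batch_netw_creator.py: the function name expand_batch, the section name "misc" and the option names are made up for the example, and the seed range is assumed to exclude its "stop" value. It uses Python 3's configparser module (on Python 2.7 the module is called ConfigParser).

```python
import io
from configparser import ConfigParser


def expand_batch(batch_conf):
    """Return one ini-formatted string per seed in the batch configuration."""
    seeds = batch_conf["seeds"]
    ini_texts = []
    # assumption: "stop" is exclusive, like Python's range()
    for instance_num, seed in enumerate(range(seeds["start"], seeds["stop"])):
        config = ConfigParser()
        config.add_section("misc")  # hypothetical section name
        config.set("misc", "instance", str(instance_num))
        config.set("misc", "seed", str(seed))
        buf = io.StringIO()
        config.write(buf)  # serialize this single-instance configuration
        ini_texts.append(buf.getvalue())
    return ini_texts
```

Each returned string would correspond to one ini file, that is, one interdependent network instance.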

After you have created the networks, you can run simulations on them. See the configurations in configs/run_sims for running simulations on the created networks. You can pass the path of a batch config and a batch number to batch_sim_runner_2.py, which calls cascades_sim.py to run the simulations. Again, a json batch configuration is used to create multiple ini files.

Finally, there are the configuration files in configs/ml_and_plots. Those are read by ml_learner.py to run machine learning and draw plots.

Focus on network creation first. Make sure to read the wiki pages. They are not fully updated, but they cover the basics. https://github.com/TiedNets/TiedNets/wiki/User-Guide

szx321 commented 5 years ago

Thank you very much!

silence-debug-hue commented 4 years ago

a_opts['preassigned_roles_fpath'] = '../Simulations/MN_data_new/MN_powroles{}.json'.format(instance_num)

In "batch_netw_creator.py", as you said, it reads a configuration file in json format. The retrieved files need to be numbered in order (0, 1, 2, etc.), whereas the "creat_nets" folder contains only files with labels like 12c, 13b... How can I run this program and choose which .py files to run?

Besides, I get the error "KeyError: 'seeds'". Am I wrong to use a_opts['preassigned_roles_fpath'] = '../TiedNets-master/configs/creat_nets/conf_0.json'.format(instance_num) as an input in "batch_netw_creator"?

AgostinoSturaro commented 4 years ago

I see you are trying to use configurations in ../Simulations/MN_data_new. Those are for importing existing networks that I cannot currently share. MN stands for "Minnesota", and those are power and telecom map data. We could share the cleaned up version of those maps, but only if they grant you access to the GIS data in shapefile format first. You can find the request form here https://gisdata.mn.gov/dataset/util-elec-trans

If you have your own GIS maps in shapefile format, the importer can process that too. Just import it in QGIS (or ArcGIS), export the relevant columns as a geojson file, and use our importer.

However, I suggest you start by creating synthetic networks. Make sure you have Python 2.7.x (not 3.x), NetworkX 1.11 (not 2.x). I used Anaconda 4.4.0 (64-bit) and the included libraries. https://docs.anaconda.com/anaconda/packages/old-pkg-lists/4.4.0/py27/ https://repo.continuum.io/archive/

EDIT: after installing Anaconda and checking the library versions, you can run the module. I used PyCharm, but you can use the command line. Move to the main repo folder and call the run function of the module, providing the path to the configuration file. Here is an example.

python -c 'import batch_netw_creator; batch_netw_creator.run("/path_to_repo/create_nets_conf/configs/create_nets/conf_12b.json")'

Source: https://stackoverflow.com/questions/3987041/run-function-from-the-command-line
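If you prefer a small script over the one-line command, the same idea can be sketched as follows. The helper name call_run is made up for this example. The only thing it relies on from the project is that the module exposes a run function taking the configuration path, as in the command above.

```python
import importlib
import sys


def call_run(repo_dir, module_name, conf_fpath):
    """Import a module found in repo_dir and call its run(conf_fpath) function."""
    if repo_dir not in sys.path:
        sys.path.insert(0, repo_dir)  # make the repo's modules importable
    module = importlib.import_module(module_name)
    return module.run(conf_fpath)


# Example call (the paths are placeholders):
# call_run("/path_to_repo/create_nets_conf", "batch_netw_creator",
#          "/path_to_repo/create_nets_conf/configs/create_nets/conf_12b.json")
```

Since the project's configurations use relative paths, it is probably safest to start such a script from the main repo folder, as suggested above.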

I'll hopefully have some free time this weekend, so post your questions here.

silence-debug-hue commented 4 years ago

Thank you, and I hope you can take the time to reply to my additional silly questions (smile)

szx321 commented 4 years ago

Professor, I have a question. How do you generate the ini file from the json file, and how do you run the runner.py file in PyCharm? Thank you!

AgostinoSturaro commented 4 years ago

First create the networks using batch_netw_creator.py. As input, provide a json file from configs/create_nets, e.g. the file conf_12c.json.

Then use multi_proc_runner.py, changing it to point to the batch json files in configs/run_sims, e.g. the files in the folder test_mp_12c. I mean, change the path in this line

batch_conf_fpath = os.path.normpath('../Simulations/test_mp/batch_{}.json'.format(batch_no))

(or just copy-paste the batch config json files to a test_mp folder).

The last pass is the learning pass, and it takes a different config. Do let me know if you managed to create the synthetic networks. I should make a tutorial, yes.

szx321 commented 4 years ago

Professor, I can create synthetic networks now. With the python -c instruction you gave, python -c "import batch_netw_creator; batch_netw_creator.run('\TiedNets-create_nets_conf\configs\create_nets\conf_12b.json')", the synthetic network is created. I then generated a number of results_{} folders and ml_stats_{}.tsv files using multi_proc_runner.py and the batch_{}.json files. Do the ml_stats_{}.tsv files contain the machine learning features?

AgostinoSturaro commented 4 years ago

Yes, the ml_stats_{}.tsv files contain machine learning examples on the rows and features on the columns. You can use ml_result_filter.py to split them into a "training set" and a "test set". That module is a bit messy, so make sure to comment/uncomment the parameter values you need. There are other .py files that are there to filter the results, but they shouldn't be needed.
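To give an idea of what such a split looks like, here is a minimal sketch using only the standard library. It does not reproduce the logic of ml_result_filter.py. The function name, the default 80/20 proportion and the fixed shuffle seed are assumptions made for the example.

```python
import csv
import random


def split_tsv_rows(tsv_lines, test_fraction=0.2, seed=0):
    """Split the data rows of a .tsv (an iterable of lines) into two lists.

    Returns (header, training_rows, test_rows).
    """
    reader = csv.reader(tsv_lines, delimiter="\t")
    header = next(reader)  # first row holds the column names
    rows = list(reader)
    random.Random(seed).shuffle(rows)  # fixed seed for a repeatable split
    test_size = int(round(len(rows) * test_fraction))
    return header, rows[test_size:], rows[:test_size]
```

An open file object can be passed directly as tsv_lines, for instance the handle of one ml_stats_{}.tsv file.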

Finally, you can use the run() function of the ml_learner.py module with the configurations in configs/ml_and_plots to learn from the training set, and plot data estimating the prediction quality using the test set.

szx321 commented 4 years ago

Thank you, professor!

szx321 commented 4 years ago

Professor, could you write a tutorial about the ml_and_plots configuration file?

AgostinoSturaro commented 4 years ago

I've started documenting that. Here is a draft, I'll try to complete that tomorrow and turn it into a wiki page.

The ml_learner.py module has functions to:

perform machine learning (ML) tasks on datasets
plot information contained in datasets
plot comparisons of simulation results and machine learning predictions

The module needs as input a .json configuration file and at least a .tsv file output of batch_sim_runner_2.py. Each .tsv file must have a header with the names of its columns. Each row of the .tsv contains measurements relative to a single simulation, and each column contains measurements of a single metric. You can specify the .tsv files that you want to use in the .json configuration.

You can use the ml_result_filter.py module to filter, merge and split multiple .tsv files. If you just need to ignore some columns of the .tsv, you can do so using just the .json configuration.
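As an illustration of how a .tsv with a header can be read while keeping only some columns, here is a small sketch. It is not how ml_learner.py reads its input. The function name load_tsv_columns is hypothetical, and it assumes that the selected columns contain numbers.

```python
import csv


def load_tsv_columns(fpath, feature_cols, label_col):
    """Read a .tsv with a header row and return (X, y) for the chosen columns."""
    X, y = [], []
    with open(fpath) as tsv_file:
        # DictReader uses the first row as column names; unlisted columns are ignored
        for row in csv.DictReader(tsv_file, delimiter="\t"):
            X.append([float(row[name]) for name in feature_cols])
            y.append(float(row[label_col]))
    return X, y
```

Each element of X corresponds to one simulation (one row of the file), and the matching element of y is the value of the chosen label column for that simulation.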

The .json configuration files we used for our paper are stored in the folder configs/ml_and_plots. As an example for synthetic networks, you can look at the file create_nets_conf/configs/ml_and_plots/conf_1.json. Here is the high level structure of a .json configuration, with the inner levels simplified to explain:

{
    "comments": ["comment line 0", "comment line 1", ...],
    "datasets": [{"conf for dataset": 0}, {"conf for dataset": 1}, ...],
    "model_trainings": [{"conf for training model": 0}, {"conf for training model": 1}, ...],
    "plots": [{"conf to create plot": 0}, {"conf to create plot": 1}, ...],
}

The "comments" section is optional and the module completely ignores it. It is there just for your convenience, so you can annotate what the configuration file is for. The JSON standard does not support JavaScript comments ("//" or "/* */"), so this is just a quick workaround to avoid using a separate file to describe what the .json configuration file is for.

The "datasets" section is mandatory. It is an array of maps, each map describing how to import a dataset. Each dataset configuration map has the same structure, as follows

{
  "fpath": "/full/path/to/file.tsv",
  "X_col_names": [
    "name of 1st tsv column to use as ML feature or independent variable",
    "name of 2nd tsv column to use as ML feature or independent variable",
    ...
  ],
  "y_col_name": "name of tsv column to use as ML label or dependent variable",
  "info_col_names": [
    "name of 1st column to carry around as extra info",
    "name of 2nd column to carry around as extra info",
    ...
  ]
}

"fpath" string, mandatory, it is the full path to the .tsv file that contains the dataset to import.
"X_col_names" array of strings, mandatory, it contains the names of the tsv columns that you want to use as features when training machine learning models and/or as independent variables when drawing your plots.
"y_col_name" string, mandatory, it is the name of the tsv column that you want to use as the label when training machine learning models and/or as the dependent variable when drawing your plots.
"info_col_names" array of strings, optional, it contains the names of tsv columns that you do not want to use as features when training machine learning models, but that can be useful to group or sort your data, or as additional info when drawing your plots.

The "model_trainings" section is optional. It is an array of maps, each map describing how to train a model using machine learning techniques. Each model training configuration map has the same structure, as follows

{
  "dataset_num": number of the dataset to use as training set,
  "model": {
    "name": "name of the scikit-learn model to train",
    "kwargs": { named arguments for the model initialization function }
  },
  "steps": [
    {"name": "name of a feature selection function", "kwargs": {named arguments for the function}},
    ...,
    {"name": "name of a preprocessing function", "kwargs": {named arguments for the function}},
    ...,
    {"name": "name of model training/selection function", "kwargs": {named arguments for the function}}
  ]
}

"dataset_num" integer, mandatory, it is the number (0, 1, ...) of the dataset that you want to use as your training set; this number is the position of the dataset in the "datasets" section of the .json, counting from 0; write 0 to reference the 1st dataset you imported.
"model" map, mandatory, it contains the configuration for initializing the machine learning model you want to train.
"model" > "name" string, mandatory, it is the name of the scikit-learn model to initialize for training; accepted values are:

'linearregression' for sklearn.linear_model.LinearRegression
'ridgecv' for sklearn.linear_model.RidgeCV
'lassocv' for sklearn.linear_model.LassoCV
'elasticnetcv' for sklearn.linear_model.ElasticNetCV
'decisiontreeregressor' for sklearn.tree.DecisionTreeRegressor
'mlpregressor' for sklearn.neural_network.MLPRegressor

"model" > "kwargs", map, mandatory, it is the map of named arguments (kwargs) to pass to the model initialization function.

"steps" array of maps, mandatory, it is a list of steps describing how to run chosen scikit-learn functions to perform feature selection and preprocessing of your data, and to run your model training and model selection; essentially it lets you configure your ML pipeline, so be careful with the order of your processing steps.
"steps" > 0/.../n > "name" string, mandatory, it is the name of the scikit-learn function to apply at this step in your pipeline; accepted values are:

'variancethreshold' for sklearn.feature_selection.VarianceThreshold
'polynomialfeatures' for sklearn.preprocessing.PolynomialFeatures
'standardscaler' for sklearn.preprocessing.StandardScaler
'rfe' for sklearn.feature_selection.RFE
'rfecv' for sklearn.feature_selection.RFECV
'selectfrommodel' for sklearn.feature_selection.SelectFromModel
TODO: add support for 'GridSearchCV' as the last step

"steps" > 0/.../n > "kwargs" map, mandatory, it is the map of named arguments (kwargs) to pass to the function to apply at this step.

TODO: description of plots configuration(s)
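Until the documentation is complete, a quick sanity check of a "model_trainings" entry against the rules in this draft could look like the sketch below. It is not part of ml_learner.py, and the function name check_training_conf is made up. The draft does not list the names accepted for the final training/selection step, so this check only knows the model and step names listed above and would flag anything else.

```python
MODEL_NAMES = {"linearregression", "ridgecv", "lassocv", "elasticnetcv",
               "decisiontreeregressor", "mlpregressor"}
STEP_NAMES = {"variancethreshold", "polynomialfeatures", "standardscaler",
              "rfe", "rfecv", "selectfrommodel"}


def check_training_conf(conf, dataset_cnt):
    """Return a list of problems found in one "model_trainings" entry."""
    problems = []
    for key in ("dataset_num", "model", "steps"):
        if key not in conf:
            problems.append("missing mandatory key: " + key)
    if problems:
        return problems
    # dataset_num must point to one of the imported datasets (0-based)
    if not 0 <= conf["dataset_num"] < dataset_cnt:
        problems.append("dataset_num out of range")
    model = conf["model"]
    if model.get("name") not in MODEL_NAMES:
        problems.append("unknown model name: {}".format(model.get("name")))
    if "kwargs" not in model:
        problems.append("model is missing kwargs")
    for i, step in enumerate(conf["steps"]):
        if step.get("name") not in STEP_NAMES:
            problems.append("step {}: unknown name {}".format(i, step.get("name")))
        if "kwargs" not in step:
            problems.append("step {}: missing kwargs".format(i))
    return problems
```

An empty list means the entry has the mandatory keys and uses only names from the lists above. It says nothing about whether the kwargs are valid for the chosen scikit-learn class.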

szx321 commented 4 years ago

Thank you very much！

jiea489756 commented 4 years ago

Hi, we defined the power network A and the communication network B manually, then used netw_creator.py to generate the json file. However, when we run the simulation of cascading failures, we get the following errors:

E:\Anaconda2\python.exe E:/TiedNets-create_nets_conf/multi_proc_runner.py
Batch 6) Running simulation 0 of 20 sim group 0, value 1, instance 0, seed 0
Traceback (most recent call last):
  File "batch_sim_runner_2.py", line 168, in <module>
    sim.run(conf_fpath, floader)  # run the simulation
  File "E:\TiedNets-create_nets_conf\cascades_sim.py", line 804, in run
    attacked_nodes_a, attacked_nodes_b, floader, netw_dir))
  File "E:\TiedNets-create_nets_conf\cascades_sim.py", line 480, in calc_atk_centr_stats
    centr_stats.update(calc_atk_centrality_stats(attacked_nodes, centr_name, 'atkd_ts_betw_c', centr_info_misc))
  File "E:\TiedNets-create_nets_conf\cascades_sim.py", line 390, in calc_atk_centrality_stats
    rank_of_quintiles = percentile(range(0, node_cnt), [20, 40, 60, 80]).tolist()
  File "E:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 3707, in percentile
    a, q, axis, out, overwrite_input, interpolation, keepdims)
  File "E:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 3826, in _quantile_unchecked
    interpolation=interpolation)
  File "E:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 3405, in _ureduce
    r = func(a, **kwargs)
  File "E:\Anaconda2\lib\site-packages\numpy\lib\function_base.py", line 3941, in _quantile_ureduce_func
    x1 = take(ap, indices_below, axis=axis) * weights_below
  File "E:\Anaconda2\lib\site-packages\numpy\core\fromnumeric.py", line 189, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "E:\Anaconda2\lib\site-packages\numpy\core\fromnumeric.py", line 56, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.

Process finished with exit code 0

jiea489756 commented 4 years ago

Could you help us solve these problems? Note: the simulations provided as tests run normally for us.

jiea489756 commented 4 years ago

The networks we used (including the power network and the communication network) are attached as network.zip

jiea489756 commented 4 years ago

Thank you very much for your help!

jiea489756 commented 4 years ago

Sorry, we made mistakes when building the networks, and we have now solved the problem.

AgostinoSturaro commented 4 years ago

Sorry, I just noticed the topic. I updated my notification email so if you have further problems I should be able to help you sooner.

szx321 commented 4 years ago

Professor, I have a question. When I simulate an attack on a synthetic network with "attack_tactic": "most_intra_used_generators", the results are all 0. Why?

AgostinoSturaro commented 4 years ago

Are you trying to run a single simulation or a batch of simulations?

If you are running a single simulation, you need to provide a .cfg file and fill the field attacks. The attack_tactic means "attack this kind of nodes first", but you need to specify the number of attacked nodes in attacks.

If you need to run a batch of simulations, you can look at the batch configuration files included in this project, such as https://github.com/TiedNets/TiedNets/blob/master/configs/run_sims/test_mp_12b/batch_6.json There, you can see "attacks": null in the "base_configs", it's empty because it changes during the execution of the batch, as specified here

    "indep_var_name": "attacks",

    "indep_var_vals": {
        "pick": "specified",
        "list_of_values": [0, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    },

See this file, starting from the comments to understand batch simulations, feel free to ask for clarifications. https://github.com/TiedNets/TiedNets/blob/master/batch_sim_runner_2.py

If you still have problems, please post the configuration file you are using.

szx321 commented 4 years ago

I want to run a batch of simulations. This is the configuration.

"instances_dir": "../Simulations/test_mp/I",

"first_instance": 0,
"last_instance": 60,

"indep_var_name": "attacks",

"indep_var_vals": {
    "pick": "specified",
    "list_of_values": [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160]
},

"seeds": {
    "pick": "range",
    "start": 0,
    "stop": 100
},

"base_configs": [{
    "paths": {
        "netw_dir": null,
        "netw_b_fname": "B.graphml",
        "netw_a_fname": "A.graphml",
        "netw_inter_fname": "Inter.graphml",
        "netw_union_fname": "UnionAB.graphml",
        "results_dir": "../Simulations/test_mp/results_0",
        "run_stats_fname": "run_0.tsv",
        "end_stats_fpath": null,
        "ml_stats_fpath": "../Simulations/test_mp/ml_stats_0.tsv"
    },

    "run_opts": {
        "attacked_netw": "A",
        "attack_tactic": "most_inter_used_distr_subs",
        "intra_support_type": "realistic",
        "inter_support_type": "realistic",
        "save_death_cause": true,
        "save_attacked_roles": true,
        "attacks": null,
        "seed": null
    },

AgostinoSturaro commented 4 years ago

That has a different attack tactic, but should be fine. If you also pass the logging_config part, you should see some info when it runs. How are you running the batch_sim_runner_2.py module? I'd like to see the command line parameters. You need to pass two parameters in the correct order

batch_no = int(sys.argv[1])  # batch number, useful for debug prints
batch_conf_fpath = sys.argv[2]  # path to the batch config file

Another potential problem is the network itself. Make sure nodes in the power network (A) have roles. If no power node has a generator role, there are no generators to attack.
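One way to check the roles without loading the whole project is to count them directly in the GraphML file. The sketch below uses only the standard library. It assumes the role is stored as a node attribute named "role" (that name appears in the netw_creator.py traceback further down this thread), while the role values in the usage note are guesses.

```python
import xml.etree.ElementTree as ET
from collections import Counter

GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def count_roles(graphml_text, attr_name="role"):
    """Count how many nodes carry each value of the given node attribute."""
    root = ET.fromstring(graphml_text)
    # GraphML declares node attributes as <key id=... for="node" attr.name=...>
    key_ids = [key.get("id") for key in root.iter(GRAPHML_NS + "key")
               if key.get("for") == "node" and key.get("attr.name") == attr_name]
    counts = Counter()
    for node in root.iter(GRAPHML_NS + "node"):
        for data in node.findall(GRAPHML_NS + "data"):
            if data.get("key") in key_ids:
                counts[data.text] += 1
    return counts
```

For example, read A.graphml as bytes, as in count_roles(open("A.graphml", "rb").read()), and look at the resulting counts. If no value resembling a generator role shows up, a generator-based attack tactic has nothing to attack.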

There are configurations to generate interdependent networks with nodes. EDIT: There's the guide with an example here https://github.com/TiedNets/TiedNets/wiki/User-Guide#generation-package

I was organizing the network creation examples in a separate code branch, and didn't merge it to the main branch. It's available here. https://github.com/TiedNets/TiedNets/tree/create_nets_conf/configs/create_nets

szx321 commented 4 years ago

I did not change the logging_config part. And I ran the simulation using the multi_proc_runner.py file. The synthetic network I generated also has generator nodes.

AgostinoSturaro commented 4 years ago

I need more info to debug this. Like, are you on Python 2.7? Do you see any errors? Can you post a set of networks (A, B, I) and the result of a simulation on it (should be a line in the tsv)?

EDIT: the multi_proc_runner.py file needs some info of its own. Its purpose is to run multiple batches in parallel (ideally one for each processor). It runs multiple instances of batch_sim_runner_2.py, one for each batch.

Try running batch_sim_runner_2.py directly for a single batch, which already does multiple simulations, one by one, on the same processor. You can create a batch with a single simulation, if you wish to start small.

szx321 commented 4 years ago

I'm using Python 2.7. The run does produce some errors.

Traceback (most recent call last):
  File "batch_sim_runner_2.py", line 174, in <module>
    sim.run(conf_fpath, floader)  # run the simulation
  File "E:\python test\TNN\TiedNets-create_nets_conf\cascades_sim.py", line 659, in run
    floader, netw_dir, seed)
  File "E:\python test\TNN\TiedNets-create_nets_conf\cascades_sim.py", line 525, in choose_nodes_by_config
    if target_netw == A.graph['name']:
AttributeError: 'NoneType' object has no attribute 'graph'

1.zip This is the synthetic network that I generated.

AgostinoSturaro commented 4 years ago

The networks seem fine. The error seems to indicate that they were not loaded. When you use the object A, it is None.

Check the paths in your .json batch configuration. This part in particular

{
    "instances_dir": "../Simulations/test_mp/1000_nodes_20_subnets",

    "first_instance": 0,
    "last_instance": 10,

instances_dir is the main folder containing the networks for the sim batch:

instance_0, folder containing the first set of interdependent networks e.g. "../Simulations/test_mp/1000_nodes_20_subnets/instance_0"
instance_1
...
instance_9

batch_sim_runner_2.py reads the .json and uses it to create the .ini configuration files for the single simulations. See this line paths['netw_dir'] = os.path.join(instances_dir, 'instance_{}'.format(instance_num)) # input

You can set a breakpoint there in the debugger to check the paths. Or check the end result in the .ini

cascades_sim.py reads the .ini config file, like this netw_dir = os.path.normpath(config.get('paths', 'netw_dir'))

You can also print the variables you want to see in the log, like this logger.info('netw_dir = {}'.format(netw_dir ))
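A quick way to verify the paths before launching a batch is to rebuild them the same way and check that the files exist. This is only a sketch: the function name is made up, the file names are the ones from the batch configuration posted earlier in this thread, and last_instance is treated as exclusive (instance_0 to instance_9 for a value of 10).

```python
import os

NETW_FNAMES = ("A.graphml", "B.graphml", "Inter.graphml", "UnionAB.graphml")


def find_missing_files(instances_dir, first_instance, last_instance):
    """List the network files that a batch would look for but that do not exist."""
    missing = []
    # last_instance is treated as exclusive, matching instance_0 ... instance_9
    for instance_num in range(first_instance, last_instance):
        netw_dir = os.path.join(instances_dir, "instance_{}".format(instance_num))
        for fname in NETW_FNAMES:
            fpath = os.path.normpath(os.path.join(netw_dir, fname))
            if not os.path.isfile(fpath):
                missing.append(fpath)
    return missing
```

Relative paths such as "../Simulations/test_mp/..." are resolved against the current working directory, so run the check from the same folder you start the simulations from.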

Let me know how it goes. Post the .json batch configuration if you still have issues.

szx321 commented 4 years ago

I tried a lot, but I still get 0.

I.zip This is the .json.

AgostinoSturaro commented 4 years ago

Sorry about the delay in my answers. This is the configuration you used to create the networks, not to run the simulation batch. Try cleaning up the repo, zipping up everything and posting it here. Please specify the folder you placed it in as well. I need that info to try to re-create your situation on my env.

I am trying to create a VM to share with you, but it's not so easy. We can have a chat in the weekend, maybe a screen-sharing session. I'm on the Rome timezone.

szx321 commented 4 years ago

I.zip thank you，professor.

szx321 commented 4 years ago

What tools should I use to contact you？

AgostinoSturaro commented 4 years ago

Skype, or something we can do screen-sharing with. We need to agree on a time. What timezone are you in?

BTW, please share the logs (they are not in the zip). There should be a way to log to a file and to the output stream.
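For reference, logging to both a file and the output stream with Python's standard logging module can be sketched like this. It is a generic example, not the project's logging_config. The logger name and file path are up to you.

```python
import logging
import sys


def make_logger(name, log_fpath):
    """Create a logger that writes every message to stdout and to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (logging.StreamHandler(sys.stdout), logging.FileHandler(log_fpath)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Call make_logger once per logger name, since every call adds two more handlers to the same logger.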

The first round of simulations is attacking no nodes. If the network is stable, there are no dead nodes (correctly). Try changing this to make 10 attacks and with fewer seeds (0 and 1).

    "indep_var_name": "attacks",

    "indep_var_vals": {
        "pick": "specified",
        "list_of_values": [10]
    },

    "seeds": {
        "pick": "range",
        "start": 0,
        "stop": 2
    },

Then look in the ml_stats file at the columns #atkd and #dead_count, which are the number of attacked nodes at the start and of dead nodes at the end.
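A small helper to summarize those two columns could look like the sketch below. The column names are taken from the sentence above. The function name is hypothetical, and it assumes the file is tab-separated with a header row.

```python
import csv
from collections import defaultdict


def mean_dead_by_attacks(tsv_lines, atk_col="#atkd", dead_col="#dead_count"):
    """Average the final dead-node count for each number of attacked nodes."""
    totals = defaultdict(lambda: [0.0, 0])  # atkd -> [sum of dead counts, row count]
    for row in csv.DictReader(tsv_lines, delimiter="\t"):
        entry = totals[int(row[atk_col])]
        entry[0] += float(row[dead_col])
        entry[1] += 1
    return {atkd: total / cnt for atkd, (total, cnt) in totals.items()}
```

You can pass an open ml_stats file as tsv_lines. If the average is 0 for every number of attacks, the attacks are probably not being applied at all.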

To start things out, reduce the ranges to a manageable size, so you can debug things faster. The log(s) can help you.

AgostinoSturaro commented 4 years ago

Please update and suggest a time that is right for you. I don't know your timezone, that would help.

szx321 commented 4 years ago

Sorry about the delay in my answers. I'm in Beijing timezone. I'm always free. You can book the time.

AgostinoSturaro commented 4 years ago

Tomorrow 13:00 Rome / 19:00 Beijing ? https://www.timeanddate.com/worldclock/converter.html?iso=20200913T110000&p1=215&p2=33

Please make sure your Skype works. If you can install PyCharm to view the project and use Python 2.7, even better. My contact is agostino.sturaro

szx321 commented 4 years ago

OK thank you!

sajedehsoleimani commented 3 years ago

Hello, I've read this experiment in the paper (A Realistic Model for Failure Propagation in Interdependent Cyber-Physical Systems). I am very interested in this experiment. I'm using python2.7 and I've selected the A and B networks as the commented part of your code in batch_netw_creator.py: 'name': 'A', 'model': 'rt_nested_smallworld', 'nodes': 1000, 'subnets': 20, 'beta': 0.2, 'alpha': 0.2, 'd_0': 7, 'avg_k': 4, 'q_rw': 0.5, 'roles': 'subnet_gen_transm_distr', 'generators': 100, 'transmission_substations': 270, 'distribution_substations': 630

and

'name': 'B', 'model': 'barabasi_albert', 'm': 3, 'roles': 'relay_attached_controllers', 'controllers': 1, 'relays': 1999

By running batch_netw_creator.py to create the networks, there are some errors:

RT_nested_Smallworld network successfully created
Traceback (most recent call last):
  File "batch_netw_creator.py", line 213, in <module>
    nc.run(conf_fpath)
  File "C:\Users\s\TiedNets-master\netw_creator.py", line 1324, in run
    A.node[node]['role'] = preassigned_roles[node]
KeyError: u'instances_dir'

Could you please help?

The paths are changed as below.

batch_netw_creator.py:
base_dir = os.path.normpath('C:/Users/s/TiedNets-master/test/data')
a_opts['preassigned_roles_fpath'] = "C:/Users/s/TiedNets-master/test/config/batch_6.json"

multi_proc_runner.py: batch_conf_fpath = os.path.normpath('C:/Users/s/TiedNets-master/test/config/batch_6.json')

Could you please tell me whether it is wrong to use the same path for the two batch json files? Are they different?

AgostinoSturaro commented 3 years ago

Try using the create_nets_conf branch where the batch network creator configurations are in json format. https://github.com/TiedNets/TiedNets/tree/create_nets_conf

zzxx59342506 commented 3 years ago

OK thank you!

Hi, are you still focused on the project? I see you are interested in power grids and complex networks, which is similar to my research. Can we communicate through some channel, such as email, QQ or WeChat?

AgostinoSturaro commented 3 years ago

I'm still the maintainer. If you need explanations or fixes, you can write here first. Please open a new issue if needed, and provide the details.

zzxx59342506 commented 3 years ago

Thanks! I can create synthetic networks now. I see that 'batch_netw_creator.py' can generate about 10 instances, and all instances seem to follow the same rules. What's the purpose?

AgostinoSturaro commented 3 years ago

They should follow similar configurations, since it takes a batch configuration. Different seeds should produce different instances. Similar, but not the same. Otherwise there's an issue in the way you are using (or writing) your configuration files.

Please use this branch, it is more updated than the master branch. https://github.com/TiedNets/TiedNets/tree/create_nets_conf

It has the configurations for batch network creation https://github.com/TiedNets/TiedNets/tree/create_nets_conf/configs/create_nets

For example, this batch configuration file has a range of seeds to use to create networks. https://github.com/TiedNets/TiedNets/blob/create_nets_conf/configs/create_nets/conf_12b.json

  "seeds": {
    "pick": "range",
    "start": 64,
    "stop": 74
  },

If you still have difficulties, please provide more details.

We could schedule a chat this weekend, but try things first, and have an environment ready. The more details you can organize here, the easier it is to figure things out.

EDIT: to run batch_netw_creator.py, you can add this line at the end of it run("path to your configuration file") or you can import the module in your own script and call its run function from another file. I think you figured it out anyway.

zzxx59342506 commented 3 years ago

Thanks! I did use the branch 'create_nets_conf'. I imported the modules in my Jupyter notebook, and created the synthetic network successfully yesterday! And I ran the batch simulations in test_mp_12b/batch_6.json successfully last night!

I noticed that the configuration file used 10 instances created by batch_netw_creator.py:

    "first_instance": 0,
    "last_instance": 10,

Each simulation starts with a given number of initial dead nodes:

    "indep_var_vals": {
        "pick": "specified",
        "list_of_values": [0, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    },

And each value of indep_var_vals is run 100 times with different random seeds:

    "seeds": {
        "pick": "range",
        "start": 0,
        "stop": 100

10 multiplied by 16 multiplied by 100 = 16000! It takes 5 hours to finish the batch of simulations! So I wonder: is the purpose of such a large number of simulation results on different instances to analyze statistical error, or to generate the datasets for machine learning?

AgostinoSturaro commented 3 years ago

A single simulation is not multi-threaded, but you can run multiple simulations in parallel.

A batch is a set of similar simulations to be run one by one (in succession). You can run a single batch using batch_sim_runner_2.py.

But you can also run multiple batches using multi_proc_runner.py, that way you can parallelize. If you have n CPU cores, you can run 1 batch on each core.
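The idea of running one batch per core can be sketched as follows. This is not the code of multi_proc_runner.py: the function names are made up, and the only thing taken from the project is the argument order of batch_sim_runner_2.py (batch number first, then the path of the batch config).

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def build_batch_commands(batch_conf_fpaths, runner="batch_sim_runner_2.py"):
    """Build one command line per batch: python <runner> <batch_no> <conf path>."""
    return [[sys.executable, runner, str(batch_no), fpath]
            for batch_no, fpath in enumerate(batch_conf_fpaths)]


def run_in_parallel(commands, workers):
    """Run the commands as separate processes, at most `workers` at a time."""
    pool = ThreadPool(workers)  # threads only wait; the work is done by child processes
    try:
        return pool.map(subprocess.call, commands)  # list of exit codes
    finally:
        pool.close()
        pool.join()
```

With n CPU cores you would pass workers=n, so that each batch gets a core of its own.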

Please make sure you are using Python 2.7. I still need to make sure it works on Also, if you want to reproduce the same results I got, you need the same environment. That means older libraries as well, especially NetworkX (1.11 if I remember correctly).

To plot, you need to use ml_learner.py. It does 3 things: ML learning, ML prediction and plotting. The plotting is usually to show simulation results versus predictions of different trained ML models. It takes a single (giant) configuration file. You can likely skip the ML step and just plot simulation results.

All the scripts with "plot" in their name are outdated. They were a way to plot simulation results before I added the ML part. I should likely remove them and refactor ml_learner.py to split the plotting. Anyway ml_learner.py works fine. Use that and check the configurations in https://github.com/TiedNets/TiedNets/tree/create_nets_conf/configs/ml_and_plots

About the other question, let's try to understand why we need so many simulations. Let's read the most important part of this file https://github.com/TiedNets/TiedNets/blob/create_nets_conf/configs/run_sims/test_mp_12b/batch_6.json

This means we will be running simulations on all (network) instances from 0 to 9 (last_instance is excluded), so 10 instances.

    "first_instance": 0,
    "last_instance": 10,

This means we want to try how different numbers of attacks affect the chosen (network) instances. The (number of) attacks is our "independent variable". By the way, "attack" just means "initial failure". It could be a storm "attacking" the nodes. It's just to differentiate the first failures (attacks) from the later ones (cascades).

    "indep_var_name": "attacks",

This says how many attacks we want to try, each value listed (specified). This means we want to try different numbers of attacks. We are enumerating (specifying) the values of the independent variable, which are the numbers of attacks we want to try. Alternatively, we could have written a range. 0 attacks means we attack no node. If the network is stable, I expect 0 dead nodes at the end. If the network is not stable to begin with (some configurations are not), you will see some dead nodes at the end even without attacking anything. That's because the unsupported part of the network collapses by itself. That might cause a cascade or not. 10 attacks means we attack 10 nodes at the start of the simulation (not 10%). And so on.

    "indep_var_vals": {
        "pick": "specified",
        "list_of_values": [0, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    },

Now, skip the seeds section for a moment. For each number of attacks, we want to try attacking different sets of nodes each time. Even on the same network, attacking a different set of 2 nodes can have a different outcome. That's because if we attack 2 irrelevant nodes, we won't do much damage. If we want to do this randomly, we will need different seeds. We will try all seeds in a range from 0 to 100.

    "seeds": {
        "pick": "range",
        "start": 0,
        "stop": 100
    },

We can "attack"/"simulate the initial failure of" the power network (A), the telecom network (B), or both. This means we attack/fail nodes randomly on the power network (A).

    "base_configs": [{

        "run_opts": {
            "attacked_netw": "A",
            "attack_tactic": "random",

There are other values for "attack_tactic" too, to simulate targeted attacks or attacks using heuristics. They are there to show what a stroke of bad luck can look like. For example, if just one node fails, but it's the most important (according to a certain metric).

In this case, we have 100 simulations for each number of attacks (0, 10, ...) on the same (network) instance. That's 100 ways of picking a different set of n nodes to attack on the same network. When n=0, it's a corner case where we run the same simulation without need (no big deal). But when n=10, it's reasonable to pick 10 nodes in 100 different ways. We have to do that for 10 different networks, which is not that many.

The matter is that, as you noticed, multiplying a lot of small numbers together gives a big result pretty easily. ML needs a lot of data and Python is not that fast, so we need patience.
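To make the arithmetic concrete, the combinations can be enumerated with a few lines of Python. This sketch only mirrors the three ranges of the batch configuration, treating last_instance and the seeds "stop" value as exclusive. The order in which batch_sim_runner_2.py actually loops over them may differ.

```python
from itertools import product


def enumerate_simulations(batch_conf):
    """List the (instance, attacks, seed) combinations that a batch would run."""
    instances = range(batch_conf["first_instance"], batch_conf["last_instance"])
    attack_counts = batch_conf["indep_var_vals"]["list_of_values"]
    seeds = range(batch_conf["seeds"]["start"], batch_conf["seeds"]["stop"])
    return list(product(instances, attack_counts, seeds))
```

With the values of batch_6.json (10 instances, 16 attack counts, 100 seeds) the list has 10 x 16 x 100 = 16000 entries, which is the number mentioned in the question.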

The plotting is complex because you need to aggregate different numbers of attacks together, regardless of seed and network. There are also additional aggregations possible, which add layers of complexity (essentially more for loops).

BTW if you want to split a batch in 2 to parallelize simulations, remember to stitch the results back together. Look at ml_result_filter.py. Or just run your simulations by night.
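Stitching the results back together mostly means concatenating the .tsv files while keeping a single header. Here is a minimal sketch of that step. It is not taken from ml_result_filter.py, and the function name is made up.

```python
def merge_tsv_files(input_fpaths, output_fpath):
    """Concatenate .tsv files that share a header, writing the header only once."""
    header = None
    with open(output_fpath, "w") as out_file:
        for fpath in input_fpaths:
            with open(fpath) as in_file:
                first_line = in_file.readline().rstrip("\r\n")
                if header is None:
                    header = first_line
                    out_file.write(header + "\n")
                elif first_line != header:
                    raise ValueError("header mismatch in " + fpath)
                for line in in_file:
                    # normalize line endings so rows from different files never fuse
                    out_file.write(line.rstrip("\r\n") + "\n")
```

The header check is there to avoid silently mixing files produced with different column sets.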

Is it possible to create a tutorial of how to use the TiedNets? #3