networkx / nx-parallel

A networkx backend that uses joblib to run graph algorithms in parallel.
BSD 3-Clause "New" or "Revised" License

benchmarking infrastructure #24

Closed Schefflera-Arboricola closed 8 months ago

Schefflera-Arboricola commented 10 months ago

issue https://github.com/networkx/nx-parallel/issues/12

Here I have implemented a basic benchmarking infrastructure using ASV to compare the parallel and standard implementations of the betweenness_centrality function. I couldn't figure out how to run it for different numbers of nodes and edge probabilities; we would probably have to define our own benchmark classes for that.
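For reference, a minimal ASV benchmark comparing the two implementations could look roughly like the sketch below (an illustration of the approach, not the exact code in this PR; the class name and graph size are placeholders):

import networkx as nx
import nx_parallel as nxp


class BetweennessCentrality:
    # ASV calls setup() before timing each method below
    def setup(self):
        self.G = nx.fast_gnp_random_graph(300, 0.5, seed=42)

    def time_betweenness_centrality(self):
        # standard (sequential) networkx implementation
        nx.betweenness_centrality(self.G)

    def time_parallel_betweenness_centrality(self):
        # the ParallelGraph wrapper makes networkx dispatch to the nx-parallel backend
        nx.betweenness_centrality(nxp.ParallelGraph(self.G))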

The steps to run the benchmarks are in README.md (result screenshot below):

Screenshot 2023-10-24 at 10 09 47 PM

Please share your feedback on how to improve it.

Thank you :)

References: ASV docs, networkx benchmarking, numpy benchmarking

Schefflera-Arboricola commented 10 months ago

I wanted some guidance on how we should decide whether to include more parameters. I was thinking of including the following params and param_names (see the sketch after this list):

graph_type : ["undirected", "directed", "multigraph"]
is_weighted : ["weighted", "unweighted"]
graph_generation : ["fast_gnp_random_graph", "erdos_renyi_graph", "watts_strogatz_graph", "barabasi_albert_graph"]
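For illustration, these could be declared as ASV parameters roughly like this (a sketch only; the setup body is omitted and the values are the ones listed above):

class BetweennessCentrality:
    # ASV runs every combination of these parameter values
    params = [
        ["undirected", "directed", "multigraph"],
        ["weighted", "unweighted"],
        ["fast_gnp_random_graph", "erdos_renyi_graph",
         "watts_strogatz_graph", "barabasi_albert_graph"],
    ]
    param_names = ["graph_type", "is_weighted", "graph_generation"]

    def setup(self, graph_type, is_weighted, graph_generation):
        # build the requested graph here; ASV passes the same
        # parameter values to every timing method
        ...

    def time_betweenness_centrality(self, graph_type, is_weighted, graph_generation):
        ...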

But when should we include a new parameter: only when its timings differ significantly from those of the undirected, unweighted graph made by nx.fast_gnp_random_graph, or should we include different graph types (and other parameters) simply because the algorithm can accept them?

Thank you :)

[EDIT] I tried running benchmarks for graph ["directed", "undirected"] and is_weighted ["True", "False"] (I used random.random() to assign random weights to edges and G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=True) to create random directed graphs). There wasn't a lot of variation from the undirected, unweighted graph, at least for betweenness_centrality (as seen in the following screenshots):

Screenshot 2023-11-08 at 3 31 11 PM

negligible variation between directed and undirected graphs



Screenshot 2023-11-08 at 3 29 45 PM

negligible variation between weighted and unweighted graphs



Screenshot 2023-11-08 at 3 27 13 PM

all benchmarks
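For reference, the graph construction described in the [EDIT] above looks roughly like this (num_nodes and edge_prob are placeholders):

import random

import networkx as nx

num_nodes, edge_prob = 300, 0.5

# random directed graph
G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=True)

# assign a random weight to every edge
for u, v in G.edges:
    G[u][v]["weight"] = random.random()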

Schefflera-Arboricola commented 9 months ago

Replying to the "Some big picture questions" comment

Current structure of the benchmarks directory

Advantage of the above structure:


should we use a class here? Does asv require it? It seems like we are really just setting up a function (sorry if I am just forgetting what asv needs here...)

I have used classes just to keep it all organised.
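(For context: ASV does not require a class; a module-level function whose name starts with time_ is picked up as a benchmark on its own. A class mainly helps several timing methods share one setup() and one set of params, roughly as in this sketch:)

import networkx as nx


# a bare function works as an ASV benchmark by itself
def time_betweenness_centrality():
    nx.betweenness_centrality(nx.fast_gnp_random_graph(100, 0.5, seed=42))


# a class lets several timing methods reuse one setup()
class Benchmark:
    def setup(self):
        self.G = nx.fast_gnp_random_graph(100, 0.5, seed=42)

    def time_betweenness_centrality(self):
        nx.betweenness_centrality(self.G)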

Is there another scientific-python oriented repo that has an established directory structure for benchmarks that we could follow?

I will see if there is a better way of organizing things. Please let me know what you think of the current approach.

Thank you :)

Schefflera-Arboricola commented 9 months ago

@dschult I've made the suggested updates; please let me know if this looks good to you.

Schefflera-Arboricola commented 9 months ago

@dschult thank you for the detailed and insightful review!

Yeah, it's better to keep the graph creation and timing functions separate. I combined them only to make the code more compact, thinking that the graph-creation time would be included for both algo_types (but that wouldn't give correct timings for the functions, now that I think about it; thanks for pointing it out!). Also, there might be cases where we need different initial graphs for different functions in the same class; then I think we can use if conditions in the setup function.

About the base Benchmark class: I was referring to the scipy and numpy libraries, and they have a base class (with just a pass statement) for all benchmarks. But in pandas there is no common base class. So I decided to have a common base class for all the benchmark classes, in case we would like to add something to it in the future. Please let me know your thoughts on this.

Also, I agree that using backend="parallel" is better than H = nxp.ParallelGraph(G), because converting to a ParallelGraph does take some extra time (not a lot, but around 0.00000405... seconds). This way we also won't need the if algo_type == ... condition, and timing_func then becomes a one-line function, so we can get rid of it.

Also, I think it will be better to make it backends instead of algo_types.
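A sketch of what that could look like, parameterizing on the backend and using the backend= keyword (illustrative names, not the final code in this PR):

import networkx as nx


class BetweennessCentrality:
    params = [[None, "parallel"]]
    param_names = ["backend"]

    def setup(self, backend):
        self.G = nx.fast_gnp_random_graph(300, 0.5, seed=42)

    def time_betweenness_centrality(self, backend):
        if backend is None:
            nx.betweenness_centrality(self.G)
        else:
            # networkx dispatches this call to the nx-parallel backend
            nx.betweenness_centrality(self.G, backend=backend)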

Thank you :)

Schefflera-Arboricola commented 9 months ago

Also, I think we should keep the get_graph function, because then the graphs will be cached (thanks to @lru_cache(typed=True)), so it would take less time when we run all the benchmarks. Please let me know your thoughts on this :)
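For reference, the kind of cached helper being discussed looks roughly like this (the actual helper in this PR may differ):

from functools import lru_cache

import networkx as nx


@lru_cache(typed=True)
def get_graph(num_nodes, edge_prob, seed=42):
    # repeated calls with the same arguments return the already-built
    # graph from the cache instead of generating it again
    return nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=seed)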

dschult commented 9 months ago

I expect that we will need many different types of graphs for the algorithms, so get_graph may not get used very often. At least that is what happens with tests and pytest: we create a test class and add some graphs that could be used by lots of tests, but then they end up being used by only a couple of tests, and the rest construct their own graphs. So we may want to create a different setup function for many of the timing functions. I think we'll have to wait and see what is most useful. Keeping it as a common function is fine with me. Could we name it something like cached_gnp_random_graph?

Does caching work well with floating-point inputs like p? Nice binary values like 0.25 should work, but round-off can occur with, e.g., 0.3. I guess if the round-off is the same for each call of the function we're OK.

In general, caching is nice for parts that either take a lot of time or get repeated many times. We'll see, as we get more examples, whether the timing is taking a long time and how much of that time goes into constructing the graphs. Let's leave it in for now.

Schefflera-Arboricola commented 9 months ago

Does caching work well with floating-point inputs like p? Nice binary values like 0.25 should work, but round-off can occur with, e.g., 0.3. I guess if the round-off is the same for each call of the function we're OK.

If the combination of arguments is the same, then the function is not re-executed and the graph is returned from the cache; this is independent of the type of the argument.

dschult commented 9 months ago

Ahh.. but are 0.3 and 0.29999999999999 the same? How does the caching software determine that the values are the same? It looks like lru_cache doesn't reliably cache floating-point inputs (due to round-off) unless the value is created in the same way each time. I think this will work for us as long as the value of e.g. edge_prob is not recalculated a different way (e.g. via np.arange(0.1, 1.0, 0.1)):

import functools

@functools.lru_cache
def f(x):
    print(x)

f.cache_info()
# CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)

f(3/10)
# 0.3

f(1/10*3)
# 0.30000000000000004                (round-off)

f.cache_info()
# CacheInfo(hits=0, misses=2, maxsize=128, currsize=2)    <-- 3/10 and 1/10*3 are cached as different results

Schefflera-Arboricola commented 8 months ago

I think it's ready. Just made a few changes in README :)