Closed: erlebach closed this issue 4 years ago.
I also wish to create much smaller meta_graphs g[i] of different types to be embedded into G
I don't know what this means, but
These MetaGraphs have node labels n[i], whose values might be greater than nv(g[i]).
this is almost certainly the wrong way to go about doing it.
I would like to find a method that would allow me to preallocate the largest sparse array I expect to need for my MetaGraphs and reuse it.
MetaGraphs don't use sparse arrays. They use SimpleGraphs under the hood, which use adjacency lists.
PS: I don't have the cycles right now to evaluate that code - it's far from minimal. I might have some time in the next couple of weeks, or someone else might chime in here. Sorry - crunch time at work.
PPS: also, `@elapsed` isn't sufficient here. Part of providing a MWE is being able to run real benchmarks on it.
Also, quickly:
It seems that the merging becomes increasingly expensive as I add more edges. Why is this?
It shouldn't be too much more expensive, but it should generally grow as the log of the max degree of the graph. You might also be allocating eprops for each new edge, but I can't imagine that it's a significant time sink. Can you show me some minimal code that demonstrates clearly that the size of the graph impacts the performance of `add_edge!`?
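The "log of the max degree" cost can be sketched in plain Julia (the function name here is ours, not SimpleGraphs' internal API): each vertex's neighbor list is kept sorted, so adding an edge is a binary search over that list plus a vector insert.

```julia
# Sketch of sorted-adjacency insertion, assuming the neighbor list is sorted.
function insert_neighbor!(neighbors::Vector{Int}, v::Int)
    i = searchsortedfirst(neighbors, v)      # binary search: O(log degree)
    if i <= length(neighbors) && neighbors[i] == v
        return false                         # edge already present
    end
    insert!(neighbors, i, v)                 # shifts the tail of the vector
    return true
end

nbrs = [2, 5, 9]
insert_neighbor!(nbrs, 7)   # nbrs is now [2, 5, 7, 9]
```

The search term grows with the degree of the vertex, which is why edge insertion gets slightly (logarithmically) more expensive as a vertex accumulates neighbors.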
How do you check whether a vertex is within bounds?
For SimpleGraphs and MetaGraphs, this is an O(1) check: `vertices(g)` is a unit range, and membership within it is two simple comparisons: `1 <= v <= range[end]`.
I would argue that a version should exist where we can assume the edge vertices are in bounds, since you go through the trouble of using `@inbounds`.
We use `@inbounds` precisely BECAUSE we've done the check and can guarantee that the subsequent index will not produce illegal memory accesses. The check is O(1) and is likely inlined to boot. It's not the source of your performance problems:
foo(g::AbstractGraph, n::Integer) = n in vertices(g)
julia> @code_native foo(g, 4)
.text
; ┌ @ REPL[11]:1 within `foo'
; │┌ @ SimpleTraits.jl:338 within `vertices' @ SimpleTraits.jl:331
; ││┌ @ interface.jl:206 within `macro expansion'
; │││┌ @ SimpleGraphsCore.jl:39 within `nv'
; ││││┌ @ SimpleGraphsCore.jl:59 within `fadj'
; │││││┌ @ REPL[11]:1 within `getproperty'
movq 8(%rdi), %rax
; ││││└└
; ││││┌ @ array.jl:221 within `length'
movq 8(%rax), %rax
; │││└└
; │││┌ @ range.jl:320 within `OneTo' @ range.jl:311
; ││││┌ @ promotion.jl:409 within `max'
movq %rax, %rcx
sarq $63, %rcx
andnq %rax, %rcx, %rax
; │└└└└
; │┌ @ range.jl:997 within `in'
; ││┌ @ int.jl:410 within `<='
testq %rsi, %rsi
setg %cl
cmpq %rsi, %rax
setge %al
; ││└
; ││┌ @ bool.jl:40 within `&'
andb %cl, %al
; │└└
retq
nopw %cs:(%rax,%rax)
; └
Thanks. But quickly: when I run small examples with my code, I do not see the large discrepancies I am reporting. I fully understand that I did not provide a MWE according to normal standards, but it was the best I could do at this time.
Ok, I am obviously doing something with my code that creates time anomalies, and for all I know, it has nothing to do with any of your libraries. I am new to Julia after all. But not to coding.
Here is a minimal example. See if you can run this without changing any parameters. Notice the loop 1:100. If I change this to 1:10, all is fine. Change to 1:2, all is better.
using LightGraphs, MetaGraphs, StatsBase  # watts_strogatz, MetaGraph, sample

function test()
    master_size = 200_000
    master_graph = MetaGraph(master_size)
    sz = 10000
    local mgh  # hoisted so the last graph is visible after the loop
    @time for m in 1:100
        println(m)
        mgh = watts_strogatz(sz, 10, 0.3)
        nodes = sample(1:master_size, sz, replace=false)
        weight = 0.3
        for (i, e) in enumerate(edges(mgh))
            s, d = src(e), dst(e)
            add_edge!(master_graph, nodes[s], nodes[d], :weights, weight)
        end
    end
    println(ne(master_graph))
    return master_graph, mgh
end

mg, mgh = test()
My concern here is several-fold:
1) Why are you timing a random process? There's too much variability in watts_strogatz to infer anything about the results.
2) you're timing things that don't matter: the println, the sample
3) why are you enumerating the edges when you don't do anything with `i`?
4) I'll see if I can work on this tonight.
Here is a better MWE. Written before I saw your comments (I thought you had no time :-) ). Yes, I also have other work to do. :-) The test below tells you what I wish to do.
using LightGraphs, MetaGraphs, StatsBase  # watts_strogatz, MetaGraph, sample!

function test_2()
    master_size = 200_000
    master_graph = MetaGraph(master_size)
    sz = 1000
    mgh = watts_strogatz(sz, 10, 0.3)
    weight = 0.3
    nodes = zeros(Int64, sz)
    for m in 1:1000
        sample!(1:master_size, nodes, replace=false)
        @time for (i, e) in enumerate(edges(mgh))
            s, d = src(e), dst(e)
            add_edge!(master_graph, nodes[s], nodes[d], :weights, weight)
        end
    end
    println("--------------------")
    println("nb edges in master_graph: ", ne(master_graph))
    return master_graph, mgh
end

mg, mgh = test_2()
Look at the garbage collection. This should go extremely fast, but `add_edge!` slows down. Here is a sample of the times I see. Most Julia programmers do not like to use `@time`, but I do not agree with that reticence. The second time I run this test, the module is compiled, so the timing is reliable. What determines the point at which GC is invoked? (Yes, that is too low level, and I really do not care.)
Here is a sample of the times:
0.008210 seconds (51.21 k allocations: 6.258 MiB)
0.007726 seconds (51.23 k allocations: 6.264 MiB)
0.007855 seconds (51.31 k allocations: 6.266 MiB)
0.007544 seconds (51.25 k allocations: 6.274 MiB)
0.008105 seconds (51.22 k allocations: 6.258 MiB)
0.008023 seconds (51.17 k allocations: 6.272 MiB)
0.007906 seconds (51.22 k allocations: 6.262 MiB)
0.008032 seconds (51.17 k allocations: 6.267 MiB)
0.007932 seconds (51.20 k allocations: 6.266 MiB)
0.008175 seconds (51.25 k allocations: 6.279 MiB)
Here is a third test, pared down some more:
function test_3()
    master_graph = MetaGraph(master_size)
    weight = 0.3
    #@time for i in 1:sz
    @time for i in 1:1_000_000
        s, d = nodes_src[i], nodes_dst[i]
        add_edge!(master_graph, s, d, :weights, weight)
    end
    println("--------------------")
    println("nb edges in master_graph: ", ne(master_graph))
    return master_graph
end

test_3()
It takes 20 sec to add a million edges. Is that reasonable to you? It takes 0.014 sec to add 10_000 edges. If the code scaled correctly, it should take 1.4 seconds to add a million edges. What happens when you run this code?
I can't run this. Where are you defining `nodes_src` and `nodes_dst`?
I'm not getting your results:
julia> function doit!(mg, ns, nd)
           for i = 1:length(ns)
               s, d = ns[i], nd[i]
               add_edge!(mg, s, d, :weights, 0.3)
           end
           return mg
       end
doit! (generic function with 1 method)
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 1_000_000); nd=rand(1:200_000, 1_000_000))
BenchmarkTools.Trial:
memory estimate: 1.32 GiB
allocs estimate: 10546722
--------------
minimum time: 1.838 s (0.00% GC)
median time: 2.582 s (31.07% GC)
mean time: 2.582 s (31.07% GC)
maximum time: 3.326 s (48.24% GC)
--------------
samples: 2
evals/sample: 1
I think part of your problem is that you're using global variables. Don't do that; the compiler can't guarantee that they won't change type in the middle of the loop: https://docs.julialang.org/en/v1/manual/performance-tips/index.html#Avoid-global-variables-1
My apologies
using LightGraphs, MetaGraphs, StatsBase  # MetaGraph, sample!

const master_size = 200_000
const sz = 1_000_000
const nodes_src = zeros(Int64, sz)
const nodes_dst = zeros(Int64, sz)
sample!(1:master_size, nodes_src, replace=true)
sample!(1:master_size, nodes_dst, replace=true)

function test_3()
    master_graph = MetaGraph(master_size)
    weight = 0.3
    #@time for i in 1:sz
    @time for i in 1:1_000_000
        s, d = nodes_src[i], nodes_dst[i]
        add_edge!(master_graph, s, d, :weights, weight)
    end
    println("--------------------")
    println("nb edges in master_graph: ", ne(master_graph))
    return master_graph
end

test_3()
My original codes were not using global variables. My latest tests were because I was continuously simplifying.
I do not understand the construction of your last benchmark. How does `setup` work? You are using it to set the values of arguments. I will look it up.
I can duplicate your results on your example, written in a form I understand:
function doit!(mg, ns, nd)
    for i = 1:length(ns)
        s, d = ns[i], nd[i]
        add_edge!(mg, s, d, :weights, 0.3)
    end
    return mg
end

function test_code()
    mg = MetaGraph(200_000)
    ns = rand(1:200_000, 1_000_000)
    nd = rand(1:200_000, 1_000_000)
    @time doit!(mg, ns, nd)
end

test_code()
3.701726 seconds (10.55 M allocations: 1.316 GiB, 56.05% gc time)
3.021390 seconds (10.55 M allocations: 1.316 GiB, 46.68% gc time)
2.758614 seconds (10.55 M allocations: 1.316 GiB, 42.58% gc time)
You might be right about the global variables, although I thought I was paying attention. I really do not like that aspect of Julia, although I am coming to appreciate its other qualities.
Now let us answer your previous questions:
My concern here is several-fold:
Why are you timing a random process? There's too much variability in watts_strogatz to infer anything about the results. ANSW: my use case will have that random element in it. I will be creating about 10,000 small graphs, whose edges will be incorporated into the larger MetaGraph. Am I making sense? My examples were only illustrative.
you're timing things that don't matter: the println, the sample. ANSW: I agree. But that will not affect the timings. One should be able to distinguish O(epsilon) versus O(1) effects. I understand what you mean though. But it is real world situations I am interested in, not ideal cases.
why are you enumerating the edges when you don't do anything with `i`? ANSW: I do not do anything with it because I removed the parts of the code that were not considered part of the MWE; I have a more complete version already. I enumerate the edges of the smaller graph and associate them with node numbers in the larger graph, after which I fill the larger graph.
I think you have helped me enough unless you have personal curiosity. I do not wish to abuse your time!
I'm happy to continue helping, though I am still frustrated by the lack of rigor in your benchmarks. Please - don't use `@time` or `@elapsed` - make proper benchmarks that are statistically sound and reproducible. That's what I did with `doit!()` above; and yes - the `setup=()` does what you think - it initializes these (local) variables prior to every call of the function, and does not count the setup in the reported times.
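To illustrate why `setup=` matters for a mutating function like `doit!`, here is a minimal Base-Julia mock (the helper name and structure are ours; BenchmarkTools does considerably more, such as tuning evaluation counts and collecting full statistics). Without a fresh graph per sample, later samples would time insertion into an already-filled structure.

```julia
# Re-initialize state before each timed run, excluding setup from the timing.
function bench_with_setup(f, setup; samples=3)
    times = Float64[]
    for _ in 1:samples
        state = setup()                     # fresh state, NOT counted below
        t0 = time_ns()
        f(state)
        push!(times, (time_ns() - t0) / 1e9)
    end
    return minimum(times)                   # minimum is the most stable statistic
end

# toy workload: append 10_000 items into a fresh vector each sample
t = bench_with_setup(v -> append!(v, 1:10_000), () -> Int[])
```

Reporting the minimum rather than the mean is the same design choice BenchmarkTools makes: it is the sample least perturbed by GC and OS noise.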
You are currently the beneficiary of two cancelled meetings :)
Ok. I will use `@benchmark`. In your case, it only ran twice. Is that really different than running `@time` twice? If a benchmark is very fast and slows down in a real-world case, is the benchmark still useful? It was a rhetorical question.
Is that really different than running `@time` twice?
Yes. It's equivalent, I think, to running `@time` a few times to get any caching warmed up, and then running twice, collecting the results each time while minimizing the impact of global variables and other things that `@time` is not aware of.
There's a reason that BenchmarkTools exists, and for those of us who were around in the dark times before it, there's no way we'd go back. (Yes, argument from authority, but indulge me.)
(PS: I make this mistake occasionally too, but if you use `@foo` in a GitHub comment, you should enclose it in backticks, otherwise the GitHub user whose name is "foo" gets pinged.)
Also, I know you asked this rhetorically, but I feel compelled to respond:
If a benchmark is very fast and slows down in a real world case, is the benchmark still useful?
When someone can demonstrate a real-world example of this inconsistency happening, the BenchmarkTools folks would love to hear about it.
Try your `doit!` with a million, 100_000, and 10_000 count for `nd` and `ns`. Do the results make sense to you? Yes, the cost might be logarithmic with respect to the total edge count. A factor-10 increase in the number of edges should then correspond to an increase in execution time of about 23x (~10 ln(10)). That seems right.
I would like to provide a MWE that is closer to the code I am executing but much simpler than what I provided before. But honestly, it does seem that you made your point. I do not know why I get the huge times, except that the excessive memory allocations due to the graphs I generate activate garbage collection, slowing things down.
Let me ask: I have a loop 10,000 iterations long, within which I create a watts_strogatz graph of different sizes. I would like to collect the edges of this graph. How can I do this without incurring the memory allocations? I'd like to preallocate memory, generate smaller Strogatz graphs within this memory, collect all the edges, and then blast them into the master_graph. The fewer the memory allocations, the faster the code. I could do that of course by bypassing LightGraphs, but I would rather not do that.
Looks linear to me:
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 1_000_000); nd=rand(1:200_000, 1_000_000))
BenchmarkTools.Trial:
memory estimate: 1.32 GiB
allocs estimate: 10546889
--------------
minimum time: 1.809 s (0.00% GC)
median time: 2.634 s (32.47% GC)
mean time: 2.634 s (32.47% GC)
maximum time: 3.460 s (49.45% GC)
--------------
samples: 2
evals/sample: 1
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 100_000); nd=rand(1:200_000, 100_000))
BenchmarkTools.Trial:
memory estimate: 134.71 MiB
allocs estimate: 1126968
--------------
minimum time: 137.399 ms (0.00% GC)
median time: 203.844 ms (38.07% GC)
mean time: 223.100 ms (42.36% GC)
maximum time: 491.767 ms (70.37% GC)
--------------
samples: 22
evals/sample: 1
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 10_000); nd=rand(1:200_000, 10_000))
BenchmarkTools.Trial:
memory estimate: 13.44 MiB
allocs estimate: 118975
--------------
minimum time: 9.659 ms (0.00% GC)
median time: 10.916 ms (0.00% GC)
mean time: 14.703 ms (26.59% GC)
maximum time: 122.891 ms (89.76% GC)
--------------
samples: 223
evals/sample: 1
I have a loop 10,000 long, within which I create a watts_strogatz graph of different sizes. I would like to collect the edges of this graph. How can I do this without incurring the memory allocations.
What memory allocations? You will be allocating memory when you create the graph, and whenever you add an edge to the graph. You should not allocate memory when iterating over the edges of the graph, but you will allocate memory if you `collect` them.
I'd like to preallocate memory, generate smaller Strogatz graphs within this memory, collect all the edges, and then blast them into the master_graph.
This won't help as much as you think. `collect` allocates. But in any case, there's no way to preallocate memory for a graph structure. Think about what you know about how SimpleGraphs are stored: that is, as a vector of vectors of neighbors. How do you preallocate those vectors unless you know exactly how many elements each has? (And if you have that, then just do it manually and pass the vector of vectors into SimpleGraphs as the `fadjlist`. But this is unsupported.)
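The storage argument can be made concrete with a Base-only sketch of the layout described (a vector of sorted neighbor vectors, as in a SimpleGraph's `fadjlist`; the function name is ours). Exact preallocation is only possible with a first pass that counts every vertex's degree:

```julia
# Build a forward adjacency list for an undirected graph on n vertices.
function build_fadjlist(n::Int, edge_list::Vector{Tuple{Int,Int}})
    deg = zeros(Int, n)
    for (s, d) in edge_list
        deg[s] += 1; deg[d] += 1
    end
    # exact preallocation: requires knowing each degree in advance
    fadj = [Vector{Int}(undef, deg[v]) for v in 1:n]
    pos = ones(Int, n)
    for (s, d) in edge_list
        fadj[s][pos[s]] = d; pos[s] += 1
        fadj[d][pos[d]] = s; pos[d] += 1
    end
    foreach(sort!, fadj)   # neighbor lists are kept sorted
    return fadj
end

fadj = build_fadjlist(4, [(1, 2), (1, 3), (2, 3)])
```

This illustrates why there is no generic "clear and reuse" operation: the per-vertex vectors are sized by the degree sequence, which differs for every graph.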
Thank you Seth. I think I have all I need to solve my problems. Unless I can come up with a simple example that duplicates what I measured during my development, I do not think we can progress further.
Your timings are similar to mine, but do not go up quite as sharply when the number of edges increases. I really appreciate the time you took to help.
Here are my timings on the `doit!` benchmark (the times decrease by factors of 15 and 18; in your benchmarks, the respective factors are 13 and 14):
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 1_000_000); nd=rand(1:200_000, 1_000_000))
BenchmarkTools.Trial:
memory estimate: 1.32 GiB
allocs estimate: 10546944
--------------
minimum time: 2.564 s (38.65% GC)
median time: 2.587 s (39.04% GC)
mean time: 2.587 s (39.04% GC)
maximum time: 2.610 s (39.42% GC)
--------------
samples: 2
evals/sample: 1
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 100_000); nd=rand(1:200_000, 100_000))
BenchmarkTools.Trial:
memory estimate: 134.71 MiB
allocs estimate: 1126894
--------------
minimum time: 165.497 ms (24.21% GC)
median time: 206.858 ms (28.45% GC)
mean time: 216.699 ms (38.43% GC)
maximum time: 353.596 ms (64.76% GC)
--------------
samples: 22
evals/sample: 1
julia> @benchmark doit!(mg, ns, nd) setup=(mg = MetaGraph(200_000); ns=rand(1:200_000, 10_000); nd=rand(1:200_000, 10_000))
BenchmarkTools.Trial:
memory estimate: 13.44 MiB
allocs estimate: 118991
--------------
minimum time: 9.894 ms (0.00% GC)
median time: 11.511 ms (0.00% GC)
mean time: 13.937 ms (18.32% GC)
maximum time: 39.215 ms (68.34% GC)
--------------
samples: 195
evals/sample: 1
I saved time in the following way. Here is a code fragment, which reduces memory allocations. This loop is entered thousands of times: a SimpleGraph `mgh` is created and embedded into `master_graph` as you do in `doit!()`. I think I now have a handle on code performance, although I still do not understand the severe drop when I use the MetaGraph approach in the code fragment. My guess: the additional garbage collection is detrimental.
if sz > 1
    #println("Metagraph")
    begin
        # very efficient approach (mgh is a SimpleGraph)
        myMerge!(master_graph, mgh, person_ids, weight)
        # inefficient approach
        #mg = MetaGraph(mgh, weight)
        #set_prop!(mg, :nodes, person_ids)
        #myMerge!(master_graph, mg, weight) # mg is a MetaGraph
    end
end
I resolved the issue. The watts_strogatz method does not return for certain values of the parameters. I submitted a bug report. While you do emit an assertion error when the second argument is greater than or equal to the first, that is not sufficient: the second argument must be "sufficiently smaller", or the third argument must be "small enough". Why this problem occurs, I do not know. I have not checked the source code.
I left the explanation in the comments of the other issue. It's known behavior for iterative random graph generators.
I no longer see the other issue. I figured the issue must be known. But why isn't there some kind of assert error or some other notice? Can't the conditions for this issue be detected? After all, you do have an assert when the second argument is larger than the first. Is this a bug or a feature of the algorithm?
At what point would you assert? The issue is that the edges are selected randomly but they're selected also based on the state of the graph at the time. Most random graph generators that select edges iteratively do not handle creation of dense graphs in any efficient manner. But you might get lucky and pick that last remaining edge right away, or you might be unlucky at the very beginning and keep picking edges that already exist.
The alternative, to sample without replacement from a set of unused edges, would require V^2 memory and wouldn't scale for any decently-sized graph. We tend to avoid such algorithms in LG since our focus is on efficient algorithms at scale.
The Erdos-Renyi (modified) generator that we use for `SimpleGraph(m, n)` does something different: if the density exceeds 0.66, it creates the complement graph with `(m, 2^m - n - 1)` and then inverts it prior to returning to the user. But that's only because with E-R, the next edge doesn't depend on understanding the state of the entire system.
Ok, got it. I really appreciate all the help. My code now runs quite efficiently, now that I understand the constraints. Using watts-strogatz always has a chance of stalling, correct? There are no parameter values below which it is guaranteed (in the mathematical sense) to perform? Once you reply to this last question, I think we can close this thread.
Using watts-strogatz always has a chance of stalling, correct?
Yes. The edges are sampled uniformly at random, with duplicates rejected, so consider the (unlikely) case where you keep picking the same edge for your first selection. The construction will never terminate.
Now consider the case where you've asked for a large dense graph, and you're almost done. The probability that you will pick an edge you've already added approaches 1, yet you still haven't gotten all the edges you need. So you have to keep picking until you've gotten enough new random edges to satisfy your parameter selection. This could also never complete in a reasonable timeframe.
In general, unless otherwise specified, assume that any random graph generator is impacted by the parameters controlling its density.
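The retry blow-up described above can be simulated in a few lines of Base Julia. This is an abstraction of any such generator, not watts_strogatz itself: we draw edge ids uniformly with replacement and keep the distinct ones, mirroring the rejection behavior.

```julia
# Count how many uniform draws are needed to collect `wanted` distinct edge
# ids out of `total_edges` possible ones. Requires wanted <= total_edges,
# otherwise the loop cannot terminate (the pathology discussed above).
function draws_needed(total_edges::Int, wanted::Int)
    chosen = Set{Int}()
    draws = 0
    while length(chosen) < wanted
        draws += 1
        push!(chosen, rand(1:total_edges))   # duplicates are silently rejected
    end
    return draws
end

sparse_draws = draws_needed(1000, 100)   # few rejections, close to 100 draws
dense_draws  = draws_needed(1000, 990)   # coupon-collector regime, many more
```

As `wanted` approaches `total_edges`, the probability of drawing a new edge goes to zero, so the expected draw count grows sharply (the coupon-collector effect), which is exactly why dense requests can stall.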
There are no parameter values below which it is guaranteed (in the mathematical sense) to perform?
Yes, there is, but it's a pathological case and I include it only for completeness: if you use a random graph generator to create a zero-edge graph, it should be guaranteed to return (immediately, in most cases).
Actually, with Watts-Strogatz it's a little different, but the difference isn't really meaningful. The core issue - that adding edges iteratively in a random process is not guaranteed to terminate - still exists.
Thanks. The explanation is very clear. I have to wonder if solutions can be found using permutations in some way. Nothing comes to mind, at least not if minimal cost is important, which it is.
Yup, that's exactly the dilemma. All the permutation functions I know of require an instantiated set, which in this case would be the edge set, which makes it memory-infeasible for large graphs.
I'm glad this is helpful. I've found that optimal graph algorithm efficiency is really tied to specific use cases; that's why we have so many different graph types with different backends (and different limitations).
In an effort to keep the issues list current and compact, I'll go ahead and close this out. @erlebach , if you have any other questions or feedback, please feel free to open up new issues as appropriate. Thanks.
Suppose I have an empty MetaGraph `G` of size `N = 200_000`. I also wish to create much smaller meta_graphs `g[i]` of different types to be embedded into `G`. These MetaGraphs have node labels `n[i]`, whose values might be greater than `nv(g[i])`. So these values `n[i]` are stored as graph metadata of `g[i]`. I loop through the smaller graphs and update `G` with their weight and edge values, using the node metadata to specify edge sources and destinations. Of course, each MetaGraph is instantiated from a SimpleGraph, so there is associated memory allocation, which then is garbage collected.
I would like to find a method that would allow me to preallocate the largest sparse array I expect to need for my MetaGraphs and reuse it. That would require a clearing operation for the sparse matrix, and maintaining the required size (equal to the number of nodes), which would be different than the maximum size, which is the size of the maximum MetaGraph. Am I making any sense?
I have created a minimal working code. Perhaps I have a serious misunderstanding. Very likely.
Some notes: 1) I am using `@elapsed` so that I can print running times the way I want. 2) Here are my timings the second time I make the run. Notice the amount of time spent in `myMerge!`. It seems that the merging becomes increasingly expensive as I add more edges. Why is this? How do you check whether a vertex is within bounds? I would argue that a version should exist where we can assume the edge vertices are in bounds, since you go through the trouble of using `@inbounds`.
One approach would be to allocate a large static graph, insert the edges into the static graph and when complete, transfer the static graph to a Metagraph.
Here is what I will try: accumulate the edges into my own list, and create my own add_edges routines without the checks. I believe you could add a routine called add_edges(), which would add them in bulk. Then you only have to check whether the min and max edge vertex is within bounds as opposed to each vertex. I will let you know how it goes.
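The bulk-check idea could look like the sketch below. The names are hypothetical: neither `add_edges` nor this helper exists in LightGraphs. The point is that one extrema pass over the endpoint arrays replaces the per-edge bounds checks.

```julia
# Check all edge endpoints against the vertex range [1, n] in a single pass,
# instead of checking each edge individually inside add_edge!.
function edges_in_bounds(n::Int, srcs::Vector{Int}, dsts::Vector{Int})
    isempty(srcs) && return true
    lo = min(minimum(srcs), minimum(dsts))
    hi = max(maximum(srcs), maximum(dsts))
    return 1 <= lo && hi <= n
end

edges_in_bounds(10, [1, 2, 3], [4, 5, 6])   # all endpoints within 1:10
```

A hypothetical `add_edges!` built on this would validate once, then insert every edge with the checks skipped; as discussed above, though, the per-edge check is O(1) and inlined, so the savings would likely be small compared to the allocation costs.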