Closed by txfs19260817 4 years ago
Hi, thanks for your interest in our work!
Based on the information you provided, we cannot tell exactly what is going on, but I suggest the following steps to figure out what happened.
Double-check the number of physical cores on your machine. We tested up to 56 threads because our server actually has 56 physical cores (with hyper-threading on top of that). Your machine almost certainly does not have that many cores; my MacBook Pro, for example, only has 4. If you assign more threads than there are physical cores, the resulting context switching will hurt performance, which I think explains the drop you observed from 4 threads to 56 threads. So, if your machine has 4 cores, testing thread counts from 1 to 4 makes sense.
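As a quick sanity check, you can query the core count before choosing a thread count. A minimal sketch in Python; note that `os.cpu_count()` reports *logical* cores, so on a hyper-threaded machine the physical count is typically about half (on Linux, `lscpu` reports the exact physical-core count):

```python
import os

# Logical cores, which include hyper-threads.
logical = os.cpu_count()

# Rough estimate only: assumes 2-way hyper-threading.
physical_estimate = max(1, logical // 2)

print(f"logical cores: {logical}")
print(f"rough physical-core estimate: {physical_estimate}")
```

Picking a thread count no larger than the physical-core count avoids the context-switching overhead described above.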
Memory is another issue. The graph in CSR representation occupies roughly 8(|V| + |E|) bytes, and the intermediate data structures occupy several further multiples of 4|V| bytes. So I would suggest you monitor the available memory (it may be much smaller than your 16 GB total) before loading the graph, and watch how it changes during runtime.
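A back-of-the-envelope footprint estimate can be sketched like this (the vertex/edge counts are hypothetical, and the `k_intermediate` multiplier for intermediate structures is an assumption, not a measured constant):

```python
def csr_memory_bytes(num_vertices, num_edges, k_intermediate=4):
    """Rough footprint: 8(|V| + |E|) bytes for the CSR graph itself,
    plus k multiples of 4|V| bytes for intermediate structures."""
    graph = 8 * (num_vertices + num_edges)
    intermediate = k_intermediate * 4 * num_vertices
    return graph + intermediate

# Example: ~10M vertices, ~100M edges (hypothetical sizes)
v, e = 10_000_000, 100_000_000
print(f"estimated footprint: {csr_memory_bytes(v, e) / 2**30:.2f} GiB")
```

If the estimate approaches the memory actually available on your machine, swapping could distort the timing results far more than the thread count does.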
Could you test other graphs and see if the same observation holds? We have tested many standard real-world graphs collected from the KONECT and SNAP projects, which you can find in our paper. I would suggest you start with smaller graphs (e.g., around or below 1 million nodes).
There are various flags in the code, such as `debug` and `verbose`; you can play with them to see what is going on inside the code at runtime.
You are welcome to post more! Good luck!
Best, Yuede
Hi, thank you for your advice. I have run further tests on other graph datasets and on a modified version of my graph, using my 6-core laptop. The results are shown below.
#Threads | LiveJournal | Baidu | DBPedia |
---|---|---|---|
1 | 274.3ms | 149.9ms | 633.7ms |
2 | 209.3ms | 96.6ms | 422.1ms |
4 | 162.8ms | 65.4ms | 306.1ms |
8 | 144.3ms | 52.7ms | 236.5ms |
12 | 143.0ms | 53.2ms | 227.0ms |
16 | 151.3ms | 59.2ms | 245.0ms |
First, the results for the LiveJournal, Baidu, and DBPedia graphs show that performance peaks at a certain thread count (around 8 to 12 here) and degrades when more threads are added.
#Threads | Modified my-graph |
---|---|
1 | 47.3s |
2 | 26.7s |
4 | 20.0s |
8 | 11.6s |
12 | 11.9s |
16 | 15.1s |
Next, I modified my graph so that the largest SCC shrank from over 93% of all vertices to 75%. With this version, the multi-threaded program outperforms the single-threaded one.
In addition, I found that the average time becomes longer and that the WCC phase now dominates the total time, rather than the latency of FW-BW on the largest SCC. This is perhaps because the number of relatively large SCCs has increased.
To sum up, I suspect that running with fewer threads can be faster when most of the graph's vertices belong to the largest SCC.
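This matches the intuition from Amdahl's law: if FW-BW on the giant SCC behaves as a mostly serial portion of the run, speedup saturates quickly no matter how many threads are added. A hedged illustration (treating the 93% and 75% SCC fractions as serial fractions is an assumption for illustration, not a measurement):

```python
def amdahl_speedup(serial_fraction, threads):
    """Amdahl's law: max speedup given a fixed serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)

# Hypothetical serial fractions mirroring the two graph versions.
for s in (0.93, 0.75):
    speedups = [amdahl_speedup(s, t) for t in (1, 2, 4, 8, 16)]
    print(s, [round(x, 2) for x in speedups])
```

With a 93% serial fraction the speedup at 16 threads stays barely above 1x, while at 75% there is still visible (if limited) headroom, which is qualitatively consistent with the two tables above.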
Here is a chart illustrating the relationship between SCC size and the number of SCCs of that size. Versions 1 and 2 are the graph before and after modification, respectively. (The largest SCC of each version has been removed to keep the chart readable.)
I think you now have reasonable results. Since your laptop has 6 cores, testing thread counts from 1 to 6 makes sense. As for your chart, you could keep the largest SCC by using a log scale on both the x- and y-axes.
Do you have any more questions? Yuede
No, I don't have any, thank you for your assistance!
Hi, thanks for your great work. I managed to run the program on my laptop with my own graph dataset, which contains more than 10 million nodes, over 90% of which belong to a single huge SCC. Surprisingly, I found that if I decrease the `thread` parameter from the default 56 to a lower value, even 1, while keeping the other parameters unchanged, the average time consumption tends to be shorter. I was wondering if you could explain this phenomenon? Thanks! My env: