networkit / networkit

NetworKit is a growing open-source toolkit for large-scale network analysis.
https://networkit.github.io
MIT License
757 stars 224 forks source link

Triangle Counting #422

Closed csverma610 closed 4 years ago

csverma610 commented 4 years ago

Hello,

Can someone explain how to count triangles from the EdgeScore?

Thanks

hmeyerhenke commented 4 years ago

Please take a look at LocalClusteringCoefficient.cpp in module/folder centrality. The run() method counts triangles (but also does other stuff).

angriman commented 4 years ago

AFAICS there is no way to retrieve the number of triangles from Python. Some triangle counting algorithms are implemented in the C++ core (e.g., LocalClusteringCoefficient seems to do so). I think writing a Python wrapper for at least one them makes sense. @michitux do you have any suggestions about which algorithm we should use for this purpose?

michitux commented 4 years ago

LocalClusteringCoefficient in the turbo mode and TriangleEdgeScore both implement the same algorithm, which is the algorithm I would recommend. It would make sense though to refactor this code and extract the common triangle counting parts, in particular if there shall be a third place to count triangles now. There is also ChibaNishizekiTriangleEdgeScore which should rather be deprecated I think.

@csverma610 On what level do you want to count triangles? Globally? If globally is your goal, just get the scores() from TriangleEdgeScore, build a sum over the counts you get and divide them by 3, as every triangle is counted at each of its three edges. Note that you need to call indexEdges() on the graph before computing any edge score. The following code gives you the number of triangles of the graph:

import networkit as nk
nk.readGraph("input/karate.graph", nk.Format.METIS)
G.indexEdges()
triangles = nk.sparsification.TriangleEdgeScore(G).run()
print(sum(triangles.scores())/3)

In this example, the output should be 45.0.

csverma610 commented 4 years ago

Hello Michael,

Thank you very much. I do not understand why sparsification is need? In addition, what indexEdges will do, isn't it is given by the reader or part of software internal?

Regards

On Tue, Oct 8, 2019 at 4:58 AM Michael Hamann notifications@github.com wrote:

LocalClusteringCoefficient in the turbo mode and TriangleEdgeScore both implement the same algorithm, which is the algorithm I would recommend. It would make sense though to refactor this code and extract the common triangle counting parts, in particular if there shall be a third place to count triangles now. There is also ChibaNishizekiTriangleEdgeScore which should rather be deprecated I think.

@csverma610 https://github.com/csverma610 On what level do you want to count triangles? Globally? If globally is your goal, just get the scores() from TriangleEdgeScore, build a sum over the counts you get and divide them by 3, as every triangle is counted at each of its three edges. Note that you need to call indexEdges() on the graph before computing any edge score. The following code gives you the number of triangles of the graph:

import networkit as nk nk.readGraph("input/karate.graph", nk.Format.METIS) G.indexEdges() nk.sparsification.TriangleEdgeScore(G).run()print(sum(triangles.scores())/3)

In this example, the output should be 45.0.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/networkit/networkit/issues/422?email_source=notifications&email_token=AA2TTM4KEJBY4PFB4TVBSG3QNRYWTA5CNFSM4I57RAE2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAT42ZA#issuecomment-539479396, or mute the thread https://github.com/notifications/unsubscribe-auth/AA2TTM2AZFPUDZQSGGJORCTQNRYWTANCNFSM4I57RAEQ .

csverma610 commented 4 years ago

Hello Michael,

I did some experiments on Triangle Counting. I took some triangular mesh (STL/OFF) models and extracted all their edges and stored into mesh.edgelist file. We know the exact number of triangles from the mesh.

I used TriangleEdgeScore method (without sparsification). and for some models, I do not get the correct number of triangles (usually more than expected). I verified the results with brute force calculation as

. = sum ( A(i,j)A(j,k)A(k,i))

Is the method approximate?

Thanks

On Tue, Oct 8, 2019 at 7:29 AM Chaman Singh Verma csv610@gmail.com wrote:

Hello Michael,

Thank you very much. I do not understand why sparsification is need? In addition, what indexEdges will do, isn't it is given by the reader or part of software internal?

Regards

On Tue, Oct 8, 2019 at 4:58 AM Michael Hamann notifications@github.com wrote:

LocalClusteringCoefficient in the turbo mode and TriangleEdgeScore both implement the same algorithm, which is the algorithm I would recommend. It would make sense though to refactor this code and extract the common triangle counting parts, in particular if there shall be a third place to count triangles now. There is also ChibaNishizekiTriangleEdgeScore which should rather be deprecated I think.

@csverma610 https://github.com/csverma610 On what level do you want to count triangles? Globally? If globally is your goal, just get the scores() from TriangleEdgeScore, build a sum over the counts you get and divide them by 3, as every triangle is counted at each of its three edges. Note that you need to call indexEdges() on the graph before computing any edge score. The following code gives you the number of triangles of the graph:

import networkit as nk nk.readGraph("input/karate.graph", nk.Format.METIS) G.indexEdges() nk.sparsification.TriangleEdgeScore(G).run()print(sum(triangles.scores())/3)

In this example, the output should be 45.0.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/networkit/networkit/issues/422?email_source=notifications&email_token=AA2TTM4KEJBY4PFB4TVBSG3QNRYWTA5CNFSM4I57RAE2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAT42ZA#issuecomment-539479396, or mute the thread https://github.com/notifications/unsubscribe-auth/AA2TTM2AZFPUDZQSGGJORCTQNRYWTANCNFSM4I57RAEQ .

michitux commented 4 years ago

There is no sparsification needed, the triangle counting algorithm is only exported to Python as part of the sparsification module as it is used there. By default, networkit graphs do not have ids for edges, but all algorithms that store a score for every edge need it. Calling indexEdges() generates these edge ids.

Triangular meshes might have more triangles than there are triangular faces in the mesh as triangles might be subdivided. The triangle counting in NetworKit should be exact. Can you provide an example where the count is not exact?

csverma610 commented 4 years ago

Hello,

I am trying to find a simple example which I can debug to ascertain the correctness of algorithm. However, I downloaded a data from:

http://networkrepository.com/brock200_1.php

For this example, the link says there are 1.6M triangles but NetworKit gives 543700.

csv

On Wed, Oct 9, 2019 at 3:39 AM Michael Hamann notifications@github.com wrote:

There is no sparsification needed, the triangle counting algorithm is only exported to Python as part of the sparsification module as it is used there. By default, networkit graphs do not have ids for edges, but all algorithms that store a score for every edge need it. Calling indexEdges() generates these edge ids.

Triangular meshes might have more triangles than there are triangular faces in the mesh as triangles might be subdivided. The triangle counting in NetworKit should be exact. Can you provide an example where the count is not exact?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/networkit/networkit/issues/422?email_source=notifications&email_token=AA2TTMZSNOZ72QDYN7YXUFDQNWYGLA5CNFSM4I57RAE2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAXOPPQ#issuecomment-539944894, or mute the thread https://github.com/notifications/unsubscribe-auth/AA2TTM7UCVLIY5ZDLKDY4NTQNWYGLANCNFSM4I57RAEQ .

hmeyerhenke commented 4 years ago

Sounds a bit like the normalization with the factor 3 is missing... Is the fraction between the two values exactly 3?

csverma610 commented 4 years ago

Yes, I think, Networkit results are correct. The results of the graph on the website is wrong.

The Networkit results are correct for OFF/STL files also. it was surprising that many famous models have corrupted data.

Thanks for your great software. I wish there were EigenTrust and Graph Partitioner tools also. (Sorry, if I have missed them)

On Thu, Oct 10, 2019 at 12:14 AM Henning Meyerhenke < notifications@github.com> wrote:

Sounds a bit like the normalization with the factor 3 is missing... Is the fraction between the two values exactly 3?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/networkit/networkit/issues/422?email_source=notifications&email_token=AA2TTMYHDRKWX7XPLUGKUADQN3I6VA5CNFSM4I57RAE2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEA3EA6Y#issuecomment-540426363, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2TTM4A2ZOWNMUPAMKPYOTQN3I6VANCNFSM4I57RAEQ .

hmeyerhenke commented 4 years ago

May I ask for which purpose you need a graph partitioner? You can also write this via email directly to me.

angriman commented 4 years ago

This issue has been resolved, I am closing it.