Jean-Baptiste-Camps / stemmatology

Stemmatological Analysis of Textual Traditions
GNU General Public License v3.0

Centrality index #36

Open Jean-Baptiste-Camps opened 6 years ago

Jean-Baptiste-Camps commented 6 years ago

For now, we use the index proposed in the paper:

deg(u) / (e - deg(u))

I have questions on two aspects:

  1. Is there a more classical centrality measure that would make sense here? (This is more of a long-term question.)
  2. More prosaically, how do we avoid an infinite result when e = deg(u)? For now, the code handles this with a bit of a hack: if the result is infinite, I normalise it to 2… We could instead always compute deg(u) / e (perhaps better than deg(u) / (e - deg(u) + 1)), which would normalise the result to the range 0 … 1?
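To make the second point concrete, here is a minimal R sketch comparing the current index with the two alternatives mentioned. The degree vector `deg` is made up for illustration (one VL in conflict with two others that do not conflict with each other), and `e` is assumed to be the total number of conflicts, i.e. half the sum of the degrees.

```r
# Hypothetical degrees: "a" conflicts with "b" and "c"; "b" and "c" do not conflict.
deg <- c(a = 2, b = 1, c = 1)
e <- sum(deg) / 2                 # total number of conflicts (edges): 2

current  <- deg / (e - deg)       # index currently used: infinite for "a", since e = deg(a)
plus_one <- deg / (e - deg + 1)   # "+ 1" variant: finite, but not bounded by 1
simple   <- deg / e               # simple ratio: between 0 and 1 here

print(rbind(current, plus_one, simple))
```

In this toy case the simple ratio stays within 0 … 1, but this sketch does not cover a VL in conflict with itself, where deg(u) can exceed e.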

The current code, which could be much improved:

        centrality = conflictsTotal  ##Computing the centrality index as described in CC 2013
        ## We first have to test that there actually are conflicts in the database
        if (sum(conflictsTotal) > 0) {
            sumConflicts = sum(conflictsTotal)/2
            for (z in 1:nrow(centrality)) {
                # Another test, to avoid division by zero (perhaps the computation of the
                # centrality index should be adapted. Discuss this with Florian. Or, we
                # could accept infinite numbers... does it make sense? They
                # sure are superior to any centrality threshold we could choose...)
                centrality[z, ] = centrality[z, ]/(sumConflicts - centrality[z, ])
                # added an option to remove infinity and to replace it with 2
                if (is.infinite(centrality[z, ])) {
                    centrality[z, ] = 2
                }
            }
        } else {
            for (z in 1:nrow(centrality)) {
                centrality[z, ] = 0
            }
        }

Jean-Baptiste-Camps commented 6 years ago

Another potential issue with this calculation arises when the only conflict is that of a single VL (with alternative readings) against itself. In that case, the index will be -2…
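The -2 can be reproduced with a short R sketch. It assumes (this is not confirmed in the thread) that a conflict of a VL with itself is counted twice in its degree, as is usual for a self-loop in an undirected graph; the vector `deg` is made up for illustration.

```r
# Hypothetical case: a single VL whose only conflict is with itself.
# Assumption: the self-conflict contributes 2 to the degree.
deg <- c(a = 2)
e <- sum(deg) / 2          # one conflict in total: e = 1

index <- deg / (e - deg)   # 2 / (1 - 2)
print(index)               # negative, because deg(a) > e
```

Under that assumption deg(u) exceeds e, so the denominator becomes negative rather than zero, and the test for infinite values does not catch it.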

floriancafiero commented 6 years ago

That's true... We need to change it. I have some ideas; I'll tell you about them on Sunday.

Florian Cafiero

On 11 May 2018 at 19:03, Jean-Baptiste-Camps wrote:

Another potential issue with this calculation is when the only conflict is of a single VL (with alternative readings) against himself. In which case, the index will be -2…

Jean-Baptiste-Camps commented 5 years ago

OK, while adding more tests I stumbled on the same error as in May of last year: a centrality index of -2 in the (rare and very theoretical) case where there is a single conflict of a VL with itself. Should we change the computation for this? Or give up altogether on the 'alternateReadings', which cause inconsistencies everywhere and are very hard to manage?

[Image: conflicts_single]

@floriancafiero, any thoughts?

Jean-Baptiste-Camps commented 5 years ago

Okay, so, basically, the easiest solution I can think of is switching from

deg(u) / (e - deg(u))

to

\frac{deg(u)}{\sum_{v \in V} deg(v)}

which would normalise the index to [0;1].
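A minimal R sketch of this normalised index is given here. The function name `normalisedCentrality` is hypothetical (it is not part of the package), and the sketch assumes that `conflictsTotal` is a one-column numeric matrix holding deg(u) for each VL, one row per VL.

```r
# Hypothetical helper, not part of the package: deg(u) / sum of all degrees.
# Assumes conflictsTotal is a one-column numeric matrix of degrees.
normalisedCentrality <- function(conflictsTotal) {
    total <- sum(conflictsTotal)
    if (total == 0) {
        # No conflicts at all: every index is 0, as in the current code
        conflictsTotal[] <- 0
        return(conflictsTotal)
    }
    conflictsTotal / total
}

# Made-up example: "a" conflicts with "b" and "c"
star <- matrix(c(2, 1, 1), ncol = 1,
               dimnames = list(c("a", "b", "c"), "conflicts"))
starIndex <- normalisedCentrality(star)

# Made-up example: a single VL in conflict with itself (degree assumed to be 2)
selfIndex <- normalisedCentrality(matrix(2, dimnames = list("a", "conflicts")))

print(starIndex)
print(selfIndex)
```

Because each degree is divided by the sum of all degrees, the result can be neither infinite nor negative, and the indices sum to 1 whenever there is at least one conflict. One consequence to keep in mind is that any threshold chosen for the old index would have to be recalibrated.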

Another solution could be to look at the various existing centrality measures, and test whether some of them could be of interest to us as well.