EventideSystems / tool_for_systemic_change

GNU Affero General Public License v3.0
3 stars 0 forks source link

Convert betweenness centrality code to Ruby #999

Open ferrisoxide opened 1 week ago

ferrisoxide commented 1 week ago

Description

Convert the existing betweeness_centrality_lambda service to Ruby, to be called directly in the app proper.

This is to both reduce latency as well as allow for larger datasets.

Ideally we should aim to write the algorithm and contribute it to the Ruby equivalent of the Python NetworkX library. Note, there is already a ticket for this within the Ruby NetworkX library. See:

https://github.com/SciRuby/networkx.rb/issues/7

Proposed Solution

  1. Fork https://github.com/SciRuby/networkx.rb
  2. Implement betweenness centrality by porting the algorithm from NetworkX
  3. Create PR and present to the SciRuby team for consideration

Acceptance Criteria

ferrisoxide commented 1 week ago

DEV NOTE

The following link data:

[[1057, 1081],
 [1057, 1081],
 [1081, 1092],
 [1082, 1092],
 [1085, 1092],
 [1086, 1092],
 [1087, 1092],
 [1088, 1092],
 [1081, 1082],
 [1081, 1085],
 [1081, 1086],
 [1081, 1087],
 [1081, 1088],
 [1081, 1092],
 [1081, 1087],
 [1082, 1087],
 [1085, 1087],
 [1086, 1087],
 [1087, 1088],
 [1087, 1092],
 [1081, 1085],
 [1082, 1085],
 [1085, 1086],
 [1085, 1087],
 [1085, 1088],
 [1085, 1092],
 [1081, 1086],
 [1082, 1086],
 [1085, 1086],
 [1086, 1087],
 [1086, 1088],
 [1086, 1092],
 [1081, 1088],
 [1082, 1088],
 [1085, 1088],
 [1086, 1088],
 [1087, 1088],
 [1088, 1092],
 [1081, 1082],
 [1082, 1085],
 [1082, 1086],
 [1082, 1087],
 [1082, 1088],
 [1082, 1092],
 [883, 1071],
 [883, 1081],
 [883, 1087],
 [883, 1090],
 [883, 1097],
 [883, 1081],
 [1071, 1081],
 [1081, 1087],
 [1081, 1090],
 [1081, 1097],
 [883, 1087],
 [1071, 1087],
 [1081, 1087],
 [1087, 1090],
 [1087, 1097],
 [883, 1071],
 [1071, 1081],
 [1071, 1087],
 [1071, 1090],
 [1071, 1097],
 [883, 1090],
 [1071, 1090],
 [1081, 1090],
 [1087, 1090],
 [1090, 1097],
 [883, 1097],
 [1071, 1097],
 [1081, 1097],
 [1087, 1097],
 [1090, 1097],
 [1060, 1081],
 [1066, 1081],
 [1081, 1089],
 [1081, 1091],
 [1081, 1095],
 [1060, 1095],
 [1066, 1095],
 [1081, 1095],
 [1089, 1095],
 [1091, 1095],
 [1060, 1066],
 [1060, 1081],
 [1060, 1089],
 [1060, 1091],
 [1060, 1095],
 [1060, 1066],
 [1066, 1081],
 [1066, 1089],
 [1066, 1091],
 [1066, 1095],
 [1060, 1091],
 [1066, 1091],
 [1081, 1091],
 [1089, 1091],
 [1091, 1095],
 [1060, 1089],
 [1066, 1089],
 [1081, 1089],
 [1089, 1091],
 [1089, 1095],
 [1061, 1069],
 [1061, 1079],
 [1061, 1081],
 [1061, 1079],
 [1069, 1079],
 [1079, 1081],
 [1061, 1081],
 [1069, 1081],
 [1079, 1081],
 [1061, 1069],
 [1069, 1079],
 [1069, 1081],
 [884, 1081],
 [884, 1081],
 [1064, 1096],
 [1070, 1096],
 [1081, 1096],
 [1087, 1096],
 [1064, 1081],
 [1070, 1081],
 [1081, 1087],
 [1081, 1096],
 [1064, 1087],
 [1070, 1087],
 [1081, 1087],
 [1087, 1096],
 [1064, 1070],
 [1070, 1081],
 [1070, 1087],
 [1070, 1096],
 [1064, 1070],
 [1064, 1081],
 [1064, 1087],
 [1064, 1096],
 [1075, 1081],
 [1075, 1081],
 [897, 1081],
 [897, 1081],
 [883, 1058],
 [883, 1068],
 [883, 1078],
 [883, 1081],
 [883, 1087],
 [883, 1093],
 [883, 1094],
 [883, 1081],
 [1058, 1081],
 [1068, 1081],
 [1078, 1081],
 [1081, 1087],
 [1081, 1093],
 [1081, 1094],
 [883, 1087],
 [1058, 1087],
 [1068, 1087],
 [1078, 1087],
 [1081, 1087],
 [1087, 1093],
 [1087, 1094],
 [883, 1068],
 [1058, 1068],
 [1068, 1078],
 [1068, 1081],
 [1068, 1087],
 [1068, 1093],
 [1068, 1094],
 [883, 1093],
 [1058, 1093],
 [1068, 1093],
 [1078, 1093],
 [1081, 1093],
 [1087, 1093],
 [1093, 1094],
 [883, 1094],
 [1058, 1094],
 [1068, 1094],
 [1078, 1094],
 [1081, 1094],
 [1087, 1094],
 [1093, 1094],
 [883, 1078],
 [1058, 1078],
 [1068, 1078],
 [1078, 1081],
 [1078, 1087],
 [1078, 1093],
 [1078, 1094],
 [883, 1058],
 [1058, 1068],
 [1058, 1078],
 [1058, 1081],
 [1058, 1087],
 [1058, 1093],
 [1058, 1094],
 [1062, 1081],
 [1062, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [1067, 1073],
 [1073, 1081],
 [1073, 1085],
 [1073, 1087],
 [1067, 1081],
 [1073, 1081],
 [1081, 1085],
 [1081, 1087],
 [1067, 1087],
 [1073, 1087],
 [1081, 1087],
 [1085, 1087],
 [1067, 1085],
 [1073, 1085],
 [1081, 1085],
 [1085, 1087],
 [1067, 1073],
 [1067, 1081],
 [1067, 1085],
 [1067, 1087],
 [1074, 1081],
 [1074, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [883, 1081],
 [1070, 1084],
 [1081, 1084],
 [1083, 1084],
 [1070, 1081],
 [1081, 1083],
 [1081, 1084],
 [1070, 1083],
 [1081, 1083],
 [1083, 1084],
 [1070, 1081],
 [1070, 1083],
 [1070, 1084],
 [883, 1059],
 [883, 1063],
 [883, 1076],
 [883, 1080],
 [883, 1081],
 [883, 1081],
 [1059, 1081],
 [1063, 1081],
 [1076, 1081],
 [1080, 1081],
 [883, 1076],
 [1059, 1076],
 [1063, 1076],
 [1076, 1080],
 [1076, 1081],
 [883, 1063],
 [1059, 1063],
 [1063, 1076],
 [1063, 1080],
 [1063, 1081],
 [883, 1059],
 [1059, 1063],
 [1059, 1076],
 [1059, 1080],
 [1059, 1081],
 [883, 1080],
 [1059, 1080],
 [1063, 1080],
 [1076, 1080],
 [1080, 1081]]

Should resolve to the following betweenness centrality values:

{"1057"=>0.0,
 "1081"=>0.7771367521367523,
 "1092"=>0.0,
 "1082"=>0.0,
 "1085"=>0.003418803418803419,
 "1086"=>0.0,
 "1087"=>0.08098290598290597,
 "1088"=>0.0,
 "883"=>0.029487179487179487,
 "1071"=>0.0,
 "1090"=>0.0,
 "1097"=>0.0,
 "1060"=>0.0,
 "1066"=>0.0,
 "1089"=>0.0,
 "1091"=>0.0,
 "1095"=>0.0,
 "1061"=>0.0,
 "1069"=>0.0,
 "1079"=>0.0,
 "884"=>0.0,
 "1064"=>0.0,
 "1096"=>0.0,
 "1070"=>0.0038461538461538464,
 "1075"=>0.0,
 "897"=>0.0,
 "1058"=>0.0,
 "1068"=>0.0,
 "1078"=>0.0,
 "1093"=>0.0,
 "1094"=>0.0,
 "1062"=>0.0,
 "1067"=>0.0,
 "1073"=>0.0,
 "1074"=>0.0,
 "1084"=>0.0,
 "1083"=>0.0,
 "1059"=>0.0,
 "1063"=>0.0,
 "1076"=>0.0,
 "1080"=>0.0}
ferrisoxide commented 1 week ago

DEV NOTE

A naive implementation, based on the RGL gem, was suggested by Github Co-pilot:


require 'rgl/adjacency'
require 'rgl/traversal'
require 'rgl/dijkstra'
require 'rgl/connected_components'

    class Graph
      def initialize
        @graph = RGL::DirectedAdjacencyGraph.new
      end

      def add_edge(source, target)
        @graph.add_edge(source, target)
      end

      def betweenness_centrality
        nodes = @graph.vertices
        centrality = Hash.new(0)

        nodes.each do |s|
          stack = []
          paths = {}
          sigma = Hash.new(0)
          delta = Hash.new(0)
          distance = Hash.new(-1)

          sigma[s] = 1
          distance[s] = 0
          queue = [s]

          while queue.any?
            v = queue.shift
            stack.push(v)
            @graph.adjacent_vertices(v).each do |w|
              if distance[w] < 0
                queue.push(w)
                distance[w] = distance[v] + 1
              end
              if distance[w] == distance[v] + 1
                sigma[w] += sigma[v]
                paths[w] ||= []
                paths[w] << v
              end
            end
          end

          while stack.any?
            w = stack.pop
            if paths[w]
              paths[w].each do |v|
                delta[v] += (sigma[v].to_f / sigma[w]) * (1 + delta[w])
              end
            end
            centrality[w] += delta[w] if w != s
          end
        end

        debugger
        rescale(centrality, @graph.num_vertices, true, directed: !@graph.directed?)
      end

      def rescale(betweenness, n, normalized, directed: false, k: nil, endpoints: false)
        scale = nil

        if normalized
          if endpoints
            if n < 2
              scale = nil  # no normalization
            else
              # Scale factor should include endpoint nodes
              scale = 1.0 / (n * (n - 1))
            end
          elsif n <= 2
            scale = nil  # no normalization b=0 for all nodes
          else
            scale = 1.0 / ((n - 1) * (n - 2))
          end
        else  # rescale by 2 for undirected graphs
          if !directed
            scale = 0.5
          else
            scale = nil
          end
        end

        if scale
          scale *= n / k if k
          betweenness.each do |v, value|
            betweenness[v] *= scale
          end
        end

        betweenness
      end
    end

      graph = Graph.new
      links.each do |(target, source)|
        graph.add_edge(target, source)
      end
      graph.betweenness_centrality

NB This does not produce correct results, but it does identify the nodes that are central