GiulioRossetti / cdlib

Community Discovery Library
http://cdlib.readthedocs.io
BSD 2-Clause "Simplified" License

algorithms.leiden() breaks when the input networkx graph's nodes are neither strings nor ints #243

Open shashank025 opened 6 months ago

shashank025 commented 6 months ago

Describe the bug: Similar to issue #241, and with the same root cause. The existing code handles only int and str nodes, but networkx nodes can be any hashable Python object.

To Reproduce Steps to reproduce the behavior:

Script (test.py) to repro the issue:

import networkx as nx
import igraph as ig

from cdlib import algorithms

class Node:
  def __init__(self, id, type):
    self.id = id
    self.type = type

  def __hash__(self):
    return hash(self.id)

  def __eq__(self, other):
    return isinstance(other, Node) and self.id == other.id

john = Node('John Travolta', 'actor')
nick = Node('Nick Cage', 'actor')
face_off = Node('Face Off', 'movie')

G = nx.Graph()
G.add_node(john)
G.add_node(nick)
G.add_edge(john, face_off, label='ACTED_IN')
G.add_edge(nick, face_off, label='ACTED_IN')

clusters = algorithms.leiden(G)

This fails with:

Traceback (most recent call last):
  File "/Users/shashankr/projects/cdlib/test.py", line 28, in <module>
    clusters = algorithms.leiden(G)
               ^^^^^^^^^^^^^^^^^^^^
  File "/Users/shashankr/projects/cdlib/cdlib/algorithms/crisp_partition.py", line 632, in leiden
    return NodeClustering(
           ^^^^^^^^^^^^^^^
  File "/Users/shashankr/projects/cdlib/cdlib/classes/node_clustering.py", line 31, in __init__
    super().__init__(communities, graph, method_name, method_parameters, overlap)
  File "/Users/shashankr/projects/cdlib/cdlib/classes/clustering.py", line 42, in __init__
    communities = self.__convert_back_to_original_nodes_names_if_needed(communities)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/shashankr/projects/cdlib/cdlib/classes/clustering.py", line 21, in __convert_back_to_original_nodes_names_if_needed
    to_return.append([int(x[1:]) for x in com])
                      ^^^^^^^^^^
ValueError: invalid literal for int() with base 10: '<__main__.Node object at 0x100cdf5c0>'
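
The last frame of the traceback suggests what goes wrong: node labels appear to be round-tripped through strings, and the reverse step strips a one-character prefix and calls int() on the remainder. Below is a minimal sketch of that failure mode. The encode/decode helpers and the backslash prefix are assumptions made for illustration, not cdlib's actual code; only the int(x[1:]) call is taken from the traceback.

```python
class Node:
    """Hashable node type, as in the repro script."""

    def __init__(self, id, type):
        self.id = id
        self.type = type

    def __hash__(self):
        return hash(self.id)

    def __eq__(self, other):
        return isinstance(other, Node) and self.id == other.id


PREFIX = "\\"  # hypothetical one-character marker; the real prefix may differ


def encode(label):
    # Stringify the label and tag it with the prefix.
    return PREFIX + str(label)


def decode(name):
    # Mirrors the int(x[1:]) call from the traceback: only valid for int labels.
    return int(name[1:])


assert decode(encode(42)) == 42  # integer labels survive the round trip

try:
    decode(encode(Node("Face Off", "movie")))
    failed = False
except ValueError:
    # str(Node(...)) is the default "<... object at 0x...>" repr, which int() rejects.
    failed = True
```

Any node type whose string form is not an integer literal would hit the same ValueError under this assumption, which matches the message shown above.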

Expected behavior

Running the script above (python test.py) should complete without raising an error.

Screenshots NA

Additional context

I'm hoping the fix for this is similar to the one for #241.
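
Until arbitrary hashable nodes are supported, one possible workaround is to relabel the nodes to integers before clustering and translate the communities back afterwards. The sketch below shows only the mapping step in plain Python. build_int_mapping and restore_communities are hypothetical helper names, and the community list is a hand-written stand-in for what a clustering call would return.

```python
def build_int_mapping(nodes):
    """Assign each hashable node an integer id, in iteration order."""
    node_to_int = {node: i for i, node in enumerate(nodes)}
    int_to_node = {i: node for node, i in node_to_int.items()}
    return node_to_int, int_to_node


def restore_communities(communities, int_to_node):
    """Translate communities of integer ids back to the original node objects."""
    return [[int_to_node[i] for i in com] for com in communities]


# Tuples (name, type) stand in for arbitrary hashable nodes.
nodes = [("John Travolta", "actor"), ("Nick Cage", "actor"), ("Face Off", "movie")]
node_to_int, int_to_node = build_int_mapping(nodes)

# Edges expressed with integer ids, safe to hand to code that expects int labels.
edges = [
    (node_to_int[nodes[0]], node_to_int[nodes[2]]),
    (node_to_int[nodes[1]], node_to_int[nodes[2]]),
]

# Stand-in result: a single community containing all three integer ids.
int_communities = [[0, 1, 2]]
restored = restore_communities(int_communities, int_to_node)
```

With networkx, the same node_to_int dict can be passed to nx.relabel_nodes(G, node_to_int) to get an integer-labelled copy of the graph. Applying restore_communities to the communities that algorithms.leiden returns for that copy has not been tested here.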

GiulioRossetti commented 6 months ago

See answer to #241