paulbrodersen / netgraph

Publication-quality network visualisations in python
GNU General Public License v3.0
660 stars 39 forks source link

Type checking of graph objects #53

Closed kcharlie2 closed 1 year ago

kcharlie2 commented 1 year ago

When I create a class that simply inherits a networkx Graph, the plotter won't work. For example, this works fine:

g = networkx.Graph([(1,2)])
netgraph.Graph(g)

However if I do a simple inheritance I get an error.

class NewGraph(networkx.Graph):
    pass
g = NewGraph([(1,2)])
netgraph.Graph(g)
NotImplementedError: Input graph must be one of: 

    list
    tuple
    set
    networkx.Graph
    igraph.Graph
Currently, type(graph) = <class '__main__.NewGraph'>

Instead of checking the graph objects with isinstance, it's doing a string check of the class name in _parser.py. Per the code's comments, this is to avoid importing (potentially slow or non-existent) modules just for type checking. However this prevents any possibility of extending code.

Is it possible to allow importing for type checking? What's the standard way to do this in python? I'm assume it's a try-except for the module imports and keeping track of which are loaded. Are igraph and networkx so slow to import that this can't be done?

In the meantime I need this capability badly enough that I've monkey patched my way around it.

paulbrodersen commented 1 year ago

Per the code's comments, this is to avoid importing (potentially slow or non-existent) modules just for type checking.

My main concern is not speed but the increase in the number of dependencies, as netgraph currently supports networkx, igraph, and graph-tool (available in newer versions of netgraph than you are using), and may support additional network analysis packages in the future. In particular, igraph and graph-tool require code compilation and hence can't always be installed easily.

I'm assume it's a try-except for the module imports and keeping track of which are loaded.

I think that would probably work. You wouldn't even have to keep track of anything, as imported modules do not get loaded again upon a second import call.

Do you want make the necessary changes and a pull request? Otherwise, I will come up with something early next week. If you do, do use the latest version on PyPI, please.

kcharlie2 commented 1 year ago

I can take a stab at it. I looked through pandas since they have a lot of optional imports, and they basically do a try/except on demand and save whether the module exists in a dict.

kcharlie2 commented 1 year ago

Here's what I was thinking. I extended the idea to check the other types too (lists and ndarrays). This is for an older version but I will pull the most recent if I actually make this change. Let me know what you think.

def _check_listlike(graph):
    return isinstance(graph, (list, tuple, set))

def _check_nparray(graph):
    import numpy
    return isinstance(graph, np.ndarray)

def _check_networkx(graph):
    import networkx
    return isinstance(graph, networkx.Graph)

def _check_igraph(graph):
    import igraph
    return isinstance(graph, igraph.Graph)

def _parse_nparray(graph):
        rows, columns = graph.shape
        if columns in (2, 3):
            return netgraph._parser._parse_sparse_matrix_format(graph)
        elif rows == columns:
            return netgraph._parser._parse_adjacency_matrix(graph)
        else:
            msg = "Could not interpret input graph."
            msg += "\nIf a graph is specified as a numpy array, it has to have one of the following shapes:"
            msg += "\n\t-(E, 2) or (E, 3), where E is the number of edges"
            msg += "\n\t-(V, V), where V is the number of nodes (i.e. full rank)"
            msg += f"However, the given graph had shape {graph.shape}."

_fcn_dispatch = {
                 _check_listlike : _parse_sparse_matrix_format,
                 _check_nparray  : _parse_nparray,
                 _check_networkx : _parse_networkx_graph,
                 _check_igraph   : _parse_igraph_graph,
                 }

def parse_graph(graph):
    for fcheck,fcn in _fcn_dispatch.items():
        try:
            if fcheck(graph):
                return fcn(graph)
        except ModuleNotFoundError:
            pass
    else:
        allowed = ['list', 'tuple', 'set', 'networkx.Graph', 'igraph.Graph']
        raise NotImplementedError("Input graph must be one of: {}\nCurrently, type(graph) = {}".format("\n\n\t" + "\n\t".join(allowed), type(graph)))
paulbrodersen commented 1 year ago

Sorry for the late reply, I was away for a few days.

Your proposal looks great to me. For consistency with my usual naming strategy, can we rename fcheck to simply check, fcn to parser and _fcn_dispatch to check_to_parser?

paulbrodersen commented 1 year ago

Hey, I haven't heard back in a while so I went ahead and pushed your suggested changes (with minor adaptations). I added your github details using the co-authored-by tag, so hopefully the commit will be attributed to you, too.

kcharlie2 commented 1 year ago

Thanks a bunch! I would've had to do this on a separate machine to keep it separate from my other work and didn't get around to setting up the environment.