Update graph_utils.py - Githubissues

kristiandroste commented 11 months ago

Update numpy syntax

calebgeniesse commented 11 months ago

@kristiandroste thanks for opening this! Quick question, is there any reason for using

nx.linalg.graphmatrix.adjacency_matrix(G).toarray()

instead of something like

A = nx.to_numpy_array(G)

I'm wondering if one of these is more backwards compatible than the other.

kristiandroste commented 11 months ago

@calebgeniesse I tried running the jupyter notebook tutorials with the line you suggest. The trefoil knot still runs fine, and I actually got further through the haxby fmri: When using nx.linalg.graphmatrix.adjacency_matrix(G).toarray() I was getting an error while processing the shape graph in step 3, but when I use A = nx.to_numpy_array(G) I don't hit an error until step 4.2. This new error reads

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/networkx_utils.py:257, in get_cover_cubes(lens, graph, cover, scale, **kwargs)
    252         cover_cubes[tuple(center)] = np.r_['0,2', lower, upper]
    254 except Exception as e:
    255 
    256     # support deprecated Cover API from kmapper==1.1.6
--> 257     bins = [tuple(_) for _ in cover.define_bins(ilens)]
    258     chunk_dist = cover.chunk_dist
    259     overlap_dist = cover.overlap_dist

AttributeError: 'Cover' object has no attribute 'define_bins'

Not sure whether this would be a compatibility issue or not.

I tried downgrading kmapper to v1.1.6, which led to another couple of errors. To get around them I changed lines 78 + 79 in .../kmapper/cover.py from

limits_array[self.limits != np.float("inf")] = 0
self.limits[self.limits == np.float("inf")] = 0

to

limits_array[self.limits != float("inf")] = 0
self.limits[self.limits == float("inf")] = 0

and I commented out line 169 in .../dyneusr/mapper/wrappers.py as follows #remove_duplicate_nodes=self.remove_duplicate_nodes

Finally I'm stuck with another networkx-related error at the same line (5) in step 4.2 of the haxby notebook:

 File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/networkx_utils.py:304, in draw_cover(ax, cover_cubes, draw_all, max_draw, **kwargs)
    301     cover_cubes = get_cover_cubes(**kwargs)
    303 # extract bins (TODO: probably a better way to do this)
--> 304 bins = np.vstack([_ for cube,_ in cover_cubes.items()])
    305 if len(bins.T) < 2:
    306     ybins = np.ravel(sorted(set(bins[:, 0])))

File <__array_function__ internals>:200, in vstack(*args, **kwargs)

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/numpy/core/shape_base.py:296, in vstack(tup, dtype, casting)
    294 if not isinstance(arrs, list):
    295     arrs = [arrs]
--> 296 return _nx.concatenate(arrs, 0, dtype=dtype, casting=casting)

File <__array_function__ internals>:200, in concatenate(*args, **kwargs)

ValueError: need at least one array to concatenate

This is more than what you asked for, but I hope it's informative 🙂

calebgeniesse commented 11 months ago

This definitely helps. Thanks for sharing these errors. My guess is that similar to the error we discussed over email, kmapper is returning an empty graph for some reason.

If possible, can you share the error when using nx.linalg.graphmatrix.adjacency_matrix(G).toarray()?

Otherwise, it seems like A = nx.to_numpy_array(G) works then?

If so, we can either (1) resolve this PR, and then open a separate PR to address the kmapper compatibility; or (2) resolve both issues here, if you are okay with waiting. I'll need to spend a bit more time investigating the kmapper issues.

Thanks again for helping with this!

kristiandroste commented 11 months ago

No worries, no rush on my end. I'm happy to help.

I agree, seems like A = nx.to_numpy_array(G) works.

Here's the full error I get at step 3 (line 43 or cell 11) when I use nx.linalg.graphmatrix.adjacency_matrix(G).toarray()

---------------------------------------------------------------------------
NetworkXError                             Traceback (most recent call last)
Cell In[11], line 43
     41 ### Fit DyNeuGraph
     42 print("\nProcessing shape graph...")
---> 43 dG = ds.DyNeuGraph(
     44     G=graph, y=y, 
     45     labels=haxby.target_names, 
     46     colors=haxby.target_colors, 
     47     cmap=haxby.cmap
     48     )
     50 # store data
     51 subject.mapped = Bunch(
     52     data=X,
     53     lens=lens,
     54     graph=graph,
     55     dG=dG, X=X, y=y
     56     )

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/core.py:50, in DyNeuGraph.__init__(self, **params)
     28 """ DyNeuGraph
     29 
     30 Parameters
   (...)
     47     dyneuG.visualize()
     48 """
     49 self.cache_ = dict(params)
---> 50 self.fit(**params)

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/core.py:106, in DyNeuGraph.fit(self, G, X, y, node_data, edge_data, G_data, **kwargs)
    104 # process graph
    105 G = tools.process_graph(G, meta=y, **kwargs)
--> 106 A, M, TCM = tools.extract_matrices(G, index=data_ids)
    108 # create graph from TCM
    109 if G_data is True:

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/graph_utils.py:420, in extract_matrices(G, index, **kwargs)
    416     index = np.unique([
    417         __ for n,_ in G.nodes(data='members') for __ in _
    418         ])
    419 nTR = int(max(np.r_[len(index), np.ravel(index)+1]))
--> 420 A = nx.linalg.graphmatrix.adjacency_matrix(G).toarray() #A = nx.to_numpy_matrix(G).A  # node x node #A = nx.to_numpy_array(G)#
    421 M = np.zeros((nTR, A.shape[0]))    #   TR x node
    422 T = np.zeros((nTR, nTR))

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/networkx/utils/backends.py:412, in _dispatch.__call__(self, backend, *args, **kwargs)
    409 def __call__(self, /, *args, backend=None, **kwargs):
    410     if not backends:
    411         # Fast path if no backends are installed
--> 412         return self.orig_func(*args, **kwargs)
    414     # Use `backend_name` in this function instead of `backend`
    415     backend_name = backend

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/networkx/linalg/graphmatrix.py:166, in adjacency_matrix(G, nodelist, dtype, weight)
    107 @nx._dispatch(edge_attrs="weight")
    108 def adjacency_matrix(G, nodelist=None, dtype=None, weight="weight"):
    109     """Returns adjacency matrix of G.
    110 
    111     Parameters
   (...)
    164     adjacency_spectrum
    165     """
--> 166     return nx.to_scipy_sparse_array(G, nodelist=nodelist, dtype=dtype, weight=weight)

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/networkx/utils/backends.py:412, in _dispatch.__call__(self, backend, *args, **kwargs)
    409 def __call__(self, /, *args, backend=None, **kwargs):
    410     if not backends:
    411         # Fast path if no backends are installed
--> 412         return self.orig_func(*args, **kwargs)
    414     # Use `backend_name` in this function instead of `backend`
    415     backend_name = backend

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/networkx/convert_matrix.py:570, in to_scipy_sparse_array(G, nodelist, dtype, weight, format)
    567 import scipy as sp
    569 if len(G) == 0:
--> 570     raise nx.NetworkXError("Graph has no nodes or edges")
    572 if nodelist is None:
    573     nodelist = list(G)

NetworkXError: Graph has no nodes or edges

calebgeniesse commented 11 months ago

Ok looks like the same error just caught earlier, ie., an empty graph returned by kmapper. Let’s go with the second version? (i.e., A = nx.to_numpy_array(G))

We can keep this open to fix the kmapper stuff. Note, we may need to open a PR with kmapper if it’s something wrong with that code. But first, let me try and reproduce the issue.

What version of Python and NumPy?

kristiandroste commented 11 months ago

Yeah let's go with A = nx.to_numpy_array(G)

I've been running Python 3.11.5 NumPy 1.24.3

calebgeniesse commented 11 months ago

@kristiandroste which notebook gives you the errors? Where you able to run: 02_haxby_fmri_short.ipynb?

I couldn't reproduce the errors using python==3.11.15, numpy==1.24.3, and kmapper==1.2.0. But I also didn't have to make the float("inf") changes you made.

Can you try going back to kmapper==1.2.0 and running 02_haxby_fmri_short.ipynb?

kristiandroste commented 11 months ago

02_haxby_fmri_short runs properly regardless of which kmapper version I use: 1.1.6, 1.2.0, or 2.0.1

calebgeniesse commented 11 months ago

ok great, so the errors are in 02_haxby_fmri.ipynb?

Edit: Nevermind, I was able to reproduce this error. Testing now.

kristiandroste commented 11 months ago

That's correct

For the purpose of documentation, here's the full error I get at line 5 of cell 14 (after I've made the aforementioned edit from np.float to float):

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/networkx_utils.py:249, in get_cover_cubes(lens, graph, cover, scale, **kwargs)
    248 cover_cubes = {}
--> 249 for i, center in enumerate(cover.centers_):
    250     lower = center - cover.radius_

TypeError: 'NoneType' object is not iterable

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
Cell In[14], line 5
      2 subject.y = subject.meta.target.values.copy()
      4 # draw Mapper stages (and intermediates)
----> 5 _ = ds.tools.networkx_utils.visualize_mapper_stages(
      6     data=subject, lens=lens, cover=cover, graph=graph,dG=dG,
      7     node_size=10, 
      8     )

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/networkx_utils.py:446, in visualize_mapper_stages(data, y, lens, cover, graph, dG, **kwargs)
    443         ax.set_xlim(lens.min(), lens.max())
    445 # 2. draw cover (axes: 2)
--> 446 draw_cover(ax=axes[1], graph=graph, lens=lens2D, cover=cover)
    448 # 3. draw clusters (axes: 3)
    449 draw_networkx(G, lens=lens2D, pos="inverse", layout=None, 
    450         node_color=node_color, node_size=node_size, 
    451         edge_color=edge_color, width=edge_size, 
    452         alpha=0.5, edges=False, ax=axes[2])

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/networkx_utils.py:301, in draw_cover(ax, cover_cubes, draw_all, max_draw, **kwargs)
    297 ylim = np.ravel(ax.get_ylim())
    300 if cover_cubes is None:
--> 301     cover_cubes = get_cover_cubes(**kwargs)
    303 # extract bins (TODO: probably a better way to do this)
    304 bins = np.vstack([_ for cube,_ in cover_cubes.items()])

File ~/anaconda3/envs/dyneusr-notebooks/lib/python3.11/site-packages/dyneusr/tools/networkx_utils.py:257, in get_cover_cubes(lens, graph, cover, scale, **kwargs)
    252         cover_cubes[tuple(center)] = np.r_['0,2', lower, upper]
    254 except Exception as e:
    255 
    256     # support deprecated Cover API from kmapper==1.1.6
--> 257     bins = [tuple(_) for _ in cover.define_bins(ilens)]
    258     chunk_dist = cover.chunk_dist
    259     overlap_dist = cover.overlap_dist

AttributeError: 'Cover' object has no attribute 'define_bins'

calebgeniesse commented 11 months ago

Thanks @kristiandroste!

I was able to reproduce the errors. Note, I was able to reduce the error to a warning by installing numpy<1.24, but in that case the graph returned by kmapper was empty. Unfortunately, this seems like a kmapper issue, and so I'll need more time to debug this. I'm not sure how long this will take, so feel free to play around with the kmapper code to make sure data is being handled and processed correctly. Otherwise, I'll work on this when I can and keep you posted here.

kristiandroste commented 11 months ago

Sounds good. Likewise, if I can make any progress with kmapper I'll post here. Thank you @calebgeniesse 🙌

calebgeniesse commented 11 months ago

So i think i found the underlying issue.

There is a special undocumented option i added awhile back to help shift the cover bins a bit, so that points near the boundaries were accounted for properly. See scale_limits=True in the following line (executed cell 11 in the notebook):

cover = optimize_cover(
    X, r=15, g=0.67, 
    scale_r=not True, 
    scale_limits=True
)

The corresponding code is defined in dyneusr/mapper/utils.py, see e.g.,

# Define optimized limits
limits = None
if scale_limits is True:
    offset = p_overlap / float(n_cubes)
    limits = [[-offset, 1+offset] for _ in range(ndim)]
    n_cubes += 2 #* ndim

Not 100% sure why this doesn't work with kmapper anymore, but I can take a look tomorrow and see if there's an obvious way to fix this. Otherwise, the easiest thing to do may be to just turn it off in the tutorial, and possibly deprecate the option altogether.

Edit: To clarify, when I use scale_limits=False, I get a graph with 2007 edges and 211 nodes, for the first subject.

calebgeniesse commented 11 months ago

For more context, when I use scale_limits=True, I get the following:

>>> cover.limits
array([[-0.04466667,  1.04466667],
       [-0.04466667,  1.04466667]])

>>> cube_centers = cover.fit(lens)
>>> cube_centers
[array([-0.01262745]),
 array([0.05145098]),
 array([0.11552941]),
 array([0.17960784]),
 array([0.24368627]),
 array([0.30776471]),
 array([0.37184314]),
 array([0.43592157]),
 array([0.5]),
 array([0.56407843]),
 array([0.62815686]),
 array([0.69223529]),
 array([0.75631373]),
 array([0.82039216]),
 array([0.88447059]),
 array([0.94854902]),
 array([1.01262745])]

>>> hyper_cubes = cover.transform(lens, cube_centers)
>>> [_.shape for _ in hyper_cubes]
[(3, 2),
 (2, 2),
 (1, 2),
 (1, 2),
 (1, 2),
 (2, 2),
 (2, 2),
 (3, 2),
 (3, 2),
 (4, 2),
 (4, 2),
 (4, 2),
 (3, 2),
 (2, 2),
 (1, 2)]

and when I use scale_limits=False:

>>> cover.limits
None

>>> cube_centers = cover.fit(lens)
>>> cube_centers
[array([-17.97072105]),
 array([-15.59127478]),
 array([-13.2118285]),
 array([-10.83238223]),
 array([-8.45293595]),
 array([-6.07348968]),
 array([-3.6940434]),
 array([-1.31459713]),
 array([1.06484915]),
 array([3.44429542]),
 array([5.82374169]),
 array([8.20318797]),
 array([10.58263424]),
 array([12.96208052]),
 array([15.34152679])]

>>> hyper_cubes = cover.transform(lens, cube_centers)
>>> [_.shape for _ in hyper_cubes]
[(26, 2),
 (40, 2),
 (48, 2),
 (47, 2),
 (47, 2),
 (46, 2),
 (50, 2),
 (58, 2),
 (60, 2),
 (56, 2),
 (49, 2),
 (47, 2),
 (51, 2),
 (50, 2),
 (32, 2)]

So it seems to me like it has to do with the scaling of these limits or the lens itself.

Thanks again for catching this. I'll take a closer look as soon as possible.

calebgeniesse commented 10 months ago

@kristiandroste it seems like others are running into the networkx issue, and so we should merge those changes in as soon as possible. See Issue 25.

I'm going to make the final change we discussed, from

nx.linalg.graphmatrix.adjacency_matrix(G).toarray()

to

A = nx.to_numpy_array(G)

After making the changes and closing this PR, I will open a new PR here, to resolve the kmapper issue, or possibly in the DyNeuSR Notebooks repository to change the argument from scale_limits=True to scale_limits=False. Then we can continue to debug that separate issue, with less urgency.

braindynamicslab / dyneusr

Update graph_utils.py #24