aristoteleo / dynamo-release

Inclusive model of expression dynamics with conventional or metabolic labeling based scRNA-seq / multiomics, vector field reconstruction and differential geometry analyses
https://dynamo-release.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
414 stars 58 forks source link

Can't use dyn.pl.topography with PCA vector fields dim >2 #78

Closed rorymaizels closed 2 years ago

rorymaizels commented 4 years ago

Hi,

Thanks for this great tool, I've really enjoyed exploring the new functionalities (especially fate animation!). I've noticed an issue regarding trying to plot topographies of vector fields in PCA spaces of more than 2 dimensions.

If I run through the pipeline delineated in the Zebrafish tutorial with UMAP as my choice of basis, it works fine whether i specify 2 or >2 dimensions in dyn.pp.recipe_monocle() and dyn.tl.reduceDimension(). If I exclusively use PCA as my basis, and specify exactly 2 dimensions, it also works fine. However, if I try to use a PCA basis with >2 dimensions, dyn.vf.VectorField() works fine but when I try to run dyn.pl.topography() I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-148-3142c99fe9de> in <module>
----> 1 dyn.pl.topography(dyn_data, basis='pca', color='cell_subtype_unique', terms=['streamline'],  x=0, y=1)

~/.local/lib/python3.8/site-packages/dynamo/plot/topography.py in topography(adata, basis, x, y, color, layer, highlights, labels, values, theme, cmap, color_key, color_key_cmap, background, ncols, pointsize, figsize, show_legend, use_smoothed, xlim, ylim, t, terms, init_cells, init_states, quiver_source, fate, approx, quiver_size, quiver_length, density, linewidth, streamline_color, streamline_alpha, color_start_points, markersize, marker_cmap, save_show_or_return, save_kwargs, aggregate, show_arrowed_spines, ax, sort, frontier, s_kwargs_dict, q_kwargs_dict, **streamline_kwargs_dict)

~/.local/lib/python3.8/site-packages/dynamo/vectorfield/topography.py in topography(adata, basis, layer, X, dims, n, VecFld)

~/.local/lib/python3.8/site-packages/dynamo/vectorfield/topography.py in find_fixed_points_by_sampling(self, n, x_range, y_range, lhs, tol_redundant)

~/.local/lib/python3.8/site-packages/dynamo/vectorfield/topography.py in find_fixed_points(X0, func_vf, tol_redundant, full_output)

/camp/lab/briscoej/working/Rory/.conda/my_envs/py_lab/lib/python3.8/site-packages/scipy/optimize/minpack.py in fsolve(func, x0, args, fprime, full_output, col_deriv, xtol, maxfev, band, epsfcn, factor, diag)
    145                'diag': diag}
    146 
--> 147     res = _root_hybr(func, x0, args, jac=fprime, **options)
    148     if full_output:
    149         x = res['x']

/camp/lab/briscoej/working/Rory/.conda/my_envs/py_lab/lib/python3.8/site-packages/scipy/optimize/minpack.py in _root_hybr(func, x0, args, jac, col_deriv, xtol, maxfev, band, eps, factor, diag, **unknown_options)
    211     if not isinstance(args, tuple):
    212         args = (args,)
--> 213     shape, dtype = _check_func('fsolve', 'func', func, x0, args, n, (n,))
    214     if epsfcn is None:
    215         epsfcn = finfo(dtype).eps

/camp/lab/briscoej/working/Rory/.conda/my_envs/py_lab/lib/python3.8/site-packages/scipy/optimize/minpack.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
     24 def _check_func(checker, argname, thefunc, x0, args, numinputs,
     25                 output_shape=None):
---> 26     res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
     27     if (output_shape is not None) and (shape(res) != output_shape):
     28         if (output_shape[0] != 1):

~/.local/lib/python3.8/site-packages/dynamo/vectorfield/utils_vecCalc.py in <lambda>(x)

~/.local/lib/python3.8/site-packages/dynamo/tools/utils.py in timed(*args, **kw)

~/.local/lib/python3.8/site-packages/dynamo/vectorfield/utils_vecCalc.py in vector_field_function(x, vf_dict, dim, kernel, **kernel_kwargs)

~/.local/lib/python3.8/site-packages/dynamo/tools/utils.py in timed(*args, **kw)

~/.local/lib/python3.8/site-packages/dynamo/vectorfield/utils_vecCalc.py in con_K(x, y, beta, method, return_d)

/camp/lab/briscoej/working/Rory/.conda/my_envs/py_lab/lib/python3.8/site-packages/scipy/spatial/distance.py in cdist(XA, XB, metric, *args, **kwargs)
   2719         raise ValueError('XB must be a 2-dimensional array.')
   2720     if s[1] != sB[1]:
-> 2721         raise ValueError('XA and XB must have the same number of columns '
   2722                          '(i.e. feature dimension.)')
   2723 

ValueError: XA and XB must have the same number of columns (i.e. feature dimension.)

It's hard for me to parse exactly where things are breaking down - if I might be doing something wrong let me know!

Cheers, Rory

P.s.

If you need more details about the code I ran let me know, here's my dependency info:

package matplotlib numba numpy pandas python-igraph scikit-learn scipy seaborn setuptools statsmodels tqdm hdbscan cvxopt trimap numdifftools colorcet anndata dynamo-release loompy pynndescent umap-learn
version 3.2.0 0.48.0 1.18.1 1.0.2 0.8.0 0.22.2.post1 1.4.1 0.10.0 46.0.0.post20200311 0.11.1 4.43.0 0.8.26 1.2.4 1.4.3.dev1 0.9.39 2.0.2 0.7.3 0.95.1 3.0.6 0.4.8 0.4.3
Xiaojieqiu commented 4 years ago

Hi @rorymaizels thanks for using the tool! Quantification topography for > 2 dimensions is challenging as the nullcline/separatrix can be a surface or a manifold in high dimension space (though fixed points calculations are fine). So right now dynamo is not able to map the topology for embeddings with 2 dimensions. I guess this is the main reason leading to this error. I should make this more explicit in our documentation.

Let me know if you need any other bits of help!

rorymaizels commented 4 years ago

That makes sense! Are there any other options for visualising 2D cross-sections/projections of higher dimensional vector fields? Perhaps not involving nullcline/separatrix/fixed point calculations but only visualising something like a streamline of flow through the vector field? I'd be interested to see how basis dimensionality affects the dynamics of learnt vector fields.

Thanks again!

Xiaojieqiu commented 4 years ago

I got you. To visualize streamline for pairs of two components in different embedding, you can use something like this:

dyn.pl.streamline_plot(adata, color=['clusters'], basis='pca', x=2, y=3, show_legend='on data', show_arrowed_spines=False, cut_off_velocity=False)

When setting cut_off_velocity = False you are revealing the entire vector field but not just the domains with cells. x/y arguments can be used to set the components you want to use. Let me know whether this satisfies your needs.

rorymaizels commented 4 years ago

That looks like what I'm after, thank you for such speedy help!