GuangyuWangLab2021 / cellDancer

Predict RNA velocity through deep learning
https://guangyuwanglab2021.github.io/cellDancer_website/
BSD 3-Clause "New" or "Revised" License

error of cd.pseudo_time() #27

Open datou99 opened 7 months ago

datou99 commented 7 months ago

I ran cd.pseudo_time() the same way as in Case study 1:

import random
dt = 0.05
t_total = {dt: int(10/dt)}
n_repeats = 10
cellDancer_df_update = cd.pseudo_time(cellDancer_df=cellDancer_df,
                                      grid=(30,30),
                                      dt=dt,
                                      t_total=t_total[dt],
                                      n_repeats=n_repeats,
                                      speed_up=(100,100),
                                      n_paths=3,
                                      plot_long_trajs=True,
                                      psrng_seeds_diffusion=[i for i in range(n_repeats)],
                                      n_jobs=8)

The error report is:

Pseudo random number generator seeds are set to: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Generating Trajectories: 100%|██████████| 30840/30840 [01:32<00:00, 334.26it/s]

There are 3 clusters.
[0 1 2]
Generating Trajectories: 100%|██████████| 7020/7020 [00:17<00:00, 410.37it/s]

IndexError                                Traceback (most recent call last)
/tmp/ipykernel_458358/741018467.py in <module>
     15     plot_long_trajs=True,
     16     psrng_seeds_diffusion=[i for i in range(n_repeats)],
---> 17     n_jobs=1)

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in pseudo_time(cellDancer_df, grid, dt, t_total, n_repeats, psrng_seeds_diffusion, n_jobs, speed_up, n_paths, plot_long_trajs, save, output_path)
   1291         eps=v_eps,
   1292         n_jobs=n_jobs,
-> 1293         psrng_seeds_diffusion=psrng_seeds_diffusion)
   1294
   1295     print("--- %s seconds ---" % (time.time() - start_time))

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in compute_cell_time(cellDancer_df, embedding, cell_embedding, path_clusters, cell_fate, vel_mesh, cell_grid_idx, grid_mass, sampling_ixs, n_grids, dt, t_total, eps, n_repeats, n_jobs, psrng_seeds_diffusion)
   1041         cell_fate_dict,
   1042         cell_embedding,
-> 1043         tau = 0.05)
   1044
   1045     #print("\n\nAll inter cluster cell time has been resolved.\n\n\n")

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in cell_time_assignment_intercluster(unresolved_cell_time, cell_fate_dict, cell_embedding, tau)
    514     clusterIDs = sorted(np.unique(list(cell_fate_dict.values())))
    515
--> 516     cutoff = overlap_crit_intracluster(cell_embedding, cell_fate_dict, tau)
    517     #print("Cutoff is ", cutoff)
    518

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/celldancer/pseudo_time.py in overlap_crit_intracluster(cell_embedding, cell_fate_dict, quant)
    751         # drop the self distances
    752         temp3 = temp2[~np.eye(temp2.shape[0], dtype=bool)]
--> 753         cutoff.append((np.quantile(temp3, quant)))
    754     return max(cutoff)
    755

<__array_function__ internals> in quantile(*args, **kwargs)

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in quantile(a, q, axis, out, overwrite_input, interpolation, keepdims)
   3929         raise ValueError("Quantiles must be in the range [0, 1]")
   3930     return _quantile_unchecked(
-> 3931         a, q, axis, out, overwrite_input, interpolation, keepdims)
   3932
   3933

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in _quantile_unchecked(a, q, axis, out, overwrite_input, interpolation, keepdims)
   3937     r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
   3938                     overwrite_input=overwrite_input,
-> 3939                     interpolation=interpolation)
   3940     if keepdims:
   3941         return r.reshape(q.shape + k)

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in _ureduce(a, func, **kwargs)
   3513         keepdim = (1,) * a.ndim
   3514
-> 3515     r = func(a, **kwargs)
   3516     return r, keepdim
   3517

~/bin/tools/miniconda3/envs/cellDancer/lib/python3.7/site-packages/numpy/lib/function_base.py in _quantile_ureduce_func(***failed resolving arguments***)
   4048                 indices_below.ravel(), indices_above.ravel(), [-1]
   4049             )), axis=0)
-> 4050             n = np.isnan(ap[-1])
   4051         else:
   4052             # cannot contain nan

IndexError: index -1 is out of bounds for axis 0 with size 0

Does anyone know the reason for this error report? Any help will be appreciated.

Best, Mia

thompjac24 commented 6 months ago

Hi, I am also encountering the same issue with a subset of my data (the rest of it runs normally). Did you ever solve this issue? Thanks, Jacqui

biopzhang commented 5 months ago

Hi Mia and Jacqui,

Thanks for your interest in cellDancer!

This error usually occurs when multiple lineages exist, and there is insufficient overlap between them in the embedding space used for pseudotime calculation.

The goal of cd.pseudo_time is to provide a uniform pseudotime for all the cells in the given dataset. We assume that nearby cells should have about the same pseudotime. Therefore, a uniform pseudotime can only be achieved when there is enough overlap among the cells in the embedding space.
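
For reference, the traceback above shows the failing step: overlap_crit_intracluster drops the self distances from temp2 and passes the remainder to np.quantile, which fails when that remainder is empty. Below is a minimal sketch of a check for that condition. It assumes that temp2 is a square matrix of pairwise distances within one cluster and that cell_fate_dict maps integer row indices of cell_embedding to cluster IDs; neither is confirmed by the traceback. The function name intracluster_cutoffs is hypothetical and not part of cellDancer.

```python
import numpy as np

def intracluster_cutoffs(cell_embedding, cell_fate_dict, quant=0.05):
    """Per-cluster quantile of pairwise distances; None where it is undefined.

    Hypothetical helper, not cellDancer code. Assumes cell_fate_dict maps
    integer row indices of cell_embedding to cluster IDs.
    """
    embedding = np.asarray(cell_embedding, dtype=float)
    cells = np.array(sorted(cell_fate_dict))
    fates = np.array([cell_fate_dict[c] for c in cells])
    cutoffs = {}
    for cluster_id in np.unique(fates):
        points = embedding[cells[fates == cluster_id]]
        # Pairwise Euclidean distances within this cluster (assumed metric).
        diffs = points[:, None, :] - points[None, :, :]
        dists = np.sqrt((diffs ** 2).sum(axis=-1))
        # Drop the self distances, mirroring the line shown in the traceback.
        off_diag = dists[~np.eye(dists.shape[0], dtype=bool)]
        if off_diag.size == 0:
            # A cluster with fewer than two cells leaves nothing to take
            # a quantile of; flag it instead of calling np.quantile.
            cutoffs[int(cluster_id)] = None
        else:
            cutoffs[int(cluster_id)] = float(np.quantile(off_diag, quant))
    return cutoffs
```

Any cluster reported as None would reproduce the empty-array condition seen in the error, which may help to identify the cells that lack overlap.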

If you don't expect a uniform time for all the lineages, you might want to split your cells into groups and run cd.pseudo_time for each group.
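
A minimal sketch of such a split, assuming cellDancer_df carries a cluster label per row in a column named clusters and that you supply your own mapping from cluster labels to groups. The helper split_by_group and the mapping are hypothetical, and whether cd.pseudo_time needs the cell indices renumbered after subsetting is not confirmed here.

```python
import pandas as pd

def split_by_group(cellDancer_df, cluster_to_group, cluster_col="clusters"):
    """Return {group name: sub-DataFrame} for a cluster-to-group mapping.

    Hypothetical helper, not part of cellDancer. Rows whose cluster label
    is missing from cluster_to_group are left out of every group.
    """
    group = cellDancer_df[cluster_col].map(cluster_to_group)
    return {
        name: cellDancer_df[group == name].copy()
        for name in sorted(group.dropna().unique())
    }

# Usage sketch (commented out because it needs cellDancer and real data).
# The arguments are the ones from the original post; the mapping is made up.
#
# groups = split_by_group(cellDancer_df, {"A": "lineage1", "B": "lineage2"})
# results = {
#     name: cd.pseudo_time(cellDancer_df=sub_df, grid=(30, 30), dt=0.05,
#                          t_total=int(10 / 0.05), n_repeats=10,
#                          speed_up=(100, 100), n_paths=3,
#                          psrng_seeds_diffusion=list(range(10)), n_jobs=8)
#     for name, sub_df in groups.items()
# }
```

The pseudotime values obtained this way are computed per group, so they are not expected to be comparable across groups.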

If you expect a uniform time for all the lineages, there are a few ideas to ensure overlap between them.

Let me know how it works out.

Good luck! Pengzhi