ShobiStassen / VIA

trajectory inference
https://pyvia.readthedocs.io/en/latest/
MIT License

The pseudotime value is NAN #19

Closed storytc closed 2 years ago

storytc commented 2 years ago

Hi, thank you so much for your great and helpful tool! I'd like to use pyVIA for trajectory analysis, and my raw data is 955893(obs features). After PCA, I use the top 20 PCs for trajectory analysis. My code is:

    ncomps = 20
    knn = 6
    v0_random_seed = 4
    root_user = ['M1']
    true_label = adata.obs['lesion']
    v0 = via.VIA(adata.obsm['X_pca'][:, 0:ncomps], true_label,
                 jac_std_global=1.5, cluster_graph_pruning_std=0.15,
                 dist_std_local=1, knn=knn, too_big_factor=0.3,
                 root_user=root_user, preserve_disconnected=True,
                 do_impute_bool=True, dataset='toy', is_coarse=True,
                 pseudotime_threshold_TS=20,
                 neighboring_terminal_states_threshold=1,
                 piegraph_edgeweight_scalingfactor=1.0,
                 piegraph_arrow_head_width=0.1,
                 max_visual_outgoing_edges=1)

However, when I set knn=6, I got the error message: image

You may notice the strange blue circle in the figure labeled "node label", and the trajectory score is NaN:

image

The strange thing is that when I change the knn parameter, some values do not raise the error. I have no idea what the problem is. Maybe some row sums of my matrix are zero? I tried to confirm this guess, but that is not the case: image

Any advice is helpful! And your early reply is much appreciated! Sorry for any inconvenience caused!

Best regards, Arial

ShobiStassen commented 2 years ago

Hi @storytc, a few ideas. Did you say that the issue goes away if you increase the knn value? That means it is caused by Via finding two disconnected trajectories. If you don't think there are multiple trajectories, then you should increase your knn or try increasing jac_std_global=1.5 and cluster_graph_pruning_std=0.15 (to, say, 0.5). For Via to handle multiple trajectories, you need to specify either two root groups, e.g. ['M1', 'nodelabel'], or two root indices, each the index of a cell that could be a root cell in one of the two groups. You can also potentially leave the root unspecified and let the selection be random (but remember to set the dataset parameter to ''). You can see a bit more detail in the readthedocs tutorials.
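To illustrate why a small knn can split the data into disconnected pieces, here is a minimal sketch in plain Python. It is not Via's actual graph construction: the helper names `knn_edges` and `count_components` and the toy one-dimensional points are assumptions made for illustration only. It builds a brute-force k-nearest-neighbour graph and counts its connected components for two values of k.

```python
import math

def knn_edges(points, k):
    """Return undirected edges linking each point to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def count_components(n_points, edges):
    """Count connected components with a simple union-find."""
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(n_points)})

# Hypothetical toy data: two well-separated groups of five points each.
points = [(float(x),) for x in (0, 1, 2, 3, 4, 100, 101, 102, 103, 104)]

for k in (2, 5):
    n_comp = count_components(len(points), knn_edges(points, k))
    print(f"k={k}: {n_comp} connected component(s)")
```

With k=2 every neighbour lies inside a point's own group, so the graph has two components; with k=5 each point must reach across to the other group, so the graph becomes connected. On real data, a similar check on the PCA coordinates may help decide whether to raise knn or to supply one root per component.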

storytc commented 2 years ago

Hi @ShobiStassen, I greatly appreciate your prompt reply! Yes, when I increase knn the problem definitely disappears. I chose a small knn because my data contains few observations. Should I increase knn? It was kind of you to suggest that a knn between 20 and 30 works well for general data; maybe my dataset is too small for that general range. I fine-tuned the parameters you mentioned above, and the problem is solved (I changed cluster_graph_pruning_std to 0.5). Besides, I'm not sure whether there are multiple trajectories...😭

Thank you again for your great patience and huge help! :)

Best regards, Arial