realzehuali commented 9 months ago

Thank you for this amazing tool! I was testing Basic tutorial in docs and met this error:

f,ax = plot_edge_bundle(via_object=v0, n_milestones=50, linewidth_bundle=1.5, alpha_bundle_factor=2, cmap='plasma', facecolor='white', size_scatter=15, alpha_scatter=0.2, scale_scatter_size_pop=True, extra_title_text='', headwidth_bundle=0.5, lineage_pathway = [4,7,8], text_labels=False, sc_labels=true_label)
f.set_size_inches(15,4)

2023-09-14 15:20:35.211221 Computing Edges
2023-09-14 15:20:35.211722 WARNING: VIA will now autocompute an embedding. It would be better to precompute an embedding using embedding = via_umap() or via_mds() and setting this as the embedding attribute via_object = embedding.
2023-09-14 15:20:35.211722 Commencing Via-MDS
2023-09-14 15:20:35.211722 Resetting n_milestones to 1000 as n_samples > original n_milestones
2023-09-14 15:20:35.472721 Start computing with diffusion power:1
2023-09-14 15:20:35.490731 Starting MDS on milestone
2023-09-14 15:20:35.949219 End computing mds with diffusion power:1
2023-09-14 15:20:35.951221 Start finding milestones
2023-09-14 15:20:36.107220 End milestones with 50
2023-09-14 15:20:36.109223 Recompute weights
2023-09-14 15:20:36.111220 pruning milestone graph based on recomputed weights
2023-09-14 15:20:36.112220 Graph has 1 connected components before pruning
2023-09-14 15:20:36.112721 Graph has 1 connected components after pruning
2023-09-14 15:20:36.112721 Graph has 1 connected components after reconnecting
2023-09-14 15:20:36.113721 regenerate igraph on pruned edges
2023-09-14 15:20:36.121721 Setting numeric label as time_series_labels or other sequential metadata for coloring edges
2023-09-14 15:20:36.128221 Making smooth edges
location of 4 is at [0] and 0

TypeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_24624\1755887396.py in <module>
      1 f,ax = plot_edge_bundle(via_object=v0, n_milestones=50, linewidth_bundle=1.5, alpha_bundle_factor=2,
      2 cmap='plasma', facecolor='white', size_scatter=15, alpha_scatter=0.2, scale_scatter_size_pop=True,
----> 3 extra_title_text='', headwidth_bundle=0.5, lineage_pathway = [4,7,8], text_labels=False, sc_labels=true_label)
      4 f.set_size_inches(15,4)

~\AppData\Roaming\Python\Python37\site-packages\pyVIA\plotting_via.py in plot_edge_bundle(hammerbundle_dict, via_object, alpha_bundle_factor, linewidth_bundle, facecolor, cmap, extra_title_text, size_scatter, alpha_scatter, headwidth_bundle, headwidth_alpha, arrow_frequency, show_arrow, sc_labels_sequential, sc_labels_expression, initial_bandwidth, decay, n_milestones, scale_scatter_size_pop, show_milestones, sc_labels, text_labels, lineage_pathway, dpi, fontsize_title, fontsize_labels, global_visual_pruning, use_sc_labels_sequential_for_direction, sc_scatter_size, sc_scatter_alpha)
   1136 from matplotlib.patches import Rectangle
   1137 sc_embedding = via_object.embedding
-> 1138 max_r = np.max(via_object.embedding[:, 0]) + 1
   1139 max_l = np.min(via_object.embedding[:, 0]) - 1
   1140

TypeError: 'NoneType' object is not subscriptable

ShobiStassen commented 9 months ago

Hi, can I ask what the size of the input data is?

realzehuali commented 9 months ago

Since I was testing Basic tutorial in docs, I was using adata from datasets_via.toy_multifurcating().

ShobiStassen commented 9 months ago

`from pyVIA.core import *
import pyVIA.datasets_via as datasets_via

import pandas as pd
import umap
import scanpy as sc
import matplotlib.pyplot as plt  # plt.show() is called below

adata_counts = datasets_via.toy_multifurcating()
print(adata_counts)
true_label = adata_counts.obs['group_id'].tolist()
ncomps = 30
sc.tl.pca(adata_counts, svd_solver='arpack', n_comps=ncomps)

# define parameters
ncomps, knn, random_seed, dataset, root_user = 30, 20, 42, 'toy', ['M1']
embedding = umap.UMAP().fit_transform(adata_counts.obsm['X_pca'][:, 0:10])

# do_compute_embedding=True makes run_VIA() compute and set v0.embedding
v0 = VIA(adata_counts.obsm['X_pca'][:, 0:ncomps], true_label, jac_std_global=0.15, dist_std_local=1, knn=knn,
         cluster_graph_pruning_std=1, too_big_factor=0.3, root_user=root_user, preserve_disconnected=True,
         dataset='group', random_seed=random_seed, do_compute_embedding=True,
         embedding_type='via-mds')  #, piegraph_arrow_head_width=0.2, piegraph_edgeweight_scalingfactor=1.0)
v0.run_VIA()

fig, ax, ax2 = draw_piechart_graph(via_object=v0, type_data='pt', title='Toy multifurcation', cmap='viridis', ax_text=True, gene_exp='', alpha_edge=0.5, linewidth_edge=1.5, edge_color='green', headwidth_arrow=0.2)
fig.set_size_inches(10,5)

via_streamplot(v0, embedding)

f, axs = draw_sc_lineage_probability(via_object=v0, via_fine=v0, embedding=embedding)
plt.show()

f, ax = plot_edge_bundle(via_object=v0, n_milestones=50, linewidth_bundle=1.5, alpha_bundle_factor=2, cmap='plasma', facecolor='white', size_scatter=15, alpha_scatter=0.2, scale_scatter_size_pop=True, extra_title_text='', headwidth_bundle=0.5, lineage_pathway = [4,7,8], text_labels=False, sc_labels=true_label)
f.set_size_inches(15,4)
plt.show()`

ShobiStassen commented 9 months ago

Hi again, I just ran the code above and it works without any issues for me. Is there anything different in what you have?

ShobiStassen commented 9 months ago

This is the output on the console as the program runs:

AnnData object with n_obs × n_vars = 1000 × 1000
    obs: 'group_id', 'true_time'
2023-09-14 16:25:52.476157 Running VIA over input data of 1000 (samples) x 30 (features)
2023-09-14 16:25:52.476192 Knngraph has 20 neighbors
2023-09-14 16:25:52.885201 Finished global pruning of 20-knn graph used for clustering at level of 0.15. Kept 46.7 % of edges.
2023-09-14 16:25:52.894815 Number of connected components used for clustergraph is 1
2023-09-14 16:25:52.958612 Commencing community detection
2023-09-14 16:25:52.975584 Finished running Leiden algorithm. Found 43 clusters.
2023-09-14 16:25:52.976926 Merging 30 very small clusters (<10)
2023-09-14 16:25:52.978153 Finished detecting communities. Found 13 communities
2023-09-14 16:25:52.978442 Making cluster graph. Global cluster graph pruning level: 1
2023-09-14 16:25:52.983313 Graph has 1 connected components before pruning
2023-09-14 16:25:52.985384 Graph has 1 connected components after pruning
2023-09-14 16:25:52.985626 Graph has 1 connected components after reconnecting
2023-09-14 16:25:52.986383 0.0% links trimmed from local pruning relative to start
2023-09-14 16:25:52.989378 Run via-mds
2023-09-14 16:25:52.989395 Commencing Via-MDS
2023-09-14 16:25:53.456614 Start computing with diffusion power:5
2023-09-14 16:25:53.564381 Starting MDS on milestone
2023-09-14 16:25:54.536920 End computing mds with diffusion power:5
2023-09-14 16:25:54.538070 Completed via-mds
2023-09-14 16:25:54.968744 Starting make edgebundle viagraph...
2023-09-14 16:25:54.968767 Make via clustergraph edgebundle
2023-09-14 16:25:57.185602 Hammer dims: Nodes shape: (13, 2) Edges shape: (26, 3)
2023-09-14 16:25:57.188250 component number 0 out of [0]
2023-09-14 16:25:57.191702\group root method
2023-09-14 16:25:57.191720or component 0, the root is M1 and ri M1
2023-09-14 16:25:57.196086 New root is 3 and majority M1
2023-09-14 16:25:57.196705 New root is 6 and majority M1
2023-09-14 16:25:57.197792 Computing lazy-teleporting expected hitting times
2023-09-14 16:25:57.765133 Identifying terminal clusters corresponding to unique lineages...
2023-09-14 16:25:57.765171 Closeness:[3, 4, 6, 7, 9, 10]
2023-09-14 16:25:57.765183 Betweenness:[3, 4, 6, 7, 8, 9, 10]
2023-09-14 16:25:57.765189 Out Degree:[3, 4, 7, 8, 9, 10]
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
2023-09-14 16:25:57.765560 Terminal clusters corresponding to unique lineages in this component are [4, 7, 8, 9, 10]
2023-09-14 16:25:58.007396 From root 6, the Terminal state 4 is reached 75 times.
2023-09-14 16:25:58.207230 From root 6, the Terminal state 7 is reached 426 times.
2023-09-14 16:25:58.439499 From root 6, the Terminal state 8 is reached 259 times.
2023-09-14 16:25:58.685563 From root 6, the Terminal state 9 is reached 249 times.
2023-09-14 16:25:58.881924 From root 6, the Terminal state 10 is reached 394 times.
2023-09-14 16:25:58.910010 Terminal clusters corresponding to unique lineages are {4: 'M6', 7: 'M2', 8: 'M5', 9: 'M4', 10: 'M8'}
2023-09-14 16:25:58.910127 Begin projection of pseudotime and lineage likelihood
2023-09-14 16:25:59.082836 Graph has 1 connected components before pruning
2023-09-14 16:25:59.085050 Graph has 1 connected components after pruning
2023-09-14 16:25:59.085301 Graph has 1 connected components after reconnecting
2023-09-14 16:25:59.086103 11.5% links trimmed from local pruning relative to start
2023-09-14 16:25:59.086130 50.0% links trimmed from global pruning relative to start
2023-09-14 16:25:59.092374 Start making edgebundle milestone...
2023-09-14 16:25:59.092409 Start finding milestones
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of n_init will change from 10 to 'auto' in 1.4. Set the value of n_init explicitly to suppress the warning
  warnings.warn(
2023-09-14 16:25:59.517811 End milestones with 150
2023-09-14 16:25:59.517838 Will use via-pseudotime for edges, otherwise consider providing a list of numeric labels (single cell level) or via_object
2023-09-14 16:25:59.521005 Recompute weights
2023-09-14 16:25:59.540356 pruning milestone graph based on recomputed weights
2023-09-14 16:25:59.541648 Graph has 1 connected components before pruning
2023-09-14 16:25:59.542571 Graph has 2 connected components after pruning
2023-09-14 16:25:59.544836 Graph has 1 connected components after reconnecting
2023-09-14 16:25:59.545619 60.4% links trimmed from global pruning relative to start
2023-09-14 16:25:59.545646 regenerate igraph on pruned edges
2023-09-14 16:25:59.550557 Setting numeric label as single cell pseudotime for coloring edges
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/scipy/sparse/_index.py:103: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_intXint(row, col, x.flat[0])
2023-09-14 16:25:59.562728 Making smooth edges
No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
2023-09-14 16:26:05.317348 Time elapsed 12.7 seconds
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/pyVIA/plotting_via.py:2854: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
  sct = ax.scatter(node_pos[:, 0], node_pos[:, 1],
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/scipy/sparse/_index.py:146: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
  self._set_arrayXarray(i, j, x)
2023-09-14 16:26:09.007264 Marker_lineages: [4, 7, 8, 9, 10]
2023-09-14 16:26:09.007866 The number of components in the original full graph is 1
2023-09-14 16:26:09.007882 For downstream visualization purposes we are also constructing a low knn-graph
2023-09-14 16:26:09.489687 Check sc pb 1.0000000000000002
2023-09-14 16:26:09.521729 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 4: [6, 3, 1, 11, 2, 4]
2023-09-14 16:26:09.521772 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 7: [6, 3, 1, 12, 0, 5, 7]
2023-09-14 16:26:09.521788 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 8: [6, 3, 1, 11, 2, 8]
2023-09-14 16:26:09.521802 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 9: [6, 3, 1, 11, 2, 8, 9]
2023-09-14 16:26:09.521819 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 10: [6, 3, 1, 12, 0, 5, 10]
2023-09-14 16:26:09.778333 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 4 along path: [6, 6, 6, 3, 11, 2, 4, 4, 4]
2023-09-14 16:26:09.813373 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 7 along path: [6, 6, 3, 12, 0, 5, 7, 7]
2023-09-14 16:26:09.850750 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 8 along path: [6, 6, 6, 3, 11, 2, 8, 8, 8]
2023-09-14 16:26:09.888845 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 9 along path: [6, 6, 6, 3, 11, 2, 8, 9, 9, 9, 9, 9, 9]
2023-09-14 16:26:09.914286 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 10 along path: [6, 6, 3, 12, 0, 10, 10, 10]
location of 4 is at [0] and 0
location of 7 is at [1] and 1
location of 8 is at [2] and 2

realzehuali commented 9 months ago

Oh, I finally see it. It's this part: do_compute_embedding=True, embedding_type='via-mds'. The notebook 'Basic tutorial.ipynb' on GitHub does not include this part. So after running run_VIA, I can set v0.embedding = embedding to make this work. But I don't know whether this is correct.

realzehuali commented 9 months ago

It is such a pleasure to meet you online in real time. I do have 2 other questions from testing this package.

1. When I was running tsi_list = via.get_loc_terminal_states(v0, X_in) in the imaging cytometry ipynb, I met this:

AttributeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_25724\3418165575.py in <module>
----> 1 tsi_list = via.get_loc_terminal_states(v0, X_in)
      2
      3 v1 = via.VIA(X_in, true_label, jac_std_global=jac_std_global, dist_std_local=1, knn=knn,
      4 too_big_factor=0.05, super_cluster_labels=v0.labels, super_node_degree_list=v0.node_degree_list,
      5 super_terminal_cells=tsi_list, root_user=root_user, is_coarse=False,

AttributeError: module 'pyVIA.core' has no attribute 'get_loc_terminal_states'

2. Another question is how to save the result of v0 after running v0.run_VIA() for a long time?

Thank you so much for any help!

ShobiStassen commented 9 months ago

hi, nice to meet you too!!

1. We are generally finding after more usage that you don't have to run Via twice (first coarse, then fine mode). That means you can skip the second iteration of Via and not worry about passing on the tsi_list with via.get_loc_terminal_states(). If you really want smaller clusters, just set the resolution parameter to >1 (e.g. 2) and see if you like the results more. Or try running with a lower knn value, or make jac_std_global a bit smaller. jac_std_global controls the granularity of clustering: a range of 0-1 is reasonable, and values closer to 0 will result in more and smaller clusters.
2. Saving Via: this is something we need to develop. For now you can save things like the pseudotime and cluster labels by extracting them from the via_object attributes. We don't yet have an easy way of saving the underlying graph, which is probably the most time-intensive part.
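As a rough illustration of the "extract attributes and save them" approach, here is a minimal sketch that writes per-cell results to a CSV file. The helper names save_per_cell_results and load_per_cell_results are hypothetical, and the attribute name single_cell_pt_markov for the pseudotime is an assumption (only labels appears in this thread), so check the attribute names on your own via_object. A SimpleNamespace stands in for the finished run so the sketch runs without pyVIA.

```python
import csv
import os
import tempfile
from types import SimpleNamespace


def save_per_cell_results(via_object, path, attrs=("labels", "single_cell_pt_markov")):
    """Write one CSV row per cell containing the requested per-cell attributes.

    `attrs` are assumed attribute names; adjust them to what your object exposes.
    """
    columns = [list(getattr(via_object, name)) for name in attrs]
    n_cells = len(columns[0])
    if any(len(col) != n_cells for col in columns):
        raise ValueError("all attributes must have one value per cell")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["cell_index", *attrs])
        for i, row in enumerate(zip(*columns)):
            writer.writerow([i, *row])
    return n_cells


def load_per_cell_results(path):
    """Read the CSV back as a list of dicts (all values come back as strings)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Stand-in for a finished run; replace with your real v0 after v0.run_VIA().
v0 = SimpleNamespace(labels=[0, 0, 1], single_cell_pt_markov=[0.0, 0.4, 1.2])
out_path = os.path.join(tempfile.gettempdir(), "via_results.csv")
save_per_cell_results(v0, out_path)
print(load_per_cell_results(out_path)[2])
```

This only preserves per-cell outputs; as noted above, the underlying graph is not saved, so anything that needs it still requires rerunning Via.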

ShobiStassen commented 9 months ago

Before calling plot_edge_bundle(), you need to either pass an argument embedding = "some array with the embedding" or set it on the via_object first as v0.embedding = embedding (whatever embedding you like). When do_compute_embedding=True, via_object.embedding is auto-computed and set during run_VIA().
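To make the failure mode concrete, here is a small sketch that reproduces the same TypeError with a stand-in object and shows a guard you could run before plotting. StandInVia, ensure_embedding and x_range are hypothetical names written for this illustration and are not part of pyVIA; the only thing taken from the thread is that the plotting code indexes via_object.embedding, which fails when it is None.

```python
class StandInVia:
    """Hypothetical stand-in for a VIA object; only `embedding` matters here."""

    def __init__(self, embedding=None):
        self.embedding = embedding


def ensure_embedding(via_object, embedding=None):
    """Attach `embedding` if the object has none; fail early with a clear message."""
    if via_object.embedding is None:
        if embedding is None:
            raise ValueError(
                "via_object.embedding is None: pass an embedding or "
                "rerun with do_compute_embedding=True"
            )
        via_object.embedding = embedding
    return via_object


def x_range(via_object, pad=1):
    """Padded min/max of the first embedding column, like the lines in the traceback."""
    xs = [row[0] for row in via_object.embedding]
    return min(xs) - pad, max(xs) + pad


v0 = StandInVia()  # embedding never set, as in the original report
try:
    v0.embedding[0]  # indexing None fails the same way embedding[:, 0] did
except TypeError as err:
    print(err)

# Any 2D embedding works; a tiny list of (x, y) pairs is used here.
ensure_embedding(v0, [(0.5, 1.0), (2.0, -1.0), (-1.5, 0.0)])
print(x_range(v0))
```

With a real run, the equivalent step is simply v0.embedding = embedding after run_VIA(), using whichever embedding you computed (for example the UMAP array from the script above).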

realzehuali commented 9 months ago

Thank you so much! That helps me a lot! I know my issues are basic. Thank you very much for being so patient with me.

ShobiStassen commented 9 months ago

I'm very happy to be able to help! Are you using Via for flow cytometry data? I'd love to know if the visualizations and pathways work out on your data. It's always nice to hear feedback, so don't hesitate to reach out.

realzehuali commented 9 months ago

Yes, for now I'm using VIA with scRNAseq and it worked after solving the embedding problem above. I also have CyTOF data and MALDI Imaging mass spectrometry data. I think VIA will work for CyTOF. I'm not sure whether it will work on MALDI data or other forms of metabonomics data or spatial data like spatial transcriptomics.

realzehuali commented 9 months ago

Also, does it make sense to use VIA on bulk RNAseq data with hundreds or thousands of samples? If it does, can VIA deal with the batch effect?

ShobiStassen commented 9 months ago

Via does not explicitly deal with batch effects. Normally you could run something like Harmony on the PCs before providing them to Via. However, if you are dealing with time series data or something where the "batches" carry useful information that indicates sequence/order/chronology, then you may want to keep that information.

ShobiStassen commented 9 months ago

Generally, if you have some thousands of samples, then you could try Via. If the sample size is very small (<<1000), then it depends on the data.

ShobiStassen / VIA

TypeError: 'NoneType' object is not subscriptable #46
