Closed realzehuali closed 9 months ago
Hi, can I ask what the size of the input data is?
On Thu, 14 Sept 2023, 15:25 realzehuali, @.***> wrote:
Thank you for this amazing tool! I was testing Basic tutorial in docs and met this error:
f,ax = plot_edge_bundle(via_object=v0, n_milestones=50, linewidth_bundle=1.5, alpha_bundle_factor=2, cmap='plasma', facecolor='white', size_scatter=15, alpha_scatter=0.2, scale_scatter_size_pop=True, extra_title_text='', headwidth_bundle=0.5, lineage_pathway = [4,7,8], text_labels=False, sc_labels=true_label)
f.set_size_inches(15,4)
2023-09-14 15:20:35.211221 Computing Edges
2023-09-14 15:20:35.211722 WARNING: VIA will now autocompute an embedding. It would be better to precompute an embedding using embedding = via_umap() or via_mds() and setting this as the embedding attribute via_object = embedding.
2023-09-14 15:20:35.211722 Commencing Via-MDS
2023-09-14 15:20:35.211722 Resetting n_milestones to 1000 as n_samples > original n_milestones
2023-09-14 15:20:35.472721 Start computing with diffusion power:1
2023-09-14 15:20:35.490731 Starting MDS on milestone
2023-09-14 15:20:35.949219 End computing mds with diffusion power:1
2023-09-14 15:20:35.951221 Start finding milestones
2023-09-14 15:20:36.107220 End milestones with 50
2023-09-14 15:20:36.109223 Recompute weights
2023-09-14 15:20:36.111220 pruning milestone graph based on recomputed weights
2023-09-14 15:20:36.112220 Graph has 1 connected components before pruning
2023-09-14 15:20:36.112721 Graph has 1 connected components after pruning
2023-09-14 15:20:36.112721 Graph has 1 connected components after reconnecting
2023-09-14 15:20:36.113721 regenerate igraph on pruned edges
2023-09-14 15:20:36.121721 Setting numeric label as time_series_labels or other sequential metadata for coloring edges
2023-09-14 15:20:36.128221 Making smooth edges
location of 4 is at [0] and 0
TypeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_24624\1755887396.py in
1 f,ax = plot_edge_bundle(via_object=v0, n_milestones=50, linewidth_bundle=1.5, alpha_bundle_factor=2,
2 cmap='plasma', facecolor='white', size_scatter=15, alpha_scatter=0.2, scale_scatter_size_pop=True,
----> 3 extra_title_text='', headwidth_bundle=0.5, lineage_pathway = [4,7,8], text_labels=False, sc_labels=true_label)
4 f.set_size_inches(15,4)
~\AppData\Roaming\Python\Python37\site-packages\pyVIA\plotting_via.py in plot_edge_bundle(hammerbundle_dict, via_object, alpha_bundle_factor, linewidth_bundle, facecolor, cmap, extra_title_text, size_scatter, alpha_scatter, headwidth_bundle, headwidth_alpha, arrow_frequency, show_arrow, sc_labels_sequential, sc_labels_expression, initial_bandwidth, decay, n_milestones, scale_scatter_size_pop, show_milestones, sc_labels, text_labels, lineage_pathway, dpi, fontsize_title, fontsize_labels, global_visual_pruning, use_sc_labels_sequential_for_direction, sc_scatter_size, sc_scatter_alpha)
1136 from matplotlib.patches import Rectangle
1137 sc_embedding = via_object.embedding
-> 1138 max_r = np.max(via_object.embedding[:, 0]) + 1
1139 max_l = np.min(via_object.embedding[:, 0]) - 1
1140
TypeError: 'NoneType' object is not subscriptable
Since I was testing Basic tutorial in docs, I was using adata from datasets_via.toy_multifurcating().
`from pyVIA.core import *
import pyVIA.datasets_via as datasets_via
import pandas as pd
import umap
import scanpy as sc

adata_counts = datasets_via.toy_multifurcating()
print(adata_counts)
true_label = adata_counts.obs['group_id'].tolist()
ncomps = 30
sc.tl.pca(adata_counts, svd_solver='arpack', n_comps=ncomps)

ncomps, knn, random_seed, dataset, root_user = 30, 20, 42, 'toy', ['M1']
embedding = umap.UMAP().fit_transform(adata_counts.obsm['X_pca'][:, 0:10])

v0 = VIA(adata_counts.obsm['X_pca'][:, 0:ncomps], true_label, jac_std_global=0.15, dist_std_local=1, knn=knn, cluster_graph_pruning_std=1, too_big_factor=0.3, root_user=root_user, preserve_disconnected=True, dataset='group', random_seed=random_seed, do_compute_embedding=True, embedding_type='via-mds')  #, piegraph_arrow_head_width=0.2, piegraph_edgeweight_scalingfactor=1.0)
v0.run_VIA()

fig, ax, ax2 = draw_piechart_graph(via_object=v0, type_data='pt', title='Toy multifurcation', cmap='viridis', ax_text=True, gene_exp='', alpha_edge=0.5, linewidth_edge=1.5, edge_color='green', headwidth_arrow=0.2)
fig.set_size_inches(10,5)

via_streamplot(v0, embedding)

f, axs = draw_sc_lineage_probability(via_object=v0, via_fine=v0, embedding=embedding)
plt.show()

f,ax = plot_edge_bundle(via_object=v0, n_milestones=50, linewidth_bundle=1.5, alpha_bundle_factor=2, cmap='plasma', facecolor='white', size_scatter=15, alpha_scatter=0.2, scale_scatter_size_pop=True, extra_title_text='', headwidth_bundle=0.5, lineage_pathway = [4,7,8], text_labels=False, sc_labels=true_label)
f.set_size_inches(15,4)
plt.show()`
hi again, i just ran the code above and it works without any issues for me. is there anything different from what you have ?
This is the output on the console as the program runs:
AnnData object with n_obs × n_vars = 1000 × 1000
obs: 'group_id', 'true_time'
2023-09-14 16:25:52.476157 Running VIA over input data of 1000 (samples) x 30 (features)
2023-09-14 16:25:52.476192 Knngraph has 20 neighbors
2023-09-14 16:25:52.885201 Finished global pruning of 20-knn graph used for clustering at level of 0.15. Kept 46.7 % of edges.
2023-09-14 16:25:52.894815 Number of connected components used for clustergraph is 1
2023-09-14 16:25:52.958612 Commencing community detection
2023-09-14 16:25:52.975584 Finished running Leiden algorithm. Found 43 clusters.
2023-09-14 16:25:52.976926 Merging 30 very small clusters (<10)
2023-09-14 16:25:52.978153 Finished detecting communities. Found 13 communities
2023-09-14 16:25:52.978442 Making cluster graph. Global cluster graph pruning level: 1
2023-09-14 16:25:52.983313 Graph has 1 connected components before pruning
2023-09-14 16:25:52.985384 Graph has 1 connected components after pruning
2023-09-14 16:25:52.985626 Graph has 1 connected components after reconnecting
2023-09-14 16:25:52.986383 0.0% links trimmed from local pruning relative to start
2023-09-14 16:25:52.989378 Run via-mds
2023-09-14 16:25:52.989395 Commencing Via-MDS
2023-09-14 16:25:53.456614 Start computing with diffusion power:5
2023-09-14 16:25:53.564381 Starting MDS on milestone
2023-09-14 16:25:54.536920 End computing mds with diffusion power:5
2023-09-14 16:25:54.538070 Completed via-mds
2023-09-14 16:25:54.968744 Starting make edgebundle viagraph...
2023-09-14 16:25:54.968767 Make via clustergraph edgebundle
2023-09-14 16:25:57.185602 Hammer dims: Nodes shape: (13, 2) Edges shape: (26, 3)
2023-09-14 16:25:57.188250 component number 0 out of [0]
2023-09-14 16:25:57.191702 group root method
2023-09-14 16:25:57.191720 for component 0, the root is M1 and ri M1
2023-09-14 16:25:57.196086 New root is 3 and majority M1
2023-09-14 16:25:57.196705 New root is 6 and majority M1
2023-09-14 16:25:57.197792 Computing lazy-teleporting expected hitting times
2023-09-14 16:25:57.765133 Identifying terminal clusters corresponding to unique lineages...
2023-09-14 16:25:57.765171 Closeness:[3, 4, 6, 7, 9, 10]
2023-09-14 16:25:57.765183 Betweenness:[3, 4, 6, 7, 8, 9, 10]
2023-09-14 16:25:57.765189 Out Degree:[3, 4, 7, 8, 9, 10]
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
remove the [0:2] just using to speed up testing
2023-09-14 16:25:57.765560 Terminal clusters corresponding to unique lineages in this component are [4, 7, 8, 9, 10]
2023-09-14 16:25:58.007396 From root 6, the Terminal state 4 is reached 75 times.
2023-09-14 16:25:58.207230 From root 6, the Terminal state 7 is reached 426 times.
2023-09-14 16:25:58.439499 From root 6, the Terminal state 8 is reached 259 times.
2023-09-14 16:25:58.685563 From root 6, the Terminal state 9 is reached 249 times.
2023-09-14 16:25:58.881924 From root 6, the Terminal state 10 is reached 394 times.
2023-09-14 16:25:58.910010 Terminal clusters corresponding to unique lineages are {4: 'M6', 7: 'M2', 8: 'M5', 9: 'M4', 10: 'M8'}
2023-09-14 16:25:58.910127 Begin projection of pseudotime and lineage likelihood
2023-09-14 16:25:59.082836 Graph has 1 connected components before pruning
2023-09-14 16:25:59.085050 Graph has 1 connected components after pruning
2023-09-14 16:25:59.085301 Graph has 1 connected components after reconnecting
2023-09-14 16:25:59.086103 11.5% links trimmed from local pruning relative to start
2023-09-14 16:25:59.086130 50.0% links trimmed from global pruning relative to start
2023-09-14 16:25:59.092374 Start making edgebundle milestone...
2023-09-14 16:25:59.092409 Start finding milestones
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of n_init will change from 10 to 'auto' in 1.4. Set the value of n_init explicitly to suppress the warning
warnings.warn(
2023-09-14 16:25:59.517811 End milestones with 150
2023-09-14 16:25:59.517838 Will use via-pseudotime for edges, otherwise consider providing a list of numeric labels (single cell level) or via_object
2023-09-14 16:25:59.521005 Recompute weights
2023-09-14 16:25:59.540356 pruning milestone graph based on recomputed weights
2023-09-14 16:25:59.541648 Graph has 1 connected components before pruning
2023-09-14 16:25:59.542571 Graph has 2 connected components after pruning
2023-09-14 16:25:59.544836 Graph has 1 connected components after reconnecting
2023-09-14 16:25:59.545619 60.4% links trimmed from global pruning relative to start
2023-09-14 16:25:59.545646 regenerate igraph on pruned edges
2023-09-14 16:25:59.550557 Setting numeric label as single cell pseudotime for coloring edges
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/scipy/sparse/_index.py:103: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
self._set_intXint(row, col, x.flat[0])
2023-09-14 16:25:59.562728 Making smooth edges
No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
2023-09-14 16:26:05.317348 Time elapsed 12.7 seconds
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/pyVIA/plotting_via.py:2854: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
sct = ax.scatter(node_pos[:, 0], node_pos[:, 1],
/home/user/anaconda3/envs/Via2Env_py10/lib/python3.10/site-packages/scipy/sparse/_index.py:146: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
self._set_arrayXarray(i, j, x)
2023-09-14 16:26:09.007264 Marker_lineages: [4, 7, 8, 9, 10]
2023-09-14 16:26:09.007866 The number of components in the original full graph is 1
2023-09-14 16:26:09.007882 For downstream visualization purposes we are also constructing a low knn-graph
2023-09-14 16:26:09.489687 Check sc pb 1.0000000000000002
2023-09-14 16:26:09.521729 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 4: [6, 3, 1, 11, 2, 4]
2023-09-14 16:26:09.521772 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 7: [6, 3, 1, 12, 0, 5, 7]
2023-09-14 16:26:09.521788 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 8: [6, 3, 1, 11, 2, 8]
2023-09-14 16:26:09.521802 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 9: [6, 3, 1, 11, 2, 8, 9]
2023-09-14 16:26:09.521819 Cluster path on clustergraph starting from Root Cluster 6 to Terminal Cluster 10: [6, 3, 1, 12, 0, 5, 10]
2023-09-14 16:26:09.778333 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 4 along path: [6, 6, 6, 3, 11, 2, 4, 4, 4]
2023-09-14 16:26:09.813373 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 7 along path: [6, 6, 3, 12, 0, 5, 7, 7]
2023-09-14 16:26:09.850750 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 8 along path: [6, 6, 6, 3, 11, 2, 8, 8, 8]
2023-09-14 16:26:09.888845 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 9 along path: [6, 6, 6, 3, 11, 2, 8, 9, 9, 9, 9, 9, 9]
2023-09-14 16:26:09.914286 Revised Cluster level path on sc-knnGraph from Root Cluster 6 to Terminal Cluster 10 along path: [6, 6, 3, 12, 0, 10, 10, 10]
location of 4 is at [0] and 0
location of 7 is at [1] and 1
location of 8 is at [2] and 2
Oh, finally I know. It's this part: do_compute_embedding=True, embedding_type='via-mds'. The ipynb 'Basic tutorial.ipynb' in github does not include this part. So after running run_VIA, I can set v0.embedding = embedding to make this work. But I don't know whether this is correct.
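For anyone landing here with the same traceback, the failure and the workaround described above can be sketched without pyVIA installed. This is only an illustration: `v0_stub` stands in for the VIA object, `umap_stub` stands in for the precomputed UMAP array, and `require_embedding` is a hypothetical helper, not part of pyVIA.

```python
from types import SimpleNamespace


def require_embedding(via_object, fallback=None):
    """Return a usable embedding, or fail with a clearer message.

    Hypothetical helper (not part of pyVIA). If the object's embedding is
    None and a fallback is supplied, the fallback is stored on the object,
    mirroring the `v0.embedding = embedding` workaround from the thread.
    """
    embedding = getattr(via_object, "embedding", None)
    if embedding is None:
        if fallback is None:
            raise ValueError(
                "via_object.embedding is None; set it before plotting"
            )
        via_object.embedding = fallback
        embedding = fallback
    return embedding


# Stand-in for a VIA object created without do_compute_embedding=True.
v0_stub = SimpleNamespace(embedding=None)

# Indexing None the way the plotting code does raises the reported TypeError.
try:
    v0_stub.embedding[:, 0]
except TypeError as err:
    error_message = str(err)

# Stand-in for the UMAP coordinates computed earlier in the script.
umap_stub = [[0.0, 1.0], [2.0, 3.0]]
embedding = require_embedding(v0_stub, fallback=umap_stub)
```

With the real objects, the equivalent step is the one already mentioned: assign `v0.embedding = embedding` after `v0.run_VIA()`, or pass `do_compute_embedding=True, embedding_type='via-mds'` when constructing `VIA`, as in the script that ran without errors.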
It is such a pleasure to meet you online in real time. I have two other questions from testing this package.
1. When I ran tsi_list = via.get_loc_terminal_states(v0, X_in) in the imaging cytometry ipynb, I got this:
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_25724\3418165575.py in
AttributeError: module 'pyVIA.core' has no attribute 'get_loc_terminal_states'
2. How can I save the result of v0 after v0.run_VIA() has run for a long time?
Thank you so much for any help!
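On the question of saving a long-running result: a general Python option is the standard-library `pickle` module. The sketch below assumes the fitted object is picklable, which this thread does not confirm for VIA objects. `FittedModelStub`, `save_object`, and `load_object` are hypothetical names used only for illustration.

```python
import os
import pickle
import tempfile


class FittedModelStub:
    """Stand-in for a fitted object such as v0 (hypothetical, not pyVIA)."""

    def __init__(self, labels, pseudotime):
        self.labels = labels
        self.pseudotime = pseudotime


def save_object(obj, path):
    """Serialize obj to path with pickle."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Load a pickled object back from path (only for files you trust)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Round trip through a temporary file.
fitted = FittedModelStub(labels=["M1", "M2"], pseudotime=[0.0, 0.7])
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fitted.pkl")
    save_object(fitted, path)
    restored = load_object(path)
```

If pickling the real object fails, saving the individual outputs you need (for example arrays of labels or pseudotime) is a more conservative fallback.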
hi, nice to meet you too!!
jac_std_global a bit smaller: Controls granularity of clustering. range 0-1 is reasonable. values closer to 0 will result in more and smaller clusters
Thank you so much! That helps me a lot! I know my issues are basic. Thank you very much for being so patient with me.
i'm very happy to be able to help! are you using Via for flow cytometry data? i'd love to know if the visualizations and pathways work out on your data. always nice to hear feedback. dont hesitate to reach out
Yes, for now I'm using VIA with scRNAseq and it worked after solving the embedding problem above. I also have CyTOF data and MALDI Imaging mass spectrometry data. I think VIA will work for CyTOF. I'm not sure whether it will work on MALDI data or other forms of metabonomics data or spatial data like spatial transcriptomics.
Also, does it make sense to use VIA on bulk RNAseq data with hundreds or thousands of samples? If it does, can VIA deal with the batch effect?
Via does not explicitly deal with batch effects. normally you could run something like Harmony on the PCs before providing them to Via. However, if you are dealing with time series data or something where the "batches" have useful information that indicates sequence/order/chronology, then you may want to keep that information.
generally if you have some thousands of samples, then you could try Via. if the sample size is very small (<<1000), then it depends on the data