ShobiStassen / VIA

trajectory inference
https://pyvia.readthedocs.io/en/latest/
MIT License

timeseries/timelabel aided analysis #64

Closed TimmFH closed 5 months ago

TimmFH commented 6 months ago

Hey, first of all, thank you for an interesting package!

I am aiming to perform a time-label-aided analysis building on: https://pyvia.readthedocs.io/en/latest/notebooks/mESC_timeseries.html

Could you provide or point me towards the csv file used in this example, so I can follow the data structure and confirm everything runs well?

Additionally, is there an example workflow for a time-label-based trajectory computation on scRNA-seq data?

Thank you for your time, and all the best, Timm

TimmFH commented 6 months ago

I realised it works fine to just plug in adata_counts.obsm['X_pca'][:, 0:ncomps] as in your Basic explanations.

In my dataset, cells are activated and then assume a position similar to the starting state. Is it possible to give more weight to the sequential time label, so that the transition back to a later state resembling the starting state is captured and pseudotime increases continuously?
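One way to quantify whether pseudotime "increases continuously" with the time label is to compare the median pseudotime of each time point. The sketch below is only a diagnostic on toy numbers (made up for illustration, not taken from this dataset); the helper names are hypothetical and it assumes you already have one pseudotime value per cell from the fitted VIA object.

```python
from statistics import median


def median_pseudotime_by_label(time_labels, pseudotime):
    """Group pseudotime values by time label and return {label: median}."""
    groups = {}
    for label, pt in zip(time_labels, pseudotime):
        groups.setdefault(label, []).append(pt)
    # Sort by label so the result is in chronological order.
    return {label: median(values) for label, values in sorted(groups.items())}


def is_non_decreasing(values):
    """Return True if each value is >= the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Toy data (illustrative only): cells at label 4.0 fall back towards the root.
time_labels = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
pseudotime = [0.00, 0.05, 0.30, 0.35, 0.60, 0.65, 0.90, 0.95, 0.10, 0.15]

medians = median_pseudotime_by_label(time_labels, pseudotime)
print(medians)
print(is_non_decreasing(list(medians.values())))
```

If the check fails, as it does for the toy values here, the time point whose median drops is the one being placed next to the starting state.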

Additionally, I was wondering whether there is a discrepancy between the two plots: the pseudotime shown in the connectivity plot seems to capture this transition, according to its pseudotime legend, whereas the final pseudotime plot shows a different pseudotime assignment.

Thank you for any advice on adjusting parameters!

These were my basic settings:

import scanpy as sc
from pyVIA.core import VIA

adata_counts = sc.read("file/scexp.h5ad")
# per-cell time labels; set(true_labels_numeric) returns {0.0, 1.0, 2.0, 3.0, 4.0}
true_labels_numeric = adata_counts.obs['condition'].tolist()
ncomps = 30
sc.tl.pca(adata_counts, svd_solver='arpack', n_comps=ncomps)

knn = 40
cluster_graph_pruning = 0.15
edgepruning_clustering_resolution = 0.3
random_seed = 0
knn_sequential = 20

# the root corresponds to a group-level initial state, so we set dataset='group'
root = [0.0]
v0 = VIA(adata_counts.obsm['X_pca'][:, 0:ncomps], true_label=true_labels_numeric,
         edgepruning_clustering_resolution=edgepruning_clustering_resolution,
         edgepruning_clustering_resolution_local=1, knn=knn,
         cluster_graph_pruning=cluster_graph_pruning, piegraph_arrow_head_width=0.6,
         too_big_factor=0.3, resolution_parameter=1,
         root_user=root, dataset='group', random_seed=random_seed,
         is_coarse=True, preserve_disconnected=False, pseudotime_threshold_TS=40, x_lazy=0.99,
         alpha_teleport=0.99, time_series=True, time_series_labels=true_labels_numeric, t_diff_step=2,
         edgebundle_pruning_twice=False, knn_sequential=knn_sequential,
         knn_sequential_reverse=knn_sequential)

TimmFH commented 6 months ago

[Image: output1]

[Image: output]

ShobiStassen commented 6 months ago

Dear Timm,

Thanks for trying out StaVia (Via 2.0). There is a link on the GitHub page to the datasets, https://github.com/ShobiStassen/VIA?tab=readme-ov-file#datasets, which should include the mESC data file.

I'm currently traveling with very limited internet access until the middle of next week, but hopefully this has the file you need. For scRNA-seq, you can use the StaVia tutorials for mouse and zebrafish gastrulation on the readthedocs page. Let me know if you have any more questions.

Shobi

ShobiStassen commented 6 months ago

Hi Timm,

Since your via_object has time_series values, the plot_atlas_view() function defaults to plotting these unless another value is provided in sc_labels_expression. From the docstring:

:param sc_labels_expression: list of single-cell numeric values (len n_single_cell samples) used for coloring edges and nodes with the corresponding milestones' mean expression levels. Edges can be colored by time-series numeric (gene expression) or string (cell type) labels, pseudotime, or gene expression. If not specified, then time-series is chosen if available; otherwise it falls back to pseudotime. To use gene expression, sc_labels_expression is provided as a list.
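The fallback order described in that docstring can be written out as a few lines of plain Python. This is only a sketch of the selection logic as worded above, not the library's code; the function name choose_coloring and its arguments are hypothetical.

```python
def choose_coloring(time_series_labels=None, pseudotime=None, sc_labels_expression=None):
    """Pick the values used for coloring, following the fallback order
    described in the docstring: explicit values first, then time-series
    labels, then pseudotime."""
    if sc_labels_expression is not None:
        return "sc_labels_expression", list(sc_labels_expression)
    if time_series_labels is not None:
        return "time_series", list(time_series_labels)
    if pseudotime is not None:
        return "pseudotime", list(pseudotime)
    raise ValueError("no values available for coloring")


# Toy per-cell values, made up for illustration.
labels = [0.0, 1.0, 2.0]
pt = [0.1, 0.5, 0.9]

# With time-series labels present, they win over pseudotime by default.
print(choose_coloring(time_series_labels=labels, pseudotime=pt)[0])

# Passing pseudotime explicitly overrides the time-series default.
print(choose_coloring(time_series_labels=labels, pseudotime=pt,
                      sc_labels_expression=pt)[0])
```

Read this way, the two plots can disagree simply because one is colored by the time labels and the other by pseudotime.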

It seems like the 0.0 time state is throwing off the analysis, whereas 1.0-4.0 seem more or less chronological. I'm not sure I follow your question on emphasizing sequential transitions. You can, for instance, increase knn_sequential and knn_sequential_reverse. I would not recommend lowering t_diff_step below its current value of 2, but you could try relaxing it in case you are missing some desired transitions.
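To build some intuition for these two knobs, here is a toy, pure-Python illustration of adding neighbors from later time points. It is not VIA's implementation: the function sequential_knn is hypothetical, and treating k as loosely analogous to knn_sequential and max_step as loosely analogous to t_diff_step is an assumption made for the sketch.

```python
import math


def sequential_knn(points, time_labels, k, max_step=1):
    """For each cell, return up to k nearest cells whose time label is
    later by at most max_step distinct time points."""
    # Rank the distinct time labels so steps are counted in time points.
    rank = {t: i for i, t in enumerate(sorted(set(time_labels)))}
    neighbors = {}
    for i, (p, t) in enumerate(zip(points, time_labels)):
        candidates = [
            (math.dist(p, q), j)
            for j, (q, u) in enumerate(zip(points, time_labels))
            if 0 < rank[u] - rank[t] <= max_step
        ]
        neighbors[i] = [j for _, j in sorted(candidates)[:k]]
    return neighbors


# Toy 2D embedding (made up): cells at time 2 sit close to the time-0 cells.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (0.05, 0.1), (0.2, 0.1)]
time_labels = [0, 0, 1, 1, 2, 2]

print(sequential_knn(points, time_labels, k=1, max_step=1))
print(sequential_knn(points, time_labels, k=1, max_step=2))
```

In this toy setting, a step of 1 forces time-0 cells to connect through time 1, while a step of 2 lets them link straight to the look-alike time-2 cells, which is the kind of shortcut a strict step limit is meant to suppress.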

I would recommend plotting the atlas view with fewer edges; you can see some examples in this tutorial under "The Atlas View".