Closed kpandey008 closed 2 years ago
Hi, let me have a look in a couple of days, but it seems there is an issue with the root index (I'm guessing this is your own data and not the sample data, right?). My first guess is that the index given for the second root is not aligned with the index numbering it has within the second component. Can you try just setting the root to [1,1]? Then we would know whether that is where the problem stems from.

In example 1b for the disconnected toy data, each root is assigned to a particular component. That means formatting true_label the same way the labels are formatted for the disconnected toy data, and then setting the data parameter to "Toy4". But before we go down that path (I need to take a closer look myself), let's see what happens when you arbitrarily set the root to [1,1].
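To make the alignment concern concrete, here is a minimal sketch (not pyVIA code) of how a cell's position in the full dataset can differ from its position within its own component. The `component_of` list and the `global_to_local` helper are hypothetical names used only for illustration.

```python
def global_to_local(component_of, global_index):
    """Return (component, position of the cell within that component)."""
    comp = component_of[global_index]
    # Count how many earlier cells belong to the same component.
    local = sum(1 for c in component_of[:global_index] if c == comp)
    return comp, local


# Hypothetical assignment of six cells to two components.
component_of = [0, 0, 1, 0, 1, 1]

print(global_to_local(component_of, 4))  # (1, 1)
```

Cell 4 in the full dataset is only cell 1 inside its own component, so a root index that is valid globally can point at the wrong cell (or be out of range) if it is interpreted per component.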
Hi, I think the issue is now fixed, and you can provide the indices of the cells corresponding to the roots of each component. You can clone the latest version from GitHub or pip install pyVIA (version 0.1.19). If you want, you can also test it using the code in test_pyVIA.py by modifying the function:
```python
# Imports inferred from the names used below (pd, sc, via)
import pandas as pd
import scanpy as sc
import pyVIA.core as via


def run_generic_discon(foldername="/home/shobi/Trajectory/Datasets/Toy4/"):
    df_counts = pd.read_csv(foldername + "toy_disconnected_M9_n1000d1000.csv", delimiter=",")
    df_ids = pd.read_csv(foldername + "toy_disconnected_M9_n1000d1000_ids_with_truetime.csv", delimiter=",")
    # Drop the first character of each cell_id and parse the rest as an integer for sorting
    df_ids['cell_id_num'] = [int(s[1:]) for s in df_ids['cell_id']]
    df_counts = df_counts.drop(columns='Unnamed: 0')
    df_ids = df_ids.sort_values(by=['cell_id_num'])
    df_ids = df_ids.reset_index(drop=True)
    true_label = df_ids['group_id']
    true_label = ['a' for _ in true_label]  # dummy labels for testing; overwrites the real true_label
    true_time = df_ids['true_time']
    adata_counts = sc.AnnData(df_counts, obs=df_ids)
    sc.tl.pca(adata_counts, svd_solver='arpack', n_comps=100)
    # root holds one cell index per disconnected component
    via.via_wrapper_disconnected(adata_counts, true_label, embedding=adata_counts.obsm['X_pca'][:, 0:2], root=[23, 902],
                                 preserve_disconnected=True, knn=10, ncomps=30, cluster_graph_pruning_std=1, random_seed=41)
```
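If you are unsure which cell indices to pass as roots, one possible approach is sketched below. It assumes you already know which component each cell belongs to and have some time-like value per cell; `pick_roots`, `component_of`, and the toy values are hypothetical and not part of pyVIA. The result is a list of global cell indices, one per component, in the same shape as the `root=[23, 902]` argument.

```python
def pick_roots(component_of, pseudotime):
    """Return one global cell index per component: the cell with the smallest time."""
    best = {}
    for idx, (comp, t) in enumerate(zip(component_of, pseudotime)):
        # Keep the earliest cell seen so far for each component.
        if comp not in best or t < pseudotime[best[comp]]:
            best[comp] = idx
    return [best[comp] for comp in sorted(best)]


# Hypothetical toy values: six cells in two components.
component_of = [0, 0, 1, 0, 1, 1]
true_time = [0.4, 0.1, 0.9, 0.7, 0.2, 0.5]

roots = pick_roots(component_of, true_time)
print(roots)  # [1, 4]
```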
@ShobiStassen The issue is resolved with the new release. Thank you for looking into it! Closing this issue
Hi,
I am trying to run VIA on a generic disconnected dataset using the following code as outlined in the README file:
However I am getting the following stacktrace:
FWIW, the same code works fine for multifurcating datasets, so I'm not sure if I'm missing something. Any help would be appreciated!