BiomedicalMachineLearning / stLearn

A novel machine learning pipeline to analyse spatial transcriptomics data

How to add my own LR pair? #259

Closed. Knight1995 closed this issue 8 months ago

Knight1995 commented 8 months ago

Great job! I am trying to use my own collected LR pairs, but I encounter the following error. How can I solve it? The format of ids_arr is the same as in the tutorial. Thanks.

InvalidIndexError                         Traceback (most recent call last)
Cell In[42], line 38
     35 print(len(lrs))
     37 # Running the analysis #
---> 38 st.tl.cci.run(data, ids_arr,
     39               min_spots = 20, #Filter out any LR pairs with no scores for less than min_spots
     40               distance=0, # None defaults to spot+immediate neighbours; distance=0 for within-spot mode
     41               n_pairs=10000, # Number of random pairs to generate; low as example, recommend ~10,000
     42               n_cpus=30, # Number of CPUs for parallel. If None, detects & use all available.
     43               )

File ~/miniconda3/envs/stlearn/lib/python3.8/site-packages/stlearn/tools/microenv/cci/analysis.py:337, in run(adata, lrs, min_spots, distance, n_pairs, n_cpus, use_label, adj_method, pval_adj_cutoff, min_expr, save_bg, neg_binom, verbose)
    333     return
    335 """ Permutation methods generating background per spot, & test lrs in spot.
    336 """
--> 337 perform_spot_testing(
    338     adata,
    339     lr_scores,
    340     lrs,
    341     n_pairs,
    342     neighbours,
    343     het_vals,
    344     min_expr,
    345     adj_method,
    346     pval_adj_cutoff,
    347     verbose,
    348     save_bg=save_bg,
    349     neg_binom=neg_binom,
    350 )

File ~/miniconda3/envs/stlearn/lib/python3.8/site-packages/stlearn/tools/microenv/cci/permutation.py:59, in perform_spot_testing(adata, lr_scores, lrs, n_pairs, neighbours, het_vals, min_expr, adj_method, pval_adj_cutoff, verbose, save_bg, neg_binom, quantiles)
     57 ####### Quantiles to select similar gene to LRs to gen. rand-pairs #######
     58 lr_expr = adata[:, lr_genes].to_df()
---> 59 lr_feats = get_lr_features(adata, lr_expr, lrs, quantiles)
     60 l_quants = lr_feats.loc[
     61     lrs, [col for col in lr_feats.columns if "L_" in col]
     62 ].values
     63 r_quants = lr_feats.loc[
     64     lrs, [col for col in lr_feats.columns if "R_" in col]
     65 ].values

File ~/miniconda3/envs/stlearn/lib/python3.8/site-packages/stlearn/tools/microenv/cci/perm_utils.py:325, in get_lr_features(adata, lr_expr, lrs, quantiles)
    321 lr_cols = [f"L_{quant}" for quant in quantiles] + [
    322     f"R_{quant}" for quant in quantiles
    323 ]
    324 quant_df = pd.DataFrame(lr_quants, columns=lr_cols, index=lrs)
--> 325 lr_features = pd.concat((lr_features, quant_df), axis=1)
    326 adata.uns["lrfeatures"] = lr_features
    328 return lr_features

File ~/miniconda3/envs/stlearn/lib/python3.8/site-packages/pandas/core/reshape/concat.py:385, in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    370     copy = False
    372 op = _Concatenator(
    373     objs,
    374     axis=axis,
   (...)
    382     sort=sort,
    383 )
--> 385 return op.get_result()

File ~/miniconda3/envs/stlearn/lib/python3.8/site-packages/pandas/core/reshape/concat.py:612, in _Concatenator.get_result(self)
    610 obj_labels = obj.axes[1 - ax]
    611 if not new_labels.equals(obj_labels):
--> 612     indexers[ax] = obj_labels.get_indexer(new_labels)
    614 mgrs_indexers.append((obj._mgr, indexers))
    616 new_data = concatenate_managers(
    617     mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy
    618 )

File ~/miniconda3/envs/stlearn/lib/python3.8/site-packages/pandas/core/indexes/base.py:3731, in Index.get_indexer(self, target, method, limit, tolerance)
   3728 self._check_indexing_method(method, limit, tolerance)
   3730 if not self._index_as_unique:
-> 3731     raise InvalidIndexError(self._requires_unique_msg)
   3733 if len(target) == 0:
   3734     return np.array([], dtype=np.intp)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

duypham2108 commented 8 months ago

Please double-check that the format, including the index and header names, matches https://github.com/BiomedicalMachineLearning/stLearn/blob/master/stlearn/tools/microenv/cci/databases/connectomeDB2020_lit.txt. If it still doesn't work, can you send me the head of your CSV file (the first 5 lines)? I can check it.

Knight1995 commented 8 months ago

Thanks for your reply. My code and input file are as follows:

    import pandas as pd

    df = pd.read_csv('data.csv')
    ids_arr = df['x']
    ids_arr = ids_arr.astype(object)

[image: data.csv]

duypham2108 commented 8 months ago

Can you try ids_arr = ids_arr.values? You could also try removing any NaN or infinite values from your array.
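
For readers hitting the same error, here is a minimal sketch of the cleaning step suggested above. The traceback suggests the LR labels were not unique when used as an index, which missing or repeated entries would cause. The helpers clean_lr_pairs and duplicated_labels are hypothetical names for illustration and are not part of stLearn, and the sample labels are made-up example values. The sketch uses only the standard library.

```python
import math
from collections import Counter


def clean_lr_pairs(raw_pairs):
    """Return LR labels with missing values, blanks, and repeats removed.

    Hypothetical helper for illustration; it is not part of stLearn.
    """
    cleaned = []
    seen = set()
    for pair in raw_pairs:
        # Skip missing values: None, or float NaN/inf such as empty CSV cells.
        if pair is None or (isinstance(pair, float) and not math.isfinite(pair)):
            continue
        label = str(pair).strip()
        # Skip blank strings and labels that were already kept.
        if not label or label in seen:
            continue
        seen.add(label)
        cleaned.append(label)
    return cleaned


def duplicated_labels(raw_pairs):
    """List string labels that occur more than once (after stripping whitespace)."""
    counts = Counter(p.strip() for p in raw_pairs if isinstance(p, str) and p.strip())
    return [label for label, n in counts.items() if n > 1]


# Made-up example input containing a NaN, a repeat, a blank, and a None.
raw = ["A2M_LRP1", "CXCL12_CXCR4", float("nan"), "A2M_LRP1", "  ", None, "TGFB1_TGFBR2 "]
print(duplicated_labels(raw))
print(clean_lr_pairs(raw))
```

The cleaned list could then be wrapped with numpy.array(...) before being passed as the lrs argument, in line with the ids_arr.values suggestion above. Whether duplicates or NaN values were the actual cause in this thread is not confirmed by the source.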

Knight1995 commented 8 months ago

Thanks for your reply! It works now. I will close this issue.