Closed yao50985098 closed 2 months ago
I solved this by deleting the column names in 'tissue_positions_list.csv'. But now I'm facing another error: when running 'binning_and_plotting.bin_data()', it seems that 'pseudo_counts_mat' has an index that does not correspond to the 'idx_kept' generated by 'idx_kept, gene_labels_idx=filter_genes.filter_genes(counts_mat, gene_labels, umi_threshold=umi_thresh, exclude_prefix=['MT-', 'RPL', 'RPS'])'.
I tried different samples but got a similar error each time. Do you have any advice? Thank you!
Hi! Thanks for your interest in GASTON. Sorry for my late reply.
For your first point, which column did you have to remove? I probably need to update the code to use a standard Visium function for reading 10x output.
For the second point, I updated the code to use an NxG counts matrix instead of a GxN matrix. However, I don't think I updated the tutorial. Could you try using counts_mat_restrict.T instead in the binning_and_plotting function, i.e. use the transpose? Please let me know if this works.
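To make the NxG versus GxN point concrete, here is a minimal sketch of an orientation check that could be run before binning. The helper name orient_counts and the toy shapes are hypothetical and are not part of GASTON; it only assumes the counts are a NumPy array and that the number of spots and genes is known.

```python
import numpy as np

def orient_counts(counts, n_spots, n_genes):
    """Return counts with shape (n_spots, n_genes), i.e. NxG.

    Hypothetical helper, not a GASTON function. If the matrix arrives
    as GxN it is transposed. When n_spots == n_genes the orientation
    cannot be inferred from the shape, so the input is returned as-is.
    """
    if counts.shape == (n_spots, n_genes):
        return counts
    if counts.shape == (n_genes, n_spots):
        return counts.T
    raise ValueError(
        f"shape {counts.shape} matches neither "
        f"({n_spots}, {n_genes}) nor ({n_genes}, {n_spots})"
    )

# Toy example: 3 genes measured at 5 spots, stored as GxN.
gxn = np.arange(15).reshape(3, 5)
nxg = orient_counts(gxn, n_spots=5, n_genes=3)
print(nxg.shape)
```

A shape check like this fails loudly when neither orientation fits, which is easier to debug than an index mismatch further down the pipeline.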
Yes, 'counts_mat_restrict.T' works, thanks! In my output, there's an extra header line in 'tissue_positions_list.csv'. Thanks again for your kind reply.
I just fixed the bug in binning_and_plotting, so now you should not have to use counts_mat_restrict.T. Also, I updated the code to use SquidPy to read the 10x output, so hopefully your tissue_positions_list.csv bug is fixed.
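For anyone still reading the positions file directly with pandas, the extra header line described above can be tolerated at read time. The following is only a sketch: the function name read_tissue_positions is hypothetical, the column names are copied from the pd.read_csv call shown in the traceback in this thread, and it assumes that a header row, if present, has the literal text "barcode" in its first column.

```python
import io
import pandas as pd

# Column names as used in the parse_adata.py call quoted in this thread.
COLS = ["barcode", "in_tissue", "array_row", "array_col",
        "pxl_row_in_fullres", "pxl_col_in_fullres"]

def read_tissue_positions(path_or_buffer):
    """Read a positions CSV with or without a header row (hypothetical helper)."""
    # Read everything as strings so a header row cannot change column dtypes.
    df = pd.read_csv(path_or_buffer, header=None, names=COLS, dtype=str)
    # Assumption: a header row has the literal text "barcode" in column 0.
    if len(df) > 0 and df.iloc[0, 0] == "barcode":
        df = df.iloc[1:].reset_index(drop=True)
    # Convert the numeric columns now that only data rows remain.
    for col in COLS[1:]:
        df[col] = pd.to_numeric(df[col])
    return df

# Toy data: the same two spots, once with and once without a header line.
body = "AAAC-1,1,0,0,100,200\nAAAG-1,0,1,1,110,210\n"
with_header = ",".join(COLS) + "\n" + body
df_a = read_tissue_positions(io.StringIO(with_header))
df_b = read_tissue_positions(io.StringIO(body))
print(df_a.equals(df_b))
```

Reading as strings first matters because a stray header row would otherwise turn in_tissue into a text column, and a comparison such as df_pos.in_tissue == True would then match no rows.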
Hi, thank you for the nice tool! I'm trying to run GASTON on my own Visium data with standard Space Ranger output (filtered_feature_bc_matrix.h5 and spatial/tissue_positions_list.csv are provided), but I got this error:
gaston-package/lib/python3.11/site-packages/anndata/_core/anndata.py:1840: UserWarning: Variable names are not unique. To make them unique, call .var_names_make_unique.
  utils.warn_names_duplicates("var")
gaston-package/lib/python3.11/site-packages/anndata/_core/anndata.py:1113: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
  if not is_categorical_dtype(df_full[k]):
gaston-package/lib/python3.11/site-packages/anndata/_core/anndata.py:1840: UserWarning: Variable names are not unique. To make them unique, call .var_names_make_unique.
  utils.warn_names_duplicates("var")

AssertionError                            Traceback (most recent call last)
Cell In[9], line 4
      2 #use_RGB=True # set to False if you do not want to use RGB as features
      3 use_RGB=False
----> 4 counts_mat, coords_mat, gene_labels, rgb_mean=parse_adata.get_gaston_input_adata(data_folder, get_rgb=use_RGB, spot_umi_threshold=50)
      6 # save matrices
      7 np.save('colorectal_tumor_data/counts_mat.npy', counts_mat)

File ~/GASTON-main/src/gaston/parse_adata.py:11, in get_gaston_input_adata(data_folder, get_rgb, spot_umi_threshold)
      9 df_pos = pd.read_csv(f'{data_folder}/spatial/tissue_positions_list.csv', sep=",", header=None, names=["barcode", "in_tissue", "array_row", "array_col", "pxl_row_in_fullres", "pxl_col_in_fullres"])
     10 df_pos = df_pos[df_pos.in_tissue == True]
---> 11 assert set(list(df_pos.barcode)) == set(list(adata.obs.index))
     12 df_pos.barcode = pd.Categorical(df_pos.barcode, categories=list(adata.obs.index), ordered=True)
     13 df_pos.sort_values(by="barcode", inplace=True)

AssertionError:
Do you have any advice? Thank you!
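Because the assertion on line 11 of the traceback prints no message, it can help to look at which barcodes differ between the two sets. Below is a small diagnostic sketch. The function name compare_barcodes and the toy barcodes are hypothetical; in practice the two arguments would be df_pos.barcode and adata.obs.index from the traceback. A stray header row in the CSV is only one possible cause of a mismatch.

```python
def compare_barcodes(position_barcodes, matrix_barcodes):
    """Return (only_in_positions, only_in_matrix) as sorted lists.

    Hypothetical diagnostic helper: it reports the barcodes that make
    the two sets unequal instead of failing with a bare AssertionError.
    """
    pos = set(position_barcodes)
    mat = set(matrix_barcodes)
    return sorted(pos - mat), sorted(mat - pos)

# Toy example: the positions list contains a header string and is
# missing one barcode that is present in the count matrix.
only_pos, only_mat = compare_barcodes(
    ["barcode", "AAAC-1"],
    ["AAAC-1", "AAAG-1"],
)
print("only in positions file:", only_pos)
print("only in count matrix:", only_mat)
```

If the first list contains a value such as "barcode", the file has a header line. If the positions side is empty altogether, the in_tissue filter on line 10 of the traceback may have removed every row.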