Closed: TengyuZz closed this 5 months ago
Hi @TengyuZz, thank you very much for trying out CellCharter!
What I believe is happening is that the features in X_cellcharter are in sparse format.
For now, the fitting of the Gaussian Mixture Model only supports the dense format, but I will try to fix it as soon as I can.
However, the fact that you have sparse features makes me suspect that you haven't done dimensionality reduction. I have never tested CellCharter on the full features without dimensionality reduction, and I don't recommend it.
Can you tell me how many features you have in the anndata, for example by running adata.obsm['X_cellcharter'].shape?
If you have a small number of features, you can just run adata.obsm['X_cellcharter'] = adata.obsm['X_cellcharter'].toarray(), but be aware that if you have a lot of features, the dataset may occupy a lot of memory.
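A minimal sketch of the conversion suggested above, with a memory check before densifying. The helper names `dense_size_bytes` and `to_dense_if_affordable`, the 2 GiB default limit, and the 1,000 x 40 stand-in matrix are assumptions made for illustration; they are not part of CellCharter.

```python
import numpy as np
from scipy import sparse


def dense_size_bytes(matrix):
    """Bytes the matrix would occupy once stored as a dense array."""
    n_rows, n_cols = matrix.shape
    return n_rows * n_cols * np.dtype(matrix.dtype).itemsize


def to_dense_if_affordable(matrix, max_bytes=2 * 1024**3):
    """Return a dense ndarray, refusing if it would exceed max_bytes.

    Hypothetical helper: the 2 GiB default is an arbitrary illustration.
    """
    if not sparse.issparse(matrix):
        return np.asarray(matrix)  # already dense, nothing to convert
    needed = dense_size_bytes(matrix)
    if needed > max_bytes:
        raise MemoryError(
            f"dense copy needs {needed} bytes, limit is {max_bytes}"
        )
    return matrix.toarray()


# Stand-in for adata.obsm['X_cellcharter']: 1,000 cells x 40 features.
features = sparse.csr_matrix(np.ones((1000, 40), dtype=np.float32))

print(features.shape, dense_size_bytes(features))
dense = to_dense_if_affordable(features)
print(type(dense).__name__, dense.shape)
```

With a real dataset, the result would be assigned back to adata.obsm['X_cellcharter'] before fitting.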
Hi @marcovarrone, thanks for your kind reply. I followed the Nanostring CosMx tutorial here: https://cellcharter.readthedocs.io/en/latest/notebooks/cosmx_human_nsclc.html
I ran the dimensionality reduction steps as below:
When I run adata.obsm['X_cellcharter'].shape, here is the output:
My CosMx data uses the 960-panel; could that have an influence? Many thanks for helping me figure this problem out. I am really interested in CellCharter for its powerful functionality!
Is it possible that you forgot adata.obsm['X_scVI'] = model.get_latent_representation(adata).astype(np.float32)?
Remember also to run cc.gr.aggregate_neighbors(adata, n_layers=3, use_rep='X_scVI', out_key='X_cellcharter') before fitting the GMM.
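The two reminders above can be turned into a quick check before fitting. This is only a sketch: `find_gmm_input_problems` is a hypothetical helper, not a CellCharter function, and a plain dict with made-up shapes stands in for adata.obsm (which behaves like a mapping).

```python
import numpy as np
from scipy import sparse


def find_gmm_input_problems(obsm, latent_key="X_scVI", out_key="X_cellcharter"):
    """Return a list of readable problems; an empty list means none found.

    Hypothetical helper for illustration, not part of CellCharter.
    """
    problems = []
    if latent_key not in obsm:
        problems.append(
            f"missing obsm['{latent_key}']: store the scVI latent representation first"
        )
    if out_key not in obsm:
        problems.append(
            f"missing obsm['{out_key}']: run cc.gr.aggregate_neighbors first"
        )
    elif sparse.issparse(obsm[out_key]):
        problems.append(
            f"obsm['{out_key}'] is sparse: the GMM fit expects a dense array"
        )
    return problems


# Plain dicts standing in for adata.obsm; all shapes are illustrative only.
incomplete = {
    "X_cellcharter": sparse.csr_matrix(np.zeros((100, 960), dtype=np.float32))
}
complete = {
    "X_scVI": np.zeros((100, 10), dtype=np.float32),
    "X_cellcharter": np.zeros((100, 40), dtype=np.float32),
}

for problem in find_gmm_input_problems(incomplete):
    print(problem)
print(find_gmm_input_problems(complete))
```

On a real object, the call would be find_gmm_input_problems(adata.obsm), run right before autok.fit.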
One other thing: unless you have noticed some strong batch effects between FOVs, I would actually run scVI without the batch_key parameter. Of course, if you see a lot of bias between fields of view, keep that parameter!
Thanks @marcovarrone, I have resolved the problem and it is now running successfully. I also really appreciate your help and your suggestion about the scVI batch parameter setting.
Another small question about ClusterAutoK.stability: my result looks weird, as shown. Is that trend expected? Many thanks.
@TengyuZz It can happen, but it depends a lot on the data. For example, it happened to me when I had only one cancer sample and so the biggest difference would be between the tumor niche and the rest. You can take a look at the peak at 7. I know it's much lower but 0.75+ is actually not bad, it's just that the stability at k=2 is very very high.
Let me know if you got nice results :)
Hi @marcovarrone, thanks for your suggestion. I ran the analysis with k=7 and the results are very good; at the least, they match the immunofluorescence very well. We will discuss the biological observations in more depth. CellCharter is really a good strategy for spatial clustering. Many thanks for your patience and useful suggestions! :)
Report
Hi, when I run the tutorial for the CosMx pipeline with my own CosMx data, it stops at the ClusterAutoK.fit step, when I run the code: autok.fit(adata, use_rep='X_cellcharter')
The error message is always: TypeError: sparse array length is ambiguous; use getnnz() or shape[0].
How can I figure it out? Many thanks!
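For reference, this message comes from SciPy rather than from CellCharter itself: calling len() on a SciPy sparse matrix raises a TypeError. The sketch below reproduces that behavior on a small stand-in matrix. Where exactly len() gets called during fitting is not shown here, and the exact wording of the message differs between SciPy versions.

```python
import numpy as np
from scipy import sparse

# Small stand-in for a sparse feature matrix (5 cells x 5 features).
features = sparse.csr_matrix(np.eye(5, dtype=np.float32))

try:
    len(features)  # SciPy refuses: the length of a sparse matrix is ambiguous
    message = None
except TypeError as err:
    message = str(err)

print(message)

# Both alternatives are unambiguous.
n_cells = features.shape[0]
dense_len = len(features.toarray())
print(n_cells, dense_len)
```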
Version information
No response