Lotfollahi-lab / nichecompass

End-to-end analysis of spatial multi-omics data
https://nichecompass.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Can I run NicheCompass with zebrafish Visium data? #64

Closed hkeremcil closed 6 months ago

hkeremcil commented 7 months ago

In the paper, I remember it saying that NicheCompass can be applied to spatial transcriptomics data at both single-cell and spot resolution, but I don't think it was tested with Visium data. Is it possible to use it with Visium data? I have both raw, unannotated Visium data and Visium data that has been deconvolved with cell2location. Should I use the annotated, deconvolved data, or is there another requirement for using spot-resolution data? Even after deconvolution, the structure looks significantly different: the rows of adata.obs are spots rather than single cells, and the spots only have cell type abundances in adata.obsm, which of course does not say which cell belongs to which cell type.

Lastly, the provided data only includes human_mouse_gene_orthologs.csv, mouse networks, and mouse metabolite enzymes & sensors. Do you have a suggestion about where to find the zebrafish versions?

Many thanks in advance!

sebastianbirk commented 7 months ago

Hi @hkeremcil,

Yes, it is also possible to use it with Visium data. We have tested it with Visium data, and in the manuscript we use a multimodal dataset that also has spot-level resolution. Just be aware that in the downstream analyses (e.g. when you create the cell-cell communication networks), you have to interpret them as networks of spots rather than networks of cells.

For the mapping of orthologs, you can use Ensembl and follow this documentation.
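If it helps, here is a rough sketch of how a zebrafish-mouse ortholog table could be pulled from Ensembl BioMart using the pybiomart package (this is not part of NicheCompass; the dataset and attribute names are based on the BioMart web interface and the output column names are only illustrative, so please verify them before use):

from pybiomart import Dataset

# Query the Ensembl BioMart zebrafish gene dataset for mouse orthologs.
# Dataset and attribute names are assumptions; check them against the
# BioMart web interface before relying on the output.
dataset = Dataset(name="drerio_gene_ensembl", host="http://www.ensembl.org")
orthologs = dataset.query(attributes=[
    "ensembl_gene_id",                         # zebrafish Ensembl gene ID
    "external_gene_name",                      # zebrafish gene symbol
    "mmusculus_homolog_ensembl_gene",          # mouse Ensembl gene ID
    "mmusculus_homolog_associated_gene_name",  # mouse gene symbol
])

# Rename to illustrative column names, drop genes without a mouse ortholog,
# and save in a layout analogous to human_mouse_gene_orthologs.csv.
orthologs.columns = ["zebrafish_gene_id", "zebrafish_gene_name",
                     "mouse_gene_id", "mouse_gene_name"]
orthologs = orthologs.dropna(subset=["mouse_gene_name"])
orthologs.to_csv("zebrafish_mouse_gene_orthologs.csv", index=False)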

hkeremcil commented 6 months ago

Hello @sebastianbirk,

Thank you very much for your response. This is insightful, and it is good to know that it will work with Visium data. However, an obvious problem emerges when we use spot-resolution data: the rows in adata.obs do not have cell types because they are spots, not cells, yet the code requires a 'Main_molecular_cell_type' column in adata.obs, specifically in this part:

# Build a color palette for the categories in the cat_key column (used for plotting)
cell_type_colors = create_new_color_dict(
    adata=adata,
    cat_key=cell_type_key)

It gives an error if that column is missing. Spots cannot have cell types, as far as I know, since they contain multiple cells. How did you solve this issue when you used that multimodal dataset with spot-level resolution? Do I need to change the source code or are there different functions for data with spot-level resolution?

Since I applied the cell2location pipeline, I have cell type abundance values for each spot. I could use the most abundant cell type in each spot to fill the "Main_molecular_cell_type" column, but that discards most of the cell types, especially the immune cells we are interested in, and leaves only a few dominant types. So I am assuming there must be a different solution.
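For reference, this is roughly what I mean by taking the most abundant cell type per spot (a minimal sketch; the adata.obsm key and the way the cell type names are recovered are assumptions based on my cell2location export and may differ):

import pandas as pd

# Assumption: the exported cell2location posterior abundances are stored in
# adata.obsm under the key below; adjust it to your own export.
abund = adata.obsm["q05_cell_abundance_w_sf"]
if not isinstance(abund, pd.DataFrame):
    abund = pd.DataFrame(abund, index=adata.obs_names)

# If the columns carry a cell2location prefix, strip it so that only the
# cell type names remain (the prefix pattern is also an assumption).
abund.columns = [str(c).split("w_sf_")[-1] for c in abund.columns]

# Majority label per spot: the cell type with the highest abundance.
adata.obs["Main_molecular_cell_type"] = abund.idxmax(axis=1).astype("category")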

Thanks so much for your attention and support!

sebastianbirk commented 6 months ago

This code is not part of the model training workflow, so you could just skip it; to train NicheCompass, no labels are required. If you want to visualize the data (which is what this code snippet is for), then you would have to create some labels at the spot level. This could be the majority cell type or a more fine-grained label that you derive from the spot's cell type abundances. Hope that helps.
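For illustration, one possible way to build such a finer-grained label from the deconvolution output (an untested sketch; the obsm key, the 20% cutoff, and the new obs column name are arbitrary placeholders, so adjust them to your data):

import pandas as pd

# Assumption: a spots x cell types abundance table from cell2location,
# stored in adata.obsm under the key below.
abund = adata.obsm["q05_cell_abundance_w_sf"]
if not isinstance(abund, pd.DataFrame):
    abund = pd.DataFrame(abund, index=adata.obs_names)

# Convert abundances to per-spot proportions.
props = abund.div(abund.sum(axis=1), axis=0)

def spot_label(row, threshold=0.2):
    # Keep every cell type that makes up at least `threshold` of the spot,
    # ordered by abundance; fall back to the top type if none pass.
    kept = row[row >= threshold].sort_values(ascending=False)
    return " / ".join(str(c) for c in kept.index) if len(kept) else str(row.idxmax())

# A composite spot-level label, e.g. something like "Hepatocyte / Macrophage".
adata.obs["spot_composition_label"] = props.apply(spot_label, axis=1).astype("category")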

hkeremcil commented 6 months ago

Thank you very much @sebastianbirk, I hadn't realized that it wasn't required for the training. That solves the problem! I do have another problem but I will close this issue and ask about it in a new one.

I genuinely appreciate the support and guidance!