Lotfollahi-lab / nichecompass

End-to-end analysis of spatial multi-omics data
https://nichecompass.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

MERFISH sample without GPs #66

Closed Dana162001 closed 5 months ago

Dana162001 commented 6 months ago

Hi, thank you for creating such an interesting tool; I think it will be really helpful for spatial data analysis. I have two quick questions:

1) Where can I find omnipath_lr_network.csv? (I couldn't find it in the repo.)
2) I want to run NicheCompass on a MERFISH sample without GPs. Following the instruction "If you do not want to mask gene expression reconstruction, you can create a mask of 1s that allows all gene program latent nodes to reconstruct all genes", I created a mask of 1s, but it led to the error:

RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

Have you experienced something similar, if yes how did you solve this problem? Thank you in advance for your answer!

sebastianbirk commented 6 months ago

Hi @Dana162001,

Regarding your first question: extract_gp_dict_from_omnipath_lr_interactions() has a parameter called load_from_disk. If you set it to False, the file will be downloaded.

For your second question: if you make sure that the mask of 1s has the same shape as the GP mask, it should work. You can create the default GP mask and compare it. Let me know if that does not work.
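For reference, the call could look roughly like this (a sketch only; the import path and the species argument follow the tutorial notebooks and may differ in your installed version):

from nichecompass.utils import extract_gp_dict_from_omnipath_lr_interactions

# load_from_disk=False lets NicheCompass download the OmniPath ligand-receptor
# network itself instead of looking for a local omnipath_lr_network.csv.
lr_gp_dict = extract_gp_dict_from_omnipath_lr_interactions(
    species="mouse",       # assumption: adjust to your organism, as in the tutorial
    load_from_disk=False,
)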

Dana162001 commented 6 months ago

Hi @sebastianbirk, thank you for your quick reply! I tried both following your tutorial exactly (https://nichecompass.readthedocs.io/en/latest/tutorials/notebooks/mouse_cns_single_sample.html) and creating masks of 1s with numpy (masks1s = np.ones((adata.n_vars, 1))). But unfortunately, I keep getting this error while trying to initialize the model:

RuntimeError                              Traceback (most recent call last)
Cell In[65], line 2
      1 # Initialize model
----> 2 model = NicheCompass(adata,
      3                      counts_key=counts_key,
      4                      adj_key=adj_key,
      5                      gp_names_key=gp_names_key,
      6                      active_gp_names_key=active_gp_names_key,
      7                      gp_targets_mask_key=gp_targets_mask_key,
      8                      gp_targets_categories_mask_key=gp_targets_categories_mask_key,
      9                      gp_sources_mask_key=gp_sources_mask_key,
     10                      gp_sources_categories_mask_key=gp_sources_categories_mask_key,
     11                      latent_key=latent_key,
     12                      conv_layer_encoder=conv_layer_encoder,
     13                      active_gp_thresh_ratio=active_gp_thresh_ratio,
     14                      )

File ~/anaconda3/envs/nichecompass/lib/python3.9/site-packages/nichecompass/models/nichecompass.py:320, in NicheCompass.__init__(self, adata, adata_atac, counts_key, adj_key, gp_names_key, active_gp_names_key, gp_targets_mask_key, gp_targets_categories_mask_key, targets_categories_label_encoder_key, gp_sources_mask_key, gp_sources_categories_mask_key, sources_categories_label_encoder_key, ca_targets_mask_key, ca_sources_mask_key, latent_key, cat_covariates_embeds_keys, cat_covariates_embeds_injection, cat_covariates_keys, cat_covariates_no_edges, genes_idx_key, target_genes_idx_key, source_genes_idx_key, peaks_idx_key, target_peaks_idx_key, source_peaks_idx_key, gene_peaks_mask_key, recon_adj_key, agg_weights_key, include_edge_recon_loss, include_gene_expr_recon_loss, include_chrom_access_recon_loss, include_cat_covariates_contrastive_loss, gene_expr_recon_dist, log_variational, node_label_method, active_gp_thresh_ratio, active_gp_type, n_fc_layers_encoder, n_layers_encoder, n_hidden_encoder, conv_layer_encoder, encoder_n_attention_heads, encoder_use_bn, dropout_rate_encoder, dropout_rate_graph_decoder, cat_covariates_cats, n_addon_gp, cat_covariates_embeds_nums, include_edge_kl_loss, **kwargs)
    310 raise ValueError("Please specify an adequate "
    311                  "´gp_sources_mask_key´ for your adata object. "
    312                  "The sources mask needs to be stored in "
    (...)
    316                  " that allows all gene program latent nodes to"
    317                  " reconstruct all genes.")
...
    322                  torch.tensor(self.adata.X.sum(0))[0]))
    324 # Retrieve chromatin accessibility masks
    325 if adata_atac is None:

RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

sebastianbirk commented 6 months ago

You can create the masks of 1s as follows:

adata.varm["nichecompass_gp_targets"] = np.ones((adata.n_vars, 200))
adata.varm["nichecompass_gp_sources"] = np.ones((adata.n_vars, 200))

where 200 is the number of gene programs, i.e. the dimension of your latent space if you set n_addon_gp to 0.

Let me know whether that resolves your issue :)

Dana162001 commented 6 months ago

I did as you suggested, but the error is still the same. Sorry, I am probably missing something obvious, but I cannot see what exactly :(. I also tried two different MERFISH datasets and got the same error for both. Could it be an issue with MERFISH data?

RuntimeError                              Traceback (most recent call last)
Cell In[103], line 2
      1 # Initialize model
----> 2 model = NicheCompass(adata,
      3                      counts_key=counts_key,
      4                      adj_key=adj_key,
      5                      gp_names_key=gp_names_key,
      6                      active_gp_names_key=active_gp_names_key,
      7                      gp_targets_mask_key=gp_targets_mask_key,
      8                      gp_targets_categories_mask_key=gp_targets_categories_mask_key,
      9                      gp_sources_mask_key=gp_sources_mask_key,
     10                      gp_sources_categories_mask_key=gp_sources_categories_mask_key,
     11                      latent_key=latent_key,
     12                      conv_layer_encoder=conv_layer_encoder,
     13                      active_gp_thresh_ratio=active_gp_thresh_ratio,
     14                      n_addon_gp=0
     15                      )

File ~/anaconda3/envs/nichecompass/lib/python3.9/site-packages/nichecompass/models/nichecompass.py:320, in NicheCompass.__init__(self, adata, adata_atac, counts_key, adj_key, gp_names_key, active_gp_names_key, gp_targets_mask_key, gp_targets_categories_mask_key, targets_categories_label_encoder_key, gp_sources_mask_key, gp_sources_categories_mask_key, sources_categories_label_encoder_key, ca_targets_mask_key, ca_sources_mask_key, latent_key, cat_covariates_embeds_keys, cat_covariates_embeds_injection, cat_covariates_keys, cat_covariates_no_edges, genes_idx_key, target_genes_idx_key, source_genes_idx_key, peaks_idx_key, target_peaks_idx_key, source_peaks_idx_key, gene_peaks_mask_key, recon_adj_key, agg_weights_key, include_edge_recon_loss, include_gene_expr_recon_loss, include_chrom_access_recon_loss, include_cat_covariates_contrastive_loss, gene_expr_recon_dist, log_variational, node_label_method, active_gp_thresh_ratio, active_gp_type, n_fc_layers_encoder, n_layers_encoder, n_hidden_encoder, conv_layer_encoder, encoder_n_attention_heads, encoder_use_bn, dropout_rate_encoder, dropout_rate_graph_decoder, cat_covariates_cats, n_addon_gp, cat_covariates_embeds_nums, include_edge_kl_loss, **kwargs)
    310 raise ValueError("Please specify an adequate "
    311                  "´gp_sources_mask_key´ for your adata object. "
    312                  "The sources mask needs to be stored in "
    (...)
    316                  " that allows all gene program latent nodes to"
...
    322                  torch.tensor(self.adata.X.sum(0))[0]))
    324 # Retrieve chromatin accessibility masks
    325 if adata_atac is None:

RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated

sebastianbirk commented 6 months ago

It can't be an issue with MERFISH data in general, but it may be an issue with the format of your datasets. For me the suggested approach works fine with the tutorial. Can you check whether the tutorial runs fine for you with the tutorial dataset, just to see whether the problem is related to your datasets? If it does, could you check whether there are any differences in how the counts and the GP masks are stored?
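For example, a quick comparison along these lines could surface the difference (assuming the tutorial's counts layer is stored under "counts"; adjust the keys to your setup):

import scipy.sparse as sp

# Compare how the counts and GP masks are stored in the tutorial AnnData vs. yours.
print(type(adata.X), sp.issparse(adata.X))
print(type(adata.layers["counts"]), sp.issparse(adata.layers["counts"]))
print(adata.varm["nichecompass_gp_targets"].shape)
print(adata.varm["nichecompass_gp_sources"].shape)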

KejingDong commented 6 months ago

I get the same error when I run NicheCompass. I have already added the GP mask to the data just as in the tutorial, but when I initialize the model with NicheCompass, I get the error.

KejingDong commented 5 months ago

I have solved this problem by modifying the source code. I changed line 320 of the nichecompass.py file from

self.features_scalefactors = torch.concat((torch.tensor(self.adata.X.sum(0))[0],
                                            torch.tensor(self.adata.X.sum(0))[0]))

to

self.features_scalefactors = torch.concat((torch.tensor(self.adata.X.sum(0)),
                                            torch.tensor(self.adata.X.sum(0))))

and now it runs smoothly. Is this reasonable?

sebastianbirk commented 5 months ago

This has nothing to do with the GP mask then, right? I think this is because you have adata.X stored as a dense array. If you convert it into a sparse matrix, it should work:

import scipy.sparse as sp
adata.X = sp.csr_matrix(adata.X)

If you want to leave it as a dense array, your approach is fine.
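This is likely because the row sums computed around nichecompass.py:320/322 (see the traceback above) behave differently for dense and sparse X. A minimal sketch of the difference, outside of NicheCompass:

import numpy as np
import scipy.sparse as sp
import torch

X_dense = np.random.poisson(1.0, size=(5, 3)).astype(np.float32)
X_sparse = sp.csr_matrix(X_dense)

# Dense ndarray: X.sum(0) is already 1-D, so indexing with [0] gives a
# zero-dimensional tensor, which torch.concat refuses to concatenate.
print(torch.tensor(X_dense.sum(0))[0].shape)   # torch.Size([])

# Sparse matrix: X.sum(0) is a (1, n_vars) numpy matrix, so [0] stays 1-D
# and the concatenation works.
print(torch.tensor(X_sparse.sum(0))[0].shape)  # torch.Size([3])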

Dana162001 commented 5 months ago

Hi @sebastianbirk, I did as you suggested and did not find any significant differences between the test dataset and our datasets. However, my colleague tried to install NicheCompass and test it with our data, and for him it worked, so I guess there is an issue with my conda environment, which I will try to figure out. Thank you for your help; I think I can close the issue now :)