dmcable / spacexr

Spatial-eXpression-R: Cell type identification (including cell type mixtures) and cell type-specific differential expression for spatial transcriptomics
GNU General Public License v3.0

some parameters are set to the CSIDE vignette values, which are intended for testing but not proper execution. #179

Open sopenaml opened 11 months ago

sopenaml commented 11 months ago

Hi,

Thank you very much for developing these great packages. Could you explain why I am getting the following warning?

Warning message:
In run.CSIDE.general(myRCTD, X1, X2, barcodes, cell_types, cell_type_threshold = cell_type_threshold,  :
  run.CSIDE.general: some parameters are set to the CSIDE vignette values, which are intended for testing but not proper execution. For more accurate results, consider using the default parameters to this function.

I have a Visium data set where the cell type of interest has very low weights per spot (between 5% and 10%) and appears in only a few spots. So in order to run run.CSIDE.single I have to use:

run.CSIDE.single(rctd.a4, 
                            explanatory.variable, 
                            doublet_mode = FALSE, 
                            cell_types = c("cell_type2", "cell_type1"), 
                            fdr = 0.05,
                            weight_threshold = 0.07, 
                            cell_type_threshold = 28) 

As you can see, the thresholds required for the analysis are well below the defaults. Is it OK to use these values, or what would you advise? Thank you very much for your time and assistance.

dmcable commented 10 months ago

weight_threshold should be set to at least 0.75 (75%). This is not the weight of the cell type of interest. It is the total weight of all cell types included in the model, and it simply ensures that major cell types are not excluded. You may have to include more cell types in the model that co-occur with the cell types of interest. In our paper, we successfully analyzed the dendritic cell type, which appeared in low proportion (5-10%).
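To make the distinction concrete, here is a minimal base-R sketch on a toy weight matrix. It does not call spacexr: the spot names, cell type names, and weights are invented, and the comparison `>=` is an assumption about how the filter treats a pixel that sits exactly at the threshold.

```r
# Toy weight matrix (hypothetical values): rows are pixels, columns are
# cell types, and each row sums to 1.
weights <- matrix(
  c(0.08, 0.60, 0.25, 0.07,
    0.05, 0.20, 0.15, 0.60,
    0.10, 0.45, 0.35, 0.10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("spot1", "spot2", "spot3"),
                  c("rare_type", "major_A", "major_B", "major_C"))
)

# Total weight per pixel of the cell types included in the model.
included_weight <- function(weights, cell_types) {
  rowSums(weights[, cell_types, drop = FALSE])
}

weight_threshold <- 0.75

# Model with the rare type and only one co-occurring major type.
narrow <- included_weight(weights, c("rare_type", "major_A"))
# Model that also includes a second co-occurring major type.
wide <- included_weight(weights, c("rare_type", "major_A", "major_B"))

# Pixels that would be retained under each choice of cell_types.
kept_narrow <- names(narrow)[narrow >= weight_threshold]
kept_wide <- names(wide)[wide >= weight_threshold]

print(kept_narrow)
print(kept_wide)
```

In this toy example the rare type never exceeds 10% of a spot, yet two spots are retained once the co-occurring major types are added to the model, because the threshold applies to the summed weight.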

I am surprised that the warning message you shared would be generated by the parameters included here, since none of your parameters match the vignette values.

Hope this explanation makes sense.

Best, Dylan

sopenaml commented 10 months ago

Thank you for your response. I have tried to increase the number of cell types, but the error complains that there aren't enough cells, so it asks me to remove cell types or decrease cell_type_threshold. As before, how low would you recommend going? Thank you very much.

 rctd.a4 <- run.CSIDE.single(rctd.a4, 
                            explanatory.variable, 
                            doublet_mode = FALSE, 
                             cell_types =  c("AT2 Cell(Lung)",
                                              "AT1 Cell(Lung)", 
                                              "Endothelial cell_Kdr high(Lung)",
                                              "Endothelial cell_Tmem100 high(Lung)",
                                              "Macrophage_Lyz2 high(Ovary)",
                                              "Endothelial cells_Vwf high(Lung)"  ) , 
                            fdr = 0.05,
                            weight_threshold = 0.75, 
                            cell_type_threshold = 100) 

Error in choose_cell_types(myRCTD, barcodes, doublet_mode, cell_type_threshold,  : 
  choose_cell_types: cell types: AT2 Cell(Lung), Endothelial cell_Tmem100 high(Lung), Macrophage_Lyz2 high(Ovary), Endothelial cells_Vwf high(Lung) 
detected using aggregate_cell_types to have less than the minimum cell_type_threshold of 125. 
To fix this issue, please remove these cell types or reduce the cell_type_threshold

If I reduce cell_type_threshold to 80, it complains about some of the cell types that I'm not interested in:

rctd.a4 <- run.CSIDE.single(rctd.a4, 
                            explanatory.variable, 
                            doublet_mode = FALSE, 
                             # cell_types =  "AT2 Cell(Lung)" , 
                             cell_types =  c("AT2 Cell(Lung)",
                                              "AT1 Cell(Lung)", 
                                              "Endothelial cell_Kdr high(Lung)",
                                              "Endothelial cell_Tmem100 high(Lung)",
                                              "Macrophage_Lyz2 high(Ovary)",
                                              "Endothelial cells_Vwf high(Lung)"  ) , 
                            fdr = 0.05,
                            weight_threshold = 0.75, 
                            cell_type_threshold =80)
Error in choose_cell_types(myRCTD, barcodes, doublet_mode, cell_type_threshold,  : 
  choose_cell_types: cell types: Endothelial cell_Tmem100 high(Lung), Macrophage_Lyz2 high(Ovary), Endothelial cells_Vwf high(Lung) 
detected using aggregate_cell_types to have less than the minimum cell_type_threshold of 80. 
To fix this issue, please remove these cell types or reduce the cell_type_threshold

However if I remove those I get the following:

rctd.a4 <- run.CSIDE.single(rctd.a4, 
                            explanatory.variable, 
                            doublet_mode = FALSE, 
                             # cell_types =  "AT2 Cell(Lung)" , 
                             cell_types =  c("AT2 Cell(Lung)",
                                              "AT1 Cell(Lung)", 
                                              "Endothelial cell_Kdr high(Lung)"), fdr = 0.05,
                            weight_threshold = 0.75, 
                            cell_type_threshold =80)
Warning: run.CSIDE.general: removing the following cell types due to insufficient counts per region. Consider lowering cell_type_threshold or proceeding with removed cell types. Cell types: AT2 Cell(Lung), AT1 Cell(Lung), Endothelial cell_Kdr high(Lung), 
Error in choose_cell_types(myRCTD, barcodes, doublet_mode, cell_type_threshold,  : 
  choose_cell_types: length(cell_types) is 0. Please pass in at least one cell type in the list cell_types

Apologies for the long post, but I'm still confused about what's the best approach to follow.

Thank you very much,

Miriam

dmcable commented 9 months ago

Hi Miriam,

You can reasonably decrease cell_type_threshold to around 25. You can check how often each cell type appears by using the count_cell_types function. If you are missing a cell type of interest, there are two possible explanations: 1) The cell type doesn't appear on enough pixels. 2) Pixels containing the cell type are filtered out because their total cell type weight is below weight_threshold. In this case, you may be failing to include other main cell types in the tissue that co-localize with this cell type.

If the issue is number 2, then I would recommend looking at the weight matrix of all pixels containing the cell type of interest and seeing which other cell types co-localize.
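A rough way to do that inspection is sketched below in base R on a toy matrix. All values, names, and the `min_weight` cutoff are hypothetical. For a real object, the input would be a normalized pixel-by-cell-type weight matrix from the RCTD results (assumed here to be in `myRCTD@results$weights` for full mode; check your object).

```r
# Toy normalized weights (hypothetical): rows are pixels, columns are cell types.
weights <- matrix(
  c(0.08, 0.60, 0.25, 0.07,
    0.00, 0.20, 0.20, 0.60,
    0.10, 0.45, 0.35, 0.10,
    0.06, 0.50, 0.04, 0.40),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("spot", 1:4),
                  c("rare_type", "major_A", "major_B", "major_C"))
)

target <- "rare_type"
min_weight <- 0.05  # hypothetical cutoff for "pixel contains the target type"

# Pixels in which the cell type of interest is present.
has_target <- weights[, target] >= min_weight

# Mean weight of every other cell type on those pixels, largest first.
others <- colnames(weights) != target
coloc <- sort(colMeans(weights[has_target, others, drop = FALSE]),
              decreasing = TRUE)

print(round(coloc, 3))
```

The cell types at the top of `coloc` are the candidates to add to `cell_types` so that the pixels of interest clear weight_threshold.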

Best, Dylan

dmcable commented 9 months ago

Actually, I want to change what I said earlier. This error message is due to not having enough pixels either above or below medv = 0.5 in terms of explanatory.variable. I have updated the code for calculating this, and I would recommend trying again. If you are still experiencing this issue, I would examine explanatory.variable to determine if there are sufficient pixels appearing above and below 0.5.
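The check can be sketched in base R as follows. The data are a toy example, and the per-region tally (summing cell type weights over the pixels on each side of 0.5) is an assumption about how the counts are formed, not a copy of the package code.

```r
# Toy explanatory variable (hypothetical), one value in [0, 1] per pixel.
explanatory.variable <- c(spot1 = 0.10, spot2 = 0.20, spot3 = 0.40,
                          spot4 = 0.45, spot5 = 0.90, spot6 = 0.30)

# Toy normalized weights for two cell types on the same pixels.
weights <- matrix(
  c(0.9, 0.1,
    0.8, 0.2,
    0.7, 0.3,
    0.2, 0.8,
    0.6, 0.4,
    0.5, 0.5),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("spot", 1:6), c("type_A", "type_B"))
)

medv <- 0.5
region <- ifelse(explanatory.variable > medv, "above", "below")

# How many pixels fall on each side of medv?
print(table(region))

# Summed cell type weight per region (rows: "above", "below").
per_region <- rowsum(weights, group = region[rownames(weights)])
print(per_region)

# Cell types with enough weight in BOTH regions (toy threshold).
cell_type_threshold <- 1
enough <- colnames(per_region)[apply(per_region >= cell_type_threshold, 2, all)]
print(enough)
```

Here only one pixel lies above 0.5, so no cell type passes in both regions even though both are abundant overall, which mirrors the situation where every requested cell type is removed.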

sopenaml commented 9 months ago

Thank you for your response. I suspect the issue is that I don't have many pixels containing the cells of interest. If I run aggregate_cell_types and sort the result, I can see that the number of pixels per cell type is not that high:

                                                   AT1 Cell(Lung) 
                                                      216.1021675 
Mesenchymal stem cell_Tmsb10 high(Mesenchymal-Stem-Cell-Cultured) 
                                                      164.5542977 
                                  Endothelial cell_Kdr high(Lung) 
                                                      106.6363475 
                          Stromal cell_Inmt high(Lung-Mesenchyme) 
                                                      101.7282983 
                                                   AT2 Cell(Lung) 
                                                       80.1426319 
                                      Macrophage_Lyz2 high(Ovary) 
                                                       72.1804122 
                          Megakaryocyte progenitor cell(Placenta) 
                                                       69.6193038 
                              Endothelial cell_Tmem100 high(Lung) 
                                                       60.3121328 
                              Alveolar macrophage_Ear2 high(Lung) 
                                                       40.0815923 
                                              Ciliated cell(Lung) 
                                                       30.6202122 
Axin2+ Myofibrogenic Progenitor cell_Cox4i2 high(Lung-Mesenchyme) 
                                                       30.1414178 
                                 Endothelial cells_Vwf high(Lung) 
                                                       22.7612777 
                                   Dendritic cell_Naaa high(Lung) 
                                                       16.8406218 
 Axin2+ Myofibrogenic Progenitor cell_Acta2 high(Lung-Mesenchyme) 
                                                       14.6077926 
                                                  NK cell(Uterus) 
                                                       13.7533387 
                    B cell_Cd79a&Iglc2 high(Mammary-Gland-Virgin) 
                                                       13.6185076 
                                     Eosinophil granulocyte(Lung) 
                                                       10.6055276 
                            Macrophage_Ace high(Peripheral_Blood) 
                                                       10.4675841 
                Dendritic cell_Siglech high(Mammary-Gland-Virgin) 
                                                        9.7439032 
                                    NK cell(Mammary-Gland-Virgin) 
                                                        7.8200777 
                         T-cells_Ctla4 high(Mammary-Gland-Virgin) 
                                                        6.9163011 
                           Neutrophil_Il1b high(Peripheral_Blood) 
                                                        6.4812259 
                                       T cell_Ms4a4b high(Thymus) 
                                                        5.0174698 
                              Endothelial cell_Cldn5 high(Uterus) 
                                                        4.3938253 
                                                  Club Cell(Lung) 
                                                        2.3286264 
                                               Basophil(Placenta) 
                                                        0.9757773 

dmcable commented 8 months ago

It looks like, of the cell types you mentioned above ("AT2 Cell(Lung)", "AT1 Cell(Lung)", "Endothelial cell_Kdr high(Lung)", "Endothelial cell_Tmem100 high(Lung)", "Macrophage_Lyz2 high(Ovary)", and "Endothelial cells_Vwf high(Lung)"), all have reasonable cell type counts (>50) except for "Endothelial cells_Vwf high(Lung)". I believe you should be able to get C-SIDE to run using the following cell types:

AT1 Cell(Lung)                                                       216.1021675
Mesenchymal stem cell_Tmsb10 high(Mesenchymal-Stem-Cell-Cultured)    164.5542977
Endothelial cell_Kdr high(Lung)                                      106.6363475
Stromal cell_Inmt high(Lung-Mesenchyme)                              101.7282983
AT2 Cell(Lung)                                                        80.1426319
Macrophage_Lyz2 high(Ovary)                                           72.1804122
Megakaryocyte progenitor cell(Placenta)                               69.6193038
Endothelial cell_Tmem100 high(Lung)                                   60.3121328

You may have to lower cell_type_threshold a bit.
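As a small sketch of that selection, the base-R snippet below filters a subset of the counts posted earlier in this thread by the >50 rule of thumb. The variable names are illustrative, and the resulting vector is only a candidate for the `cell_types` argument.

```r
# Counts copied from the aggregate_cell_types output posted above (subset).
cell_type_counts <- c(
  "AT1 Cell(Lung)" = 216.1021675,
  "Mesenchymal stem cell_Tmsb10 high(Mesenchymal-Stem-Cell-Cultured)" = 164.5542977,
  "Endothelial cell_Kdr high(Lung)" = 106.6363475,
  "Stromal cell_Inmt high(Lung-Mesenchyme)" = 101.7282983,
  "AT2 Cell(Lung)" = 80.1426319,
  "Macrophage_Lyz2 high(Ovary)" = 72.1804122,
  "Megakaryocyte progenitor cell(Placenta)" = 69.6193038,
  "Endothelial cell_Tmem100 high(Lung)" = 60.3121328,
  "Alveolar macrophage_Ear2 high(Lung)" = 40.0815923,
  "Endothelial cells_Vwf high(Lung)" = 22.7612777
)

min_count <- 50  # rule of thumb from this thread, not a package default

# Cell types abundant enough to keep in the model.
selected <- names(cell_type_counts)[cell_type_counts > min_count]
print(selected)

# The smallest retained count bounds how high cell_type_threshold can be
# before one of the selected cell types is dropped.
print(min(cell_type_counts[selected]))
```

With these counts the smallest retained value is about 60, so a cell_type_threshold above that would drop at least one of the selected cell types.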

Best, Dylan