saeyslab / multinichenetr

MultiNicheNet: a flexible framework for differential cell-cell communication analysis from multi-sample multi-condition single-cell transcriptomics data
GNU General Public License v3.0

get_abundance_expression_info changes category names that causes downstream error #2

Closed muratcemkose closed 1 year ago

muratcemkose commented 2 years ago

head(sce$cellType)

'CD4 Activated Memory T Cells' 'CD8 Cytotoxic T Cells' 'CD4 Memory T Cells' 'CD8 Cytotoxic T Cells' 'NK Cells' 'MAIT Cytotoxic T Cells'

senders_oi = c("Plasma Cells", "Macrophages", "CD1C DCs")
receivers_oi = c("Plasma Cells", "CD4 Cytotoxic T Cells")

abundance_expression_info = get_abundance_expression_info(
  sce = sce_cntr, sample_id = sample_id, group_id = group_id,
  celltype_id = celltype_id, min_cells = min_cells,
  senders_oi = senders_oi, receivers_oi = receivers_oi,
  lr_network = lr_network, batches = batches
)

I had been getting errors in the downstream analysis, so I started debugging the individual steps. Stepping through the get_abundance_expression_info function, I realized that the cell type names stored in the celltype_info object differ from those in the input data: the spaces have been replaced with dots.

celltype_info$rel_abundance_df

A tibble: 8 × 3

group  celltype               rel_abundance_scaled
MM     CD1C.DCs               0.00100000
SMM    CD1C.DCs               1.00100000
MM     CD4.Cytotoxic.T.Cells  0.67023857
SMM    CD4.Cytotoxic.T.Cells  0.33176143
. . .
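The renamed values are consistent with what base R's make.names() produces for names containing spaces. A minimal sketch, using three of the cell type names from this thread (whether the package calls make.names() at this exact step is an assumption, not something confirmed here):

```r
# Cell type labels as they appear in the input data (with spaces).
celltypes <- c("CD1C DCs", "CD4 Cytotoxic T Cells", "Plasma Cells")

# make.names() replaces characters that are invalid in R names,
# including spaces, with dots.
converted <- make.names(celltypes)
print(converted)

# Which labels were altered by the conversion?
changed <- celltypes != converted
print(celltypes[changed])
```

Any label that differs from its make.names() version will no longer match the original strings used in senders_oi and receivers_oi.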

This mismatch produced empty data frames in the code below.

rel_abundance_df_sender = sender_info$rel_abundance_df %>%
  dplyr::filter(sender %in% senders_oi)
rel_abundance_df_receiver = receiver_info$rel_abundance_df %>%
  dplyr::filter(receiver %in% receivers_oi)
rel_abundance_df_sender_receiver = rel_abundance_df_sender %>%
  dplyr::inner_join(rel_abundance_df_receiver, by = "group") %>%
  dplyr::mutate(sender_receiver_rel_abundance_avg = 0.5*(rel_abundance_scaled_sender + rel_abundance_scaled_receiver))

This in turn yields an empty data frame for the object below, which causes errors in the following steps of the analysis.

abundance_expression_info$sender_receiver_info$rel_abundance_df

A tibble: 0 × 6

group sender rel_abundance_scaled_sender receiver rel_abundance_scaled_receiver sender_receiver_rel_abundance_avg
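The empty result can be reproduced without the package. The sketch below uses a toy data frame (a hypothetical stand-in for sender_info$rel_abundance_df, with base R subsetting instead of dplyr::filter) to show that the dotted names never match the spaced names in senders_oi:

```r
# Toy stand-in for sender_info$rel_abundance_df: names already contain dots.
rel_abundance_df <- data.frame(
  group = c("MM", "SMM"),
  sender = c("CD1C.DCs", "CD1C.DCs"),
  rel_abundance_scaled_sender = c(0.001, 1.001),
  stringsAsFactors = FALSE
)

# Sender names as supplied by the user: still contain spaces.
senders_oi <- c("Plasma Cells", "Macrophages", "CD1C DCs")

# Filtering on the original names matches nothing.
matched <- rel_abundance_df[rel_abundance_df$sender %in% senders_oi, ]
print(nrow(matched))

# Filtering on make.names()-converted names recovers the rows.
matched_fixed <- rel_abundance_df[rel_abundance_df$sender %in% make.names(senders_oi), ]
print(nrow(matched_fixed))
```

Because the sender table is empty, the subsequent inner join on "group" is empty as well, which matches the 0 × 6 tibble above.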

browaeysrobin commented 2 years ago

Hi @muratcemkose

Thank you for raising this issue. I will have a look at this in the coming days.

I will also update the vignettes to state clearly that cell type, sample, and group names should be make.names-compatible.

anemartinezlarrinaga2898 commented 1 year ago

I'm also getting the error below when running get_abundance_expression_info:

Error in get_avg_frac_exprs_abund(sce = sce, sample_id = sample_id, celltype_id = celltype_id,  : 
  The levels of the factor SummarizedExperiment::colData(sce)[,celltype_id] should be a syntactically valid R names - see make.names
browaeysrobin commented 1 year ago

Hi @anemartinezlarrinaga2898

Did you run the following?

SummarizedExperiment::colData(sce)$celltype_id = SummarizedExperiment::colData(sce)$celltype_id %>% make.names()

I think this might solve your issue given your error message.
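One way to check and apply this before calling get_abundance_expression_info is sketched below. It uses a plain data frame, col_data, as a hypothetical stand-in for SummarizedExperiment::colData(sce), and assumes that celltype_id is a string holding the column name (here "cellType"); the helper is_make_names_compatible is made up for illustration and is not part of the package. Note that senders_oi and receivers_oi would need the same conversion so they keep matching the renamed cell types.

```r
# Toy stand-in for colData(sce); the real object is a Bioconductor DataFrame.
col_data <- data.frame(
  cellType = factor(c("CD4 Memory T Cells", "NK Cells", "CD1C DCs"))
)
celltype_id <- "cellType"  # assumption: column name stored as a string

# Hypothetical helper: TRUE if every name is unchanged by make.names().
is_make_names_compatible <- function(x) {
  x <- as.character(x)
  all(x == make.names(x))
}

before <- is_make_names_compatible(levels(col_data[[celltype_id]]))

# Convert the labels, indexing the column by the name stored in celltype_id.
col_data[[celltype_id]] <- factor(make.names(as.character(col_data[[celltype_id]])))

after <- is_make_names_compatible(levels(col_data[[celltype_id]]))

# Convert the cell types of interest the same way.
senders_oi <- make.names(c("Plasma Cells", "Macrophages", "CD1C DCs"))
receivers_oi <- make.names(c("Plasma Cells", "CD4 Cytotoxic T Cells"))

print(c(before = before, after = after))
print(levels(col_data[[celltype_id]]))
```

Indexing with [[celltype_id]] uses the column named by the variable, whereas $celltype_id refers to a column literally called "celltype_id"; which of the two applies depends on how the column is named in your object.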