Open Flu09 opened 3 weeks ago
Hi @Flu09,
I assume this error happens because you have set build_only
to FALSE, and inplace is by default set to TRUE (yet you have no sce object). I can fix the logic in a following update, but this should work if you set build_only
to TRUE and follow the rest of the tutorial:
https://ccc-protocols.readthedocs.io/en/latest/notebooks/ccc_R/QuickStart.html
Thank you so much. I have a couple of questions if you have time. I am not sure why there is only one tensor dimesion, while there should be 4 dimensions. The liana_wrap should iterate over objects of different conditions or different samples or different datasets? Also, I am wondering how to proceed given than I did the following. Should I convert my seurat merged object into a sce to proceed as many plots necessitate the presence of a sce object.
context_df_dict<-list()
for (samples.name in sort(unique(mergedsamples@meta.data$sample))){
sdata_ = subset(x = mergedsamples, subset = sample == samples.name)
sdata.sce <- Seurat::as.SingleCellExperiment(sdata_)
liana_res_ <- liana_wrap(sce = sdata.sce,
idents_col='celltype',
assay="RNA", # specify the Seurat assay
assay.type = 'logcounts',
expr_prop=0.1,
verbose=T)#, parallelize = T, workers = 30)
liana_aggregate.magnitude_ <- liana_aggregate(liana_res = liana_res_, aggregate_how='magnitude', verbose = F)
# retain only the aggregate magnitude rank score
# and format for input to liana_tensor_c2c function
liana_aggregate.magnitude_<-liana_aggregate.magnitude_[1:5]
colnames(liana_aggregate.magnitude_)<-c('source', 'target', 'ligand.complex', 'receptor.complex', 'magnitude_rank')
liana_aggregate.magnitude_[['sample']] <- samples.name
context_df_dict[[samples.name]] <-liana_aggregate.magnitude_
}
#we will apply some additional preprocessing (transformations/filtering) to the communication scores.
context_df_dict <- liana:::preprocess_scores(context_df_dict = context_df_dict,
outer_frac = 1/3, # Fraction of samples as threshold to include cells and LR pairs.
invert = TRUE, # transform the scores
invert_fun = function(x) 1-x, # Transformation function
non_negative = TRUE, # fills negative values
non_negative_fill = 0 # set negative values to 0
)
tensor <- liana_tensor_c2c(context_df_dict = context_df_dict,
sender_col = "source", # Column name of the sender cells
receiver_col = "target", # Column name of the receiver cells
ligand_col = "ligand.complex", # Column name of the ligands
receptor_col = "receptor.complex", # Column name of the receptors
score_col = 'magnitude_rank', # Column name of the communication scores to use
how='outer', # What to include across all samples
lr_fill = NaN, # What to fill missing LRs with
cell_fill = NaN, # What to fill missing cell types with
lr_sep='^', # How to separate ligand and receptor names to name LR pair
sort_elements = FALSE, # Whether sorting alphabetically element names of each tensor dim. Does not apply for context order if context_order is passed.
use_available = TRUE,
build_only = TRUE, # set this to FALSE to combine the downstream rank selection and decomposition steps all here
)
#Building the tensor using magnitude_rank...
#100%|██████████| 4/4 [00:10<00:00, 2.52s/it]
#This indicates the number of elements in each tensor dimension: (Contexts, LR pairs, Sender cells, Receiver cells)
tensor$shape
#[[1]]
#[1] 4
#[[2]]
#[1] 918
#[[3]]
#[1] 10
#[[4]]
#[1] 10
#This represents the fraction of values that are missing. In this case, missing values are combinations of contexts x LR pairs x Sender cells x Receiver cells that did not have a communication score or were missing in the dataframes.
tensor$missing_fraction()
#[1] 0.8852206
#This represents the fraction of values that are a real zero (excluding the missing values)
tensor$sparsity_fraction()
#[1] 0.0369744
#fraction of excluded elements
tensor$excluded_value_fraction() # Percentage of values in the tensor that are masked/missing
#[1] 0.8852206
tensor <- liana::decompose_tensor(tensor = tensor,
rank = NULL, # Number of factors to perform the factorization. If NULL, it is automatically determined by an elbow analysis
tf_optimization = 'robust', # To define how robust we want the analysis to be.
seed = 0, # Random seed for reproducibility
elbow_metric = 'error', # Metric to use in the elbow analysis
smooth_elbow = FALSE, # Whether smoothing the metric of the elbow analysis
upper_rank=25, # Max number of factors to try in the elbow analysis
init = 'random', # Initialization method of the tensor factorization
svd = 'numpy_svd', # Type of SVD to use if the initialization is 'svd'
factors_only = FALSE,
)
# Estimate standard error
error_average <- tensor$elbow_metric_raw %>%
t() %>%
as.data.frame() %>%
mutate(rank=row_number()) %>%
pivot_longer(-rank, names_to = "run_no", values_to = "error") %>%
group_by(rank) %>%
summarize(average = mean(error),
N = n(),
SE.low = average - (sd(error)/sqrt(N)),
SE.high = average + (sd(error)/sqrt(N))
)
# plot
error_average %>%
ggplot(aes(x=rank, y=average), group=1) +
geom_line(col='red') +
geom_ribbon(aes(ymin = SE.low, ymax = SE.high), alpha = 0.1) +
geom_vline(xintercept = tensor$rank, colour='darkblue') + # rank of interest
theme_bw() +
labs(y="Error", x="Rank")
Hi @Flu09,
I am not sure why there is only one tensor dimension,
not sure if I understand how your tensor is 1 dimension, as your tensor$shape print shows a list of 4 dimensions.
Should I convert my seurat merged object into a sce to proceed as many plots necessitate the presence of a sce object.
Yes, it's best to convert it (LIANA converts it internally either way), you can use the Seurat::as.SingleCellExperiment
function.
In general, I suggest sticking to the tutorials that we provide here: https://ccc-protocols.readthedocs.io/en/latest/notebooks/ccc_R/01-Preprocess-Expression.html
Note that we also show how to convert between objects, etc.
Hope this helps.