Open FADHLyemen opened 3 years ago
@giovp I want to make a correlation plot between cell types and the continuous variables stored in .obs.
I would say this is not a scanpy question. It is not clear what you mean by the correlation between a categorical variable with multiple categories and a continuous variable. If you have a binary categorical variable, you can calculate the point-biserial correlation, but for a multicategorical variable you would have to discretize your continuous variable and run a chi-squared test. You can also try ANOVA. If you think you know which variables are dependent and which are independent, you can use logistic regression and look at its coefficients, or try ANCOVA. Some additional information with examples: https://datascience.stackexchange.com/questions/893/how-to-get-correlation-between-two-categorical-variable-and-a-categorical-variab
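A minimal sketch of two of the options above, the point-biserial correlation and one-way ANOVA, using scipy. The data frame here is synthetic and the column names `celltype` and `age` are assumptions standing in for columns of `adata.obs`:

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway, pointbiserialr

# Synthetic stand-in for adata.obs (hypothetical column names).
rng = np.random.default_rng(0)
obs = pd.DataFrame({
    "celltype": rng.choice(["B", "T", "NK"], size=300),
    "age": rng.normal(50, 10, size=300),
})

# Binary case: one cell type versus all others -> point-biserial correlation.
is_t = (obs["celltype"] == "T").astype(int)
r_pb, p_pb = pointbiserialr(is_t, obs["age"])

# Multi-category case: one-way ANOVA of age across the cell types.
groups = [g["age"].to_numpy() for _, g in obs.groupby("celltype")]
f_stat, p_anova = f_oneway(*groups)

print(f"point-biserial r={r_pb:.3f} (p={p_pb:.3f})")
print(f"ANOVA F={f_stat:.3f} (p={p_anova:.3f})")
```

Note that both tests treat each cell as an independent observation, which may not hold when many cells come from the same sample.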
@Koncopd it is a correlation between two continuous variables, as celltypes is continuous and age is also continuous. How do I correlate X with continuous variables stored in .obs?
Are celltypes really continuous? What does this variable look like?
For continuous variables you can do:
from scipy.stats import pearsonr
r, _ = pearsonr(adata.obs["celltypes"], adata.obs["age"])
@Koncopd it is the number of cells of each cell type per cohort, or the relative frequencies per group:
Is this something researchers look for? Or do you think it is not a good approach, since the counts depend on how many cells there are per sample?
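One way to reduce the dependence on cells per sample is to compute cell-type fractions within each sample and correlate those with age at the sample level. The following is only a sketch on synthetic data; the column names `sample`, `celltype` and `age` are assumptions, and Spearman correlation is one choice among several:

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Synthetic stand-in for adata.obs: one row per cell (hypothetical columns).
rng = np.random.default_rng(0)
n_cells = 2000
sample_ids = [f"s{i}" for i in range(12)]
obs = pd.DataFrame({
    "sample": rng.choice(sample_ids, size=n_cells),
    "celltype": rng.choice(["B", "T", "NK"], size=n_cells),
})
# Age is a per-sample property, so every cell of a sample shares it.
obs["age"] = obs["sample"].map({s: 20 + 5 * i for i, s in enumerate(sample_ids)})

# Fraction of each cell type within each sample (each row sums to 1).
frac = pd.crosstab(obs["sample"], obs["celltype"], normalize="index")
age = obs.groupby("sample")["age"].first().reindex(frac.index)

# One Spearman correlation per cell type, computed across samples.
rows = []
for ct in frac.columns:
    rho, p = spearmanr(frac[ct], age)
    rows.append({"celltype": ct, "rho": rho, "pvalue": p})
result = pd.DataFrame(rows)
print(result)
```

Because the fractions in a sample sum to 1, they are not independent of each other, so an increase in one cell type forces a decrease in the others.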
How to do correlation between celltypes and age in scanpy?