Comparing cell-cell communication among different conditions

liaojinyue commented 2 years ago

Hi Rongbin,

Thank you for providing such a useful tool. I've tested Mebocost on my personal scRNA-seq dataset, and it worked beautifully. I’m wondering whether you will include functions for direct comparison among different groups/conditions, as shown in Figure 5 of the preprint. Could you perhaps provide the script that was used to create these figures?

Thanks, Jason

zhengrongbin commented 2 years ago

Hello, Thank you for your interest in our work! I generated Figure 5 by running a prediction for each condition and then identifying highly changed communications based on the index of dispersion of the communication score across conditions. I am happy to share that chunk of the script with you. Please send an email to Rongbin.Zheng@childrens.harvard.edu and I will send you the script. Best, Rongbin

DrZGQ37 commented 2 years ago

Dear Rongbin, I am glad to see that you can share the script that identifies highly changed communications across different conditions. I think comparing cell-cell communication among different conditions/groups could make MEBOCOST applicable to more scRNA-seq datasets. I've tested Mebocost on my personal scRNA-seq dataset, and it basically worked well. However, my scRNA-seq dataset includes two conditions, and I want to compare the cell communications between them. I would very much appreciate it if you could send me the scripts for comparing cell-cell communication among different conditions/groups by email. My email address is zhuguiqi37@163.com. Thank you very much. Looking forward to your reply.

Best, Guiqi Zhu

zhengrongbin commented 2 years ago

Here is what I used to find condition-specific communications. The idea is to run mebocost on all cells pooled together, but with each cell type labeled by its condition. This way, mebocost uses the same number of cells for background estimation in permutation testing, so the communication scores and significance are comparable across conditions.

Create a new cell group label by combining cell type and condition, for example:

adata.obs['label'] = adata.obs['cell_type'] + '~' + adata.obs['condition']
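
A minimal sketch of the labeling step on a toy table, in case the columns are stored as pandas categoricals (common in AnnData objects), where direct string concatenation may not work. The DataFrame `obs` and its values are hypothetical stand-ins for `adata.obs`; casting to `str` first is an assumption about what your data may need, not part of the original script.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs with categorical columns
obs = pd.DataFrame({
    "cell_type": pd.Categorical(["Adipocyte", "Adipocyte", "Macrophage"]),
    "condition": pd.Categorical(["cold", "warm", "cold"]),
})

# Cast to plain strings before concatenating with the '~' separator
obs["label"] = obs["cell_type"].astype(str) + "~" + obs["condition"].astype(str)
print(obs["label"].tolist())
```

Because later steps split the label on '~', this only works cleanly if neither the cell type names nor the condition names contain '~' themselves.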

Run mebocost with "label" as the cell group column. Please check the mebocost tutorial for how to run mebocost.

cellall_mebo is the mebocost result object:

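import numpy as np
import pandas as pd
from mebocost import mebocost  # import style used in the mebocost tutorial

cellall_mebo = mebocost.load_obj(path = './scBAT_mebocost_allcond.pk')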

Focus on communications within the same condition:

commu_res = cellall_mebo.commu_res.copy()

You can modify this chunk of code based on your data. Here my cell groups are labeled as celltype~condition; keep the code as is if your labels follow the same format.

commu_res['sender_cond'] = [x.split("~")[-1] for x in commu_res['Sender'].tolist()]
commu_res['receiver_cond'] = [x.split("~")[-1] for x in commu_res['Receiver'].tolist()]
# keep only rows where sender and receiver come from the same condition
commu_res_new = commu_res[commu_res['sender_cond'] == commu_res['receiver_cond']].copy()

cellall_mebo.commu_res = commu_res_new.copy()
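
The same-condition filter can be checked on a small made-up table before applying it to real results. This is only a sketch: the rows below are invented, and only the `Sender`, `Receiver`, and `Commu_Score` column names are taken from the script above.

```python
import pandas as pd

# Invented communication table with celltype~condition labels
commu = pd.DataFrame({
    "Sender": ["A~cold", "A~cold", "B~warm"],
    "Receiver": ["B~cold", "B~warm", "A~warm"],
    "Commu_Score": [1.0, 2.0, 3.0],
})

# Take the part after the last '~' as the condition
sender_cond = commu["Sender"].str.rsplit("~", n=1).str[-1]
receiver_cond = commu["Receiver"].str.rsplit("~", n=1).str[-1]

# Keep rows whose sender and receiver share a condition
same = commu[sender_cond == receiver_cond].copy()
print(same)
```

The second row (cold sender, warm receiver) is dropped, since a cross-condition pair has no biological meaning when the conditions are separate samples.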

Find communications that are significant in at least one condition:

commu_res_new['label'] = commu_res_new['Sender'].apply(lambda x: x.split('~')[0])+'~'+commu_res_new['Metabolite_Name']+'~'+commu_res_new['Sensor']+'~'+commu_res_new['Receiver'].apply(lambda x: x.split('~')[0])
significant = commu_res_new[commu_res_new['permutation_test_fdr'] < 0.05]

significant

commu_res_need = commu_res_new[commu_res_new['label'].isin(significant['label'])]

commu_res_need_mat = commu_res_need.pivot_table(index = 'label', columns = 'sender_cond', values = 'Commu_Score')

Compute the index of dispersion (IOD, variance divided by mean) of the communication score across conditions:

IOD = commu_res_need_mat.apply(lambda row: np.var(row)/np.mean(row), axis = 1).sort_values(ascending=False)
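
To make the ranking concrete, here is a small worked example of the pivot and the index of dispersion on invented scores. It assumes the population variance (`ddof=0`), which is NumPy's default for `np.var`; the labels and values are hypothetical.

```python
import pandas as pd

# Invented long-format scores: one row per communication per condition
long = pd.DataFrame({
    "label": ["A~met1~S1~B", "A~met1~S1~B", "A~met2~S2~B", "A~met2~S2~B"],
    "sender_cond": ["cold", "warm", "cold", "warm"],
    "Commu_Score": [1.0, 3.0, 2.0, 2.0],
})

# Rows are communications, columns are conditions
mat = long.pivot_table(index="label", columns="sender_cond", values="Commu_Score")

# Index of dispersion = variance / mean, computed per row
iod = (mat.var(axis=1, ddof=0) / mat.mean(axis=1)).sort_values(ascending=False)
print(iod)
```

For scores (1, 3) the mean is 2 and the variance is 1, so the IOD is 0.5; for scores (2, 2) the variance is 0, so the IOD is 0. A communication missing from one condition appears as NaN in the pivoted matrix, so it may be worth deciding how to treat those (for example filling with 0) before ranking.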

Select the top 100 communications by IOD:

top_n = 100
most_var_commu = commu_res_need_mat.loc[IOD.head(top_n).index]
most_var_commu = pd.concat([most_var_commu,
                            pd.DataFrame(most_var_commu.index.str.split('~').tolist(),
                                         index = most_var_commu.index, 
                                         columns = ['Sender', 'Met', 'Sensor', 'Receiver'])],
                          axis = 1)
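
As an alternative sketch of the last step, the combined label can also be split back into its four parts with `str.split(..., expand=True)`. The one-row matrix here is invented, and the approach assumes every label splits into exactly four '~'-separated fields.

```python
import pandas as pd

# Invented score matrix indexed by sender~metabolite~sensor~receiver
mat = pd.DataFrame(
    {"cold": [1.0], "warm": [3.0]},
    index=pd.Index(["A~met1~S1~B"], name="label"),
)

# Split each label into four columns, keeping the label as the index
parts = mat.index.to_series().str.split("~", expand=True)
parts.columns = ["Sender", "Met", "Sensor", "Receiver"]

# Attach the annotation columns to the score matrix
annotated = mat.join(parts)
print(annotated)
```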
