Open sam-israel opened 3 years ago
Hi, I agree that negative median values are not good. With ABIS, we tried to deconvolute as many cell types as possible, which increases the chance of getting negative values. Reducing the results to 5-8 cell types only is a good strategy if you care about quality and not quantity. Hence, here is a series of solutions you could try out:
On Tue, 1 Jun 2021 at 09:00, sam-israel wrote:
To use the proportions generated by ABIS in cell-type-aware differential expression (TOAST package), I was advised to merge similar cell types into only 5-8 types.
- Is there a smarter way of doing that than simply summing the ABIS output proportions?
The summary of ABIS generated proportions is
| Cell type | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| Monocytes_C | 2.949 | 6.622 | 7.677 | 8.137 | 9.610 | 15.787 |
| NK | -3.9268 | 0.9884 | 2.2476 | 2.4485 | 3.7038 | 9.7480 |
| T_CD8_Memory | 1.674 | 5.747 | 8.797 | 10.647 | 13.069 | 48.491 |
| T_CD4_Naive | -2.411 | 4.159 | 6.610 | 7.563 | 9.806 | 25.564 |
| T_CD8_Naive | -25.8218 | -3.7227 | -0.4212 | -1.3039 | 2.6580 | 9.5094 |
| B_Naive | 0.5531 | 3.1769 | 4.6171 | 5.3058 | 7.2463 | 15.4163 |
| T_CD4_Memory | -4.018 | 1.671 | 4.162 | 4.119 | 6.179 | 12.769 |
| MAIT | -0.359 | 2.001 | 3.772 | 3.942 | 5.590 | 11.857 |
| T_gd_Vd2 | -2.938 | 2.382 | 3.447 | 3.644 | 4.760 | 12.175 |
| Neutrophils_LD | 16.87 | 41.87 | 50.85 | 50.70 | 59.91 | 80.54 |
| T_gd_non_Vd2 | -9.113 | -4.120 | -2.589 | -2.560 | -1.491 | 8.840 |
| Basophils_LD | 0.6517 | 2.6431 | 4.6014 | 6.4039 | 8.5597 | 46.8233 |
| Monocytes_NC_I | -1.2083 | 0.2907 | 0.9748 | 1.2425 | 1.8747 | 6.1912 |
| B_Memory | -7.6440 | -1.7759 | -0.8056 | -1.0176 | -0.0758 | 3.5879 |
| mDCs | -0.07160 | 0.08102 | 0.13453 | 0.14015 | 0.19162 | 0.46562 |
| pDCs | 0.02063 | 0.14775 | 0.20857 | 0.22740 | 0.28736 | 0.62854 |
| Plasmablasts | 0.0358 | 0.1508 | 0.2183 | 0.3574 | 0.3611 | 3.9839 |
As you can see, the median for three cell types (T_CD8_Naive, T_gd_non_Vd2, and B_Memory) is negative. It seems reasonable to set all negative values to zero (and to remove T_CD8_Naive from the analysis, because of its very low minimum values).
However, an additional source of proportions (based on methylation data) is available to me for comparison. I summed together similar cell types (T_CD8_Naive with T_CD8_Memory, B_Naive with B_Memory) and looked at the correlation between the external source of proportions and the ABIS-generated ones.
The correlation (between the ABIS merged cell types and the external source) is actually better if I do not set all negative values to zero. Hence, my question is:
- Does it make sense to sum together negative and positive proportions, when merging similar cell types into one?
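A minimal sketch of the two orders of operation this question contrasts, in plain Python for illustration only (ABIS itself is an R tool, and nothing here is taken from its code). The sample values, the `GROUPS` mapping, and both function names are hypothetical. One plausible reason the unclipped sum correlates better is that estimation errors for closely related cell types can partly offset each other, and clipping before summing discards that cancellation.

```python
# Hypothetical ABIS-style estimates for two samples (percent of cells).
samples = [
    {"T_CD8_Naive": -3.7, "T_CD8_Memory": 8.8, "B_Naive": 4.6, "B_Memory": -0.8},
    {"T_CD8_Naive": 2.7, "T_CD8_Memory": 5.7, "B_Naive": 3.2, "B_Memory": -1.8},
]

# Hypothetical grouping of fine-grained types into broader ones.
GROUPS = {
    "T_CD8": ["T_CD8_Naive", "T_CD8_Memory"],
    "B": ["B_Naive", "B_Memory"],
}


def merge_then_clip(sample, groups):
    """Sum the raw estimates within each group, then floor the merged value at 0."""
    return {
        group: max(0.0, sum(sample[cell] for cell in members))
        for group, members in groups.items()
    }


def clip_then_merge(sample, groups):
    """Floor each raw estimate at 0 first, then sum within each group."""
    return {
        group: sum(max(0.0, sample[cell]) for cell in members)
        for group, members in groups.items()
    }


for sample in samples:
    print("merge then clip:", merge_then_clip(sample, GROUPS))
    print("clip then merge:", clip_then_merge(sample, GROUPS))
```

For the first toy sample, merging first gives about 5.1 for `T_CD8`, whereas clipping first gives 8.8, so the two orders can disagree noticeably whenever one member of a group is negative.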
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/giannimonaco/ABIS/issues/18, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC2UTEG5RKVOBWVPIWTNJ5LTQSAR5ANCNFSM454AZ2OA .
So am I correct in understanding that averaging the signature matrix columns is preferable to summing the output? Could you also say a bit about in what sense, and why, "reducing the results to 5-8 cell types only is a good strategy if you care about quality and not quantity"?
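To make the first alternative in this question concrete, here is a rough sketch of averaging signature-matrix columns before deconvolution, as opposed to summing the estimated proportions afterwards. It is plain Python with made-up numbers; the gene names, the `GROUPS` mapping, the unweighted mean, and the function name are all assumptions, not the ABIS implementation.

```python
# Toy signature matrix: gene -> {cell type -> expression}. All values are made up.
signature = {
    "GENE_A": {"B_Naive": 10.0, "B_Memory": 6.0, "NK": 1.0},
    "GENE_B": {"B_Naive": 2.0, "B_Memory": 4.0, "NK": 9.0},
}

# Hypothetical grouping: each merged type lists the original columns it replaces.
GROUPS = {"B": ["B_Naive", "B_Memory"], "NK": ["NK"]}


def average_signature_columns(signature, groups):
    """Build a reduced signature whose columns are unweighted means of the originals."""
    merged = {}
    for gene, row in signature.items():
        merged[gene] = {
            group: sum(row[cell] for cell in members) / len(members)
            for group, members in groups.items()
        }
    return merged


merged_signature = average_signature_columns(signature, GROUPS)
print(merged_signature)
```

The reduced matrix would then be given to the deconvolution step in place of the full one, so that each merged type is estimated directly instead of being assembled from several noisier estimates.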
On a different subject: another deconvolution I am performing is on microarray data, where the values range from 0.25 to 13. Is that an acceptable range? Is it preferable to filter out the low values (which may simply be noise), or to apply some other procedure to deal with values that are too low or too high?
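If filtering is the route taken, one simple form is to drop genes that stay below a floor in most samples. The sketch below is plain Python; the floor of 3.0, the 50% fraction, the function name, and the toy values are arbitrary placeholders, and it assumes the values are log2-scale intensities, which the question does not state.

```python
def filter_low_expression(expr, floor=3.0, min_fraction=0.5):
    """Keep genes whose value reaches `floor` in at least `min_fraction` of samples."""
    kept = {}
    for gene, values in expr.items():
        n_above = sum(1 for value in values if value >= floor)
        if n_above / len(values) >= min_fraction:
            kept[gene] = values
    return kept


# Toy expression table: gene -> values across four samples (made up).
expr = {
    "GENE_A": [0.3, 0.8, 1.2, 0.25],
    "GENE_B": [7.5, 8.1, 2.9, 9.0],
    "GENE_C": [12.4, 13.0, 11.8, 12.9],
}

print(sorted(filter_low_expression(expr)))
```

Whether such a filter helps here depends on how many signature genes it would remove, so the threshold is something to check against the signature matrix and not a fixed recommendation.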
Thanks.