andymckenzie / DGCA

Differential Gene Correlation Analysis

Differential significance across differential correlation classes? #9

Closed by RegnerM2015 1 year ago

RegnerM2015 commented 1 year ago

Hi @andymckenzie,

Thank you for developing this amazing resource!

I am working on adapting DGCA to a similar use case, but instead of correlating the expression of genes with each other, I am computing correlations between gene expression and peak accessibility (as measured by open chromatin profiling techniques such as ATAC-seq). So far, my results look promising.

However, I have noticed that the "+/-" and "-/+" differential correlation classes typically have higher differential correlation Z scores and are typically "more statistically significant" relative to the other differential correlation classes:

[Two screenshots attached: Screen Shot 2023-01-24 at 4 15 52 PM, Screen Shot 2023-01-24 at 4 16 21 PM]

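# p2g.1 holds the peak-gene correlations for condition A, p2g.2 for condition B.
classes <- dCorClass(corsB = p2g.2$Correlation,
                     corsA = p2g.1$Correlation,
                     pvalsB = p2g.2$FDR,
                     pvalsA = p2g.1$FDR,
                     dCorPVals = pvals_res_adj,  # adjusted differential correlation p-values
                     sigThresh = 1e-2,           # cutoff for the differential correlation test
                     corSigThresh = 1e-2,        # cutoff for each within-condition correlation
                     convertClasses = TRUE)      # TRUE spelled out; T can be reassigned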

As I understand my code above, I used 0.01 both as the threshold for calling the correlation in each condition statistically significant and as the threshold for calling a differential correlation. Depending on how a peak-gene pair passes these thresholds, it is sorted into one of the nine differential correlation classes, or it is labeled "NonSig" if the FDR-adjusted p-value of the differential correlation test is greater than 0.01.
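To make the thresholding described above concrete, here is a minimal sketch of that sorting logic in plain R. It is not DGCA's implementation: `sign_label` and `classify_pair` are hypothetical helpers, and whether DGCA treats a p-value exactly equal to the threshold as significant is an assumption here.

```r
# Hypothetical helper (not part of DGCA): label one condition's correlation
# as "+", "-", or "0" depending on its sign and significance.
sign_label <- function(cor, p, thresh) {
  if (is.na(p) || p >= thresh) return("0")  # assumption: strict "<" cutoff
  if (cor > 0) "+" else if (cor < 0) "-" else "0"
}

# Hypothetical helper: combine both conditions into a class such as "+/-",
# or "NonSig" when the differential correlation test itself is not significant.
classify_pair <- function(corA, pA, corB, pB, dcor_p,
                          sig_thresh = 0.01, cor_sig_thresh = 0.01) {
  if (is.na(dcor_p) || dcor_p >= sig_thresh) return("NonSig")
  paste0(sign_label(corA, pA, cor_sig_thresh), "/",
         sign_label(corB, pB, cor_sig_thresh))
}

# Made-up illustrative inputs, not values from my data.
flip   <- classify_pair(0.6, 1e-4, -0.5, 1e-3, dcor_p = 1e-5)  # "+/-"
loss   <- classify_pair(0.6, 1e-4, 0.05, 0.70, dcor_p = 1e-3)  # "+/0"
nonsig <- classify_pair(0.6, 1e-4, 0.55, 1e-4, dcor_p = 0.40)  # "NonSig"
```

Three sign labels per condition give the nine classes, and "NonSig" overrides all of them whenever the differential test fails its own cutoff.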

My interpretation is that the "+/-" and "-/+" classes tend to have larger differential correlation Z scores and lower p-values simply because the difference in magnitude between the two correlations is inherently larger in a "+/-" or "-/+" scenario than in a "+/0" or "-/0" scenario. In other words, a positive correlation in condition A paired with a negative correlation in condition B will more often produce a large differential correlation Z score because the two correlations lie on opposite sides of 0, whereas in the "+/0" scenario the single nonzero correlation would have to be much stronger to reach Z scores similar to those observed in "+/-" or "-/+".

Please feel free to provide feedback or correct any misunderstandings that I may have. If you could suggest some ideas or changes to my analysis, that would be greatly appreciated!

andymckenzie commented 1 year ago

Hi RegnerM2015,

Thank you for using DGCA! Thanks as well for your observations regarding the "+/-" and "-/+" differential correlation classes. It sounds like you have a good understanding of the code and the results that you are seeing.

I think that your interpretation of the results is correct. The "+/-" and "-/+" classes typically have higher differential correlation Z scores and lower p-values because the magnitude of the difference between the two correlations is greater than in the "+/0" or "-/0" scenarios.
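As a rough numerical illustration of this point, the sketch below compares a sign flip with a loss of correlation using the standard Fisher z-transformation for two independent Pearson correlations. The helper `dcor_z`, the sample size of 50 per condition, and the correlation values are all assumptions chosen for the example, not values from your data, and Spearman correlations would need a different variance term.

```r
# Hypothetical helper: z-score for the difference between two independent
# Pearson correlations, using Fisher's z-transformation (atanh).
dcor_z <- function(r_a, r_b, n_a, n_b) {
  (atanh(r_a) - atanh(r_b)) / sqrt(1 / (n_a - 3) + 1 / (n_b - 3))
}

n <- 50  # assumed number of samples in each condition

z_flip <- dcor_z(0.5, -0.5, n, n)  # "+/-": opposite sides of 0
z_loss <- dcor_z(0.5,  0.0, n, n)  # "+/0": correlation in condition A only

# Two-sided p-values from the standard normal distribution.
p_flip <- 2 * pnorm(-abs(z_flip))
p_loss <- 2 * pnorm(-abs(z_loss))
```

With the same correlation strength in condition A, `z_flip` is exactly twice `z_loss`, because `atanh(-0.5)` equals `-atanh(0.5)`. So a "+/-" pair reaches a given Z score with correlations of more modest magnitude than a "+/0" pair needs.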

I can't think of any suggestions for changes to your analysis. It probably depends on precisely what your goals are, which might be beyond the scope of just a discussion of the software per se.

I hope this is helpful and best of luck with your research.

RegnerM2015 commented 1 year ago

Thanks @andymckenzie!

I reached out on LinkedIn to hopefully continue our discussion.