griffithlab / GenVisR

Genome data visualizations
Creative Commons Zero v1.0 Universal

Using cnFreq with .seg files #383

Closed dannyjomaa closed 2 years ago

dannyjomaa commented 2 years ago

Hello! I'm fairly new to working with copy number data and am trying to summarize copy number calls for a cohort of tumor samples using cnFreq. My data are reported in .seg files and scaled differently from the example data provided in the cnFreq tutorial, with values ranging from -30 to 5 (I believe because they're reported as log2(CN) - 1 values). Would it be reasonable to still use cnFreq to summarize the cohort and to adjust the CN_low_cutoff and CN_high_cutoff parameters accordingly (e.g. CN_low_cutoff = -1 for deletions and CN_high_cutoff = 0.5 for amplifications)? Thanks so much in advance!

zlskidmore commented 2 years ago

Hey,

So copy neutral would be -1, i.e. log2(2/2) - 1, and not 0, i.e. log2(2/2)? The latter is more standard. Either way, unless you are working with a very pure tumor (microdissected, sorted blood cancer, etc.) you should add some padding to the cutoffs. For example:

For a perfectly pure tumor, loss of one copy would be log2(1/2) = -1. But tumors are rarely 100% pure, so I would probably set the low cutoff to log2(1.4/2) (about -0.51) or so to account for the impurity, though the exact number would depend on any purity estimates you have.
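To make the padding arithmetic concrete, here is a minimal Python sketch (not part of GenVisR). It assumes the usual simple mixing model, where the measured copy number is a purity-weighted average of the tumor copy number and a diploid normal, which is consistent with the 1.4 above corresponding to roughly 60% purity. The function name `expected_log2_ratio` and the `offset` parameter are hypothetical, introduced only for illustration.

```python
import math


def expected_log2_ratio(tumor_copies, purity, normal_copies=2, offset=0.0):
    """Expected log2 ratio for a segment, under a simple mixing model.

    Assumption: the sample is a mix of tumor cells (fraction `purity`)
    carrying `tumor_copies` and normal cells carrying `normal_copies`.
    `offset` is a hypothetical shift for data whose copy-neutral value
    is not 0 (e.g. offset=-1 if neutral segments sit at -1).
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    # Purity-weighted average copy number seen in the bulk sample.
    mixed = purity * tumor_copies + (1.0 - purity) * normal_copies
    if mixed <= 0:
        raise ValueError("mixed copy number must be positive to take log2")
    return math.log2(mixed / normal_copies) + offset


# Single-copy loss in a perfectly pure tumor: log2(1/2)
print(expected_log2_ratio(1, purity=1.0))  # -1.0

# Single-copy loss at 60% purity: mixed copy number is about 1.4,
# so the expected value is log2(1.4/2), roughly -0.51
print(expected_log2_ratio(1, purity=0.6))

# Single-copy gain at 60% purity, as a starting point for a high cutoff
print(expected_log2_ratio(3, purity=0.6))
```

Values like these could serve as starting points for CN_low_cutoff and CN_high_cutoff; if copy-neutral segments in your data sit at -1 rather than 0, the same shift would need to be applied to both cutoffs.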

Hope this helps

dannyjomaa commented 2 years ago

Thanks so much! That's really helpful. I added in some padding as you suggested and it seems to be working well.