raerose01 / deconstructSigs

normalization of hg38 exome #64

Open ChelseaCHENX opened 4 years ago

ChelseaCHENX commented 4 years ago

Hi there!

I have a small question about the normalization step, tri.counts.method. I am normalizing to the hg38 exome, but I am not sure whether the hg19 counts that ship as the package's default data frame can be used in its place, or whether I should generate an hg38 version. (If so, how?)

Many thanks!

DarioS commented 4 years ago

I recommend not using tri.counts.method at all. It exaggerates C to T mutations: they are quite common, because of aging or sunlight damage, but there are relatively few locations in the genome where they can happen. So the normalization pairs a big numerator with a small denominator for the mutations contributing to Signature 1 and Signature 7. You will find that valuable signatures, clearly present when you don't use that option, are wiped out of your results when you do.
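
To make the numerator/denominator point concrete, here is a minimal Python sketch of the arithmetic. It is not the package's code: the function names, the three context labels, and every count are made up for illustration, and the normalization shown (divide each context's mutation count by how often that context occurs, then rescale to sum to 1) is a simplified assumption about what a trinucleotide-count normalization does.

```python
def fractions(mutation_counts):
    """Raw share of mutations per context, with no normalization."""
    total = sum(mutation_counts.values())
    return {ctx: n / total for ctx, n in mutation_counts.items()}


def normalize_by_context(mutation_counts, context_counts):
    """Divide each mutation count by the number of sites where that
    context occurs, then rescale so the values sum to 1."""
    rates = {ctx: mutation_counts[ctx] / context_counts[ctx]
             for ctx in mutation_counts}
    total = sum(rates.values())
    return {ctx: rate / total for ctx, rate in rates.items()}


# Hypothetical sample: three contexts instead of the usual 96.
mutation_counts = {"A[C>T]G": 30, "T[C>A]T": 30, "A[T>C]A": 40}

# Hypothetical opportunity counts: the first context is assumed to be
# ten times rarer in the target region than the other two.
context_counts = {"A[C>T]G": 100_000,
                  "T[C>A]T": 1_000_000,
                  "A[T>C]A": 1_000_000}

raw = fractions(mutation_counts)
normalized = normalize_by_context(mutation_counts, context_counts)

for ctx in mutation_counts:
    print(f"{ctx}: raw={raw[ctx]:.2f} normalized={normalized[ctx]:.2f}")
```

With these made-up numbers the rare context goes from 30% of the raw profile to roughly 81% of the normalized one, while the other two contexts shrink to about 8% and 11%. That is the kind of shift that can crowd other signatures out of a fit.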