getzlab / MutSig2CV

MutSig2CV from Lawrence et al. 2014

Computing background mutation rates from MutSig2CV output on TCGA data #17

Open uthsavc opened 2 years ago

uthsavc commented 2 years ago

Hi,

I want to use the background mutation rates (BMRs) from MutSig2CV, and I was wondering how to obtain them. I am currently looking at the output of MutSig2CV on TCGA data, e.g. at this link, and I have a few questions.

  1. How do I obtain the BMRs \mu_{g,c,p} (e.g. as in the supplement of this Nature 2013 paper) from the MutSig2CV output? I was not sure which file has those. (I guess I would want the variables x_{g,c,p} and X_{g,c,p}, and could compute \mu_{g,c,p} = x_{g,c,p} / X_{g,c,p}?)

  2. In the Methods of the Lawrence et al. 2014 paper, which is cited in the GitHub repository, the "power analysis" section uses a quantity f_g, the gene-specific mutation rate factor. How do we get f_g from the output of MutSig2CV?

  3. Is there any way to compute the number of "covered bases" per gene? Looking at the MutSig2CV output on the TCGA COADREAD cohort, each sample has a total number N_ind of covered coding bases. However, the sum of the coding lengths codelen over all genes is larger than N_ind. So where in the results can I get the number of coding "covered bases" per gene?
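
To make question 1 concrete, here is a minimal sketch of the ratio I have in mind. It assumes the counts x_{g,c,p} and the coverage X_{g,c,p} are already available as dictionaries keyed by (gene, category, patient). That layout, the function name background_rates, and the toy numbers are all hypothetical; none of them come from the MutSig2CV output, which is exactly what I am unsure how to read.

```python
import math


def background_rates(x, X):
    """Return mu[(g, c, p)] = x[(g, c, p)] / X[(g, c, p)].

    x: mutation counts keyed by (gene, category, patient)  [hypothetical layout]
    X: covered bases keyed by (gene, category, patient)    [hypothetical layout]
    Keys with zero coverage get NaN, since the rate is undefined there.
    """
    mu = {}
    for key, covered in X.items():
        count = x.get(key, 0)  # no recorded mutations -> count of 0
        mu[key] = count / covered if covered > 0 else math.nan
    return mu


# Toy, made-up numbers purely for illustration.
x = {("TP53", 1, "P1"): 2, ("KRAS", 2, "P2"): 1}
X = {
    ("TP53", 1, "P1"): 1000,
    ("KRAS", 2, "P2"): 500,
    ("KRAS", 1, "P1"): 0,  # no coverage in this category for this patient
}
mu = background_rates(x, X)
```

With real data, x and X would have to be read from whichever output files hold the per-gene, per-category, per-patient counts and coverage, if those exist.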

Thank you very much!