im3sanger / dndscv

dN/dS methods to quantify selection in cancer and somatic evolution
GNU General Public License v3.0

A wrong parameter in substmodel "192r_3w"? #70

Closed kakiuchi-kyt closed 3 years ago

kakiuchi-kyt commented 3 years ago

Dear Authors,

"dndscv" is an excellent package for calculating dN/dS values!

I found a possible error in the code. The default sm parameter is "192r_3w", and the dndscv function reads "submod_192r_3w.rda", which contains a table named "substmodel". The table is used to calculate the mutation rate. However, the last row of the table, which is for the TTT>TGT substitution, has only "t", while the other rows have "t*NNN>NNN". I think the last row should be

`TTT>TGT "t*TTT>TGT" "t*TTT>TGT*wmis" "t*TTT>TGT*wnon" "t*TTT>TGT*wspl"`

instead of

`TTT>TGT "t" "t*wmis" "t*wnon" "t*wspl"`

Is this an error, or am I wrong?

Please let me know.

Sincerely, Nobuyuki Kakiuchi

im3sanger commented 3 years ago

Hello Nobuyuki Kakiuchi,

Thank you for your message and your interest in dndscv.

I can confirm that this is not an error. The NNN>NNN rate parameters in the substitution model are relative rates, each expressed relative to one reference change. TTT>TGT was arbitrarily chosen as that reference and so has a fixed value of 1, which means that it can be excluded from the substitution matrix (in fact it needs to be excluded, as otherwise there would be more parameters than can be estimated from the data). In other words, if you are interested in the absolute rate of TTT>TGT changes per TTT site, the value is "t". If you are interested in the absolute rate of any other trinucleotide change, the value is "t*NNN>NNN".
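As a rough illustration of the point above, here is a minimal Python sketch (not code from the dndscv package, which is written in R). The helper names `rate_expression` and `absolute_rates` are hypothetical, as are all numeric values. It assumes the expression strings follow the pattern quoted in this thread, and it shows why one relative rate has to be fixed: multiplying every relative rate by a constant while dividing "t" by the same constant leaves all absolute rates unchanged, so the data cannot tell the two parameter sets apart.

```python
import math

REFERENCE = "TTT>TGT"  # reference change named in the thread; its relative rate is fixed at 1


def rate_expression(change, effect=None, reference=REFERENCE):
    """Build a rate expression string in the style quoted in this thread (hypothetical helper)."""
    # The reference change carries no relative-rate factor of its own.
    base = "t" if change == reference else f"t*{change}"
    # Optional selection term such as "wmis", "wnon" or "wspl".
    return base if effect is None else f"{base}*{effect}"


def absolute_rates(t, relative_rates):
    """Absolute rate of each change = t * (its relative rate)."""
    return {change: t * r for change, r in relative_rates.items()}


# Made-up values, purely for illustration.
t = 2e-6
relative_rates = {"AAA>ACA": 0.5, "CCG>CTG": 4.0, REFERENCE: 1.0}
rates = absolute_rates(t, relative_rates)

# If the reference were a free parameter, this rescaled parameter set
# would produce exactly the same absolute rates as the original one.
c = 10.0
rescaled = absolute_rates(t / c, {k: r * c for k, r in relative_rates.items()})
same = all(math.isclose(rates[k], rescaled[k]) for k in rates)

print(rate_expression(REFERENCE, "wmis"))
print(rate_expression("AAA>ACA", "wmis"))
print(same)
```

With the reference fixed at 1, the absolute rate of TTT>TGT is simply "t", and every other change is "t" times its own relative rate, which is the layout of the last row versus the other rows of the table.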

I hope this helps.

Best, Inigo