https://github.com/gerstung-lab/MutationTimeR/blob/e4e266a494482face03e831322a9df00b1b0ff93/R/MutationTime.R#L97
Hi,
Thank you for the amazing software!
I want to bring your attention to an issue I’ve encountered recently.
The main function mutationTime has a default argument isWgd=classWgd(cn), which calls getPloidy(cn) to compute the sample ploidy. However, getPloidy requires columns for the copy_number and total_cn variables. These columns are not mentioned as a requirement in the instructions, and without them the ploidy estimate is always zero. For example, using the provided example data, one can show:
    library(MutationTimeR)
    data(MutationTimeR)
    getPloidy(bb)
    [1] 0
As a result, all samples are labelled as non-WGD, which affects the timing of CNVs in WGD samples. Of course, this can be fixed by adding those columns to the cn object before passing it to mutationTime. However, this is not the default behaviour and is not documented, so it might lead to different conclusions. In particular, I timed CNVs for a WGD cohort with and without those columns, and observed substantially fewer NA values overall when the columns were added.
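For reference, the workaround I mean looks roughly like the sketch below. It assumes that the example bb object carries major_cn and minor_cn columns and that total copy number is simply their sum. That derivation is my own assumption, not something the documentation states. I set both copy_number and total_cn because I am not sure which of the two getPloidy actually reads.

```r
library(MutationTimeR)
data(MutationTimeR)  # loads the example objects, including bb

# Assumption: total copy number = major + minor allele copy number.
bb$total_cn <- bb$major_cn + bb$minor_cn

# Assumption: copy_number is meant to hold the same quantity as total_cn.
bb$copy_number <- bb$total_cn

# With the columns present, the ploidy estimate should no longer be zero.
getPloidy(bb)
```

The modified bb can then be passed to mutationTime as usual. Alternatively, isWgd could be supplied explicitly instead of relying on the default.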
Am I missing something?
Thanks,
Ismail