Closed DrZhaoJie closed 4 years ago
7 days!!! It should take less than 7 mins. There is definitely something fishy going on. Let me have a look and I will get back to you tomorrow..
Hi,
If your aim is to get the MATH score you could directly use math.score
function which is much simpler.
> math.score(maf = melanomamaf)
t_vaf field is missing, but found t_ref_count & t_alt_count columns. Estimating vaf..
Not enough mutations in TCGA-D3-A51H-06A-12D-A25O-08 . Skipping..
Not enough mutations in TCGA-D3-A8GE-06A-11D-A372-08 . Skipping..
Not enough mutations in TCGA-EE-A2ME-06A-11D-A197-08 . Skipping..
Not enough mutations in TCGA-V3-A9ZX-01A-11D-A39W-08 . Skipping..
Not enough mutations in TCGA-V4-A9ED-01A-11D-A39W-08 . Skipping..
Not enough mutations in TCGA-V4-A9F2-01A-11D-A39W-08 . Skipping..
Not enough mutations in TCGA-VD-A8K7-01B-11D-A39W-08 . Skipping..
Not enough mutations in TCGA-VD-A8KF-01A-11D-A39W-08 . Skipping..
Not enough mutations in TCGA-YZ-A980-01A-11D-A39W-08 . Skipping..
Tumor_Sample_Barcode MedianAbsoluteDeviation MATH
1: TCGA-3N-A9WB-06A-11D-A38G-08 6.618352 23.35870
2: TCGA-3N-A9WC-06A-11D-A38G-08 5.833333 31.44909
3: TCGA-3N-A9WD-06A-11D-A38G-08 5.306122 35.04327
4: TCGA-BF-A1PU-01A-11D-A19A-08 6.140835 20.30131
5: TCGA-BF-A1PV-01A-11D-A19A-08 6.658596 23.29800
---
532: TCGA-YZ-A984-01A-11D-A39W-08 7.446809 22.08128
533: TCGA-YZ-A985-01A-11D-A39W-08 10.599361 41.20721
534: TCGA-Z2-A8RT-06A-11D-A372-08 4.861551 18.67459
535: TCGA-Z2-AA3S-06A-11D-A397-08 7.223114 25.03452
536: TCGA-Z2-AA3V-06A-11D-A397-08 4.607046 30.73683
inferHeterogeneity
is to identify subclones in each tumor. In your command you are passing the wrong values for tsb
argument. Below command should work fine and should finish within ten mins.
melanomaresMATH = inferHeterogeneity(maf = melanomamaf, tsb = melanomamaf@variants.per.sample$Tumor_Sample_Barcode)
Wow! Thank you very much! I tried your method and code, and it processed less than 10 minutes! Besides, I found that the math.score calculated from "math.score(maf = melanomamaf)" was similar to that calculated from "melanomaresMATH = inferHeterogeneity(maf = melanomamaf……“. Both the results were ok and may be published in the paper, right?
Dear Prof Thanks for your comment! I've figured out and calculated the MATH sore ignoring variants in copy number altered regions. The results were as follows. I want to know how the "CN-altered" in the "cluster" column was defined according to "CN" column? beyond[-0.3, +0.3]? Thank you very much! Wating for your comment sincerely!
Hello @DrZhaoJie ,
Small discrepancy is probably due to vaf cut off used in the function inferHeterogeneity
. See ?inferHeterogeneity
for more details.
Variants located on segments with CN 1.5 < CN < 2.5 are retained. Rest would be classified as CN altered.
You might also be interested in (sciclone)[https://github.com/genome/sciclone] package which offers advanced options for clonality.
Hi, The MATH calculation in your documentation was based on maf file including nonSynonymous mutations, right? First, were the nonSynonymous mutations you defined based on which literature? Second, was it reasonable to calculate MATH based on maf file including all SNV mutation types like nonSynonymous and Synonymous mutations? Thank you very much!
Hi,
Please read the documentation for ?read.maf
. By default following, variant classifications with High/Moderate variant consequences are considered as non-synonymous.
"Frame_Shift_Del", "Frame_Shift_Ins", "Splice_Site", "Translation_Start_Site","Nonsense_Mutation", "Nonstop_Mutation", "In_Frame_Del","In_Frame_Ins", "Missense_Mutation"
Only non-syn variants are considered for MATH score estimation.
OK, I got it! You really made it very clear! Thank you for your patience!
Dear Prof Thanks for your contribution to maftools R package. When I calculated the MATH scores of all the melanoma samples from TCGA database, it kept processing up to 7 days without output results. My R code was as follows. Could you please help me solve the problem.
R code: SKCMDNA <- GDCquery_Maf("SKCM", pipelines = "muse") SKCMmaf <- SKCMDNA %>% read.maf UVMDNA <- GDCquery_Maf("UVM", pipelines = "muse") UVMmaf <- UVMDNA %>% read.maf melanomamaf = maftools:::merge_mafs(maf = c(SKCMmaf, UVMmaf), verbose = TRUE) melanomamaf@data$t_vaf <- melanomamaf@data$t_alt_count/melanomamaf@data$t_depth melanomaresMATH = inferHeterogeneity(maf = melanomamaf, tsb = melanomamaf@data$Tumor_Sample_Barcode, vafCol = 't_vaf')
It ended with the follwing words: Registered S3 methods overwritten by 'lme4': method from cooks.distance.influence.merMod car influence.merMod car dfbeta.influence.merMod car dfbetas.influence.merMod car Registered S3 method overwritten by 'jmvcore': method from
as.data.frame.Array DelayedArray
Thank you very much!