shmashitup commented 2 years ago

Is there a way to order all mutation data by tumor mutation burden from high to low? I have divided my data into four cohorts which segregate my mutational data in the clinical data subplot. I would like to order my mutational data within each of these cohorts by tumor mutational burden from high to low. I am not sure how to do this or if it is possible with this package.

zlskidmore commented 2 years ago

Hi @shmashitup

Which function are you using? You could probably refactor the input data.frame to get what you want.

shmashitup commented 2 years ago

I can share my code here:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("GenVisR")
install.packages("reshape2")

# mutational data

library(GenVisR)

library(reshape2)

mutationData <- read.delim("EC_Waterfall Plot_Mutation Data.txt")
mutationData
mutationData <- mutationData[, c("patient", "gene.name", "trv.type", "amino.acid.change")]
colnames(mutationData) <- c("sample", "gene", "variant_class", "amino.acid.change")
mutation_priority <- as.character(unique(mutationData$variant_class))
mutationColours <- c("nonsense" = '#4f00A8',
                     "frame_shift_del" = '#A80100',
                     "frame_shift_ins" = '#CF5A59',
                     "in_frame_del" = '#ff9b34',
                     "duplication" = '#750054',
                     "delins" = '#A80079',
                     "missense" = '#009933',
                     "splice_region" = '#ca66ae',
                     "deletion" = '#888811')

# Create an initial plot

mutationHeirarchy <- c("missense", "nonsense", "frame_shift_ins", "frame_shift_del",
                       "delins", "deletion", "duplication", "splice_region")
waterfall(mutationData, fileType = "Custom",
          variant_class_order = mutationHeirarchy,
          mainPalette = mutationColours)

# tumor mutation burden

mutationBurden <- read.delim("EC_mutationburden.txt")

# First, let's look at the sample names in the mutationData and mutationBurden

mutationData$sample
mutationBurden$sample

# Create the waterfall plot

waterfall(mutationData, fileType = "Custom",
          variant_class_order = mutationHeirarchy,
          mainPalette = mutationColours,
          mutBurden = mutationBurden)

# reformat clinical data to long format

clinicalData <- read.delim("EC_Clinical Data.txt")
clinicalData_2 <- clinicalData[, c(1, 2, 3, 4, 5)]
colnames(clinicalData_2) <- c("sample", "Cohort", "MSI Comprehensive", "Sex", "Age")
clinicalData_2 <- melt(data = clinicalData_2, id.vars = c("sample"))
new_samp_order <- as.character(unique(clinicalData_2[order(clinicalData_2$variable, clinicalData_2$value), ]$sample))

# create the waterfall plot

waterfall(mutationData, fileType = "Custom",
          variant_class_order = c("missense", "nonsense", "frame_shift_ins", "frame_shift_del",
                                  "delins", "deletion", "duplication", "splice_region"),
          mainPalette = mutationColours,
          mutBurden = mutationBurden,
          clinData = clinicalData_2,
          clinLegCol = 4,
          clinVarCol = c('POLE Drivers and Secondary Variant' = '#ccbadc',
                         'POLE Drivers Only' = '#9975b9',
                         'POLE Variants Only' = '#7a5d94',
                         'POLE Potential New Drivers' = '#5E5161',
                         '0' = '#c2ed67', '1' = '#e63a27',
                         'Male' = '#90ddee', 'Female' = '#649aa6',
                         '21-30' = '#E5E8FF', '31-40' = '#878cfb', '41-50' = '#0022ff',
                         '51-60' = '#2d41b9', '61-70' = '#3d4780', '71-80' = '#3a4061',
                         '81-90' = '#000000'),
          clinVarOrder = c('POLE Drivers and Secondary Variant', 'POLE Drivers Only',
                           'POLE Potential New Drivers', 'POLE Variants Only',
                           '0', '1', 'Male', 'Female',
                           '21-30', '31-40', '41-50', '51-60', '61-70', '71-80', '81-90'),
          section_heights = c(1, 3, 1),
          sampOrder = new_samp_order)
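
The step of looking at the sample names in both tables can also be done as an explicit comparison instead of printing both columns. A minimal sketch in base R follows; the two vectors are made-up stand-ins for mutationData$sample and mutationBurden$sample, not values from the real files.

```r
# Hypothetical sample names standing in for mutationData$sample
# (one row per mutation, so names repeat) and mutationBurden$sample.
mutation_samples <- c("s1", "s1", "s2", "s3", "s3", "s4")
burden_samples <- c("s1", "s2", "s3", "s5")

# Samples that have mutations but no burden value.
missing_burden <- setdiff(unique(mutation_samples), burden_samples)

# Samples that have a burden value but no mutations.
extra_burden <- setdiff(burden_samples, unique(mutation_samples))

print(missing_burden)
print(extra_burden)
```

If both results are empty, the two tables refer to the same set of samples.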

shmashitup commented 2 years ago

I don't know much about R (I'm a beginner). When you say refactor the input data.frame, do you mean the one for the mutation burden? I thought waterfall() automatically arranges the tumor mutation burden to follow the order of the mutational data. How can I manually correct the order? @zlskidmore

zlskidmore commented 2 years ago

If you're using a waterfall plot, there should be a parameter called sampOrder where you can give it your samples in the order you want, e.g. c("samp1", "samp2").
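
One way to build such a vector for the original question (burden from high to low within each cohort) might look like the sketch below. It uses only base R and toy data; the sample names, cohort labels, and burden values are invented, and the column names sample, Cohort, and mut_burden are assumptions about how the real tables are laid out.

```r
# Toy cohort assignments; in the thread these would come from the clinical data.
cohorts <- data.frame(
  sample = c("s1", "s2", "s3", "s4", "s5", "s6"),
  Cohort = c("A", "B", "A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Toy mutation burden per sample (assumed columns: sample, mut_burden).
burden <- data.frame(
  sample = c("s1", "s2", "s3", "s4", "s5", "s6"),
  mut_burden = c(2.5, 10.0, 7.1, 0.4, 3.3, 5.8),
  stringsAsFactors = FALSE
)

# Join the two tables on the shared sample column.
combined <- merge(cohorts, burden, by = "sample")

# Fix the cohort order explicitly so it matches the clinical subplot.
combined$Cohort <- factor(combined$Cohort, levels = c("A", "B"))

# Sort by cohort first, then by burden from high to low within each cohort.
combined <- combined[order(combined$Cohort, -combined$mut_burden), ]

new_samp_order <- as.character(combined$sample)
print(new_samp_order)
```

The resulting character vector could then be passed as sampOrder = new_samp_order in the waterfall() call, replacing the order derived from the clinical data alone.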

shmashitup commented 2 years ago

I see! Thank you! This makes sense. I can do that!


griffithlab / GenVisR

Ordering Mutational Data by mutburden from high to low within each clinical data (subplot) cohort #379
