Closed cnova1126 closed 7 years ago
Dear cnova,
thank you for the bug report. Can you tell me which version of chromstaR you are using? You can find this by typing sessionInfo() in R. I made some major changes to the code three weeks ago to implement new functionality and probably missed to change something in the plotting function.
Regards, Aaron
Hi Aaron,
Sure no problem. chromstaR_1.0.0
In regards to my other post: I am also trying the new cell mark file and will let you know how it works. Thanks for your help!
Best, Chris
Hi Chris,
Bioconductor has released a new version recently: http://bioconductor.org/packages/release/bioc/html/chromstaR.html This is chromstar 1.2.0 and might solve the issue that you observed. If you do install the new version, make sure to rerun the complete analysis from scratch. Also, the function plotFoldEnrichment() has been renamed to plotEnrichment(). See all changes here: http://bioconductor.org/packages/release/bioc/news/chromstaR/NEWS
Best, Aaron
Hi Aaron,
I have updated chromstaR to version 1.2.0 and am still getting an error regarding combinatorial-plotting. While I get some errors, plotEnrichment seems to work with both counts and peaks, but not with combinations. I have attached the entire code in case I am making an error regarding obtaining annotations, etc.
library(chromstaR) library(biomaRt)
inputfolder <- ("/Volumes/TCGA_CCLE_SKCM/TCGA_CCLE_SKCM_Pipeline_Runs_Apr2017/SKCM_FINAL_RUNS/TCGA-CCLE_downsampled_BAM/") outputfolder <- ("/Volumes/TCGA_CCLE_SKCM/TCGA_CCLE_SKCM_Pipeline_Runs_Apr2017/SKCM_FINAL_RUNS/TCGA-CCLE_downsampled_BAM/TCGA-SKCM_chromstaR_Combo_RAS/")
H3K4me3_H3K27me3 <- read.table(file = "TCGA_SKCM_H3K4me3andH3K27me3_BRAF.txt", sep = "\t", header = TRUE)
Chromstar(inputfolder, experiment.table = H3K4me3_H3K27me3, outputfolder = outputfolder, mode = 'combinatorial', assembly = 'hg19')
model <- get(load(file.path(outputfolder,'multivariate', 'multivariate_mode-combinatorial_condition-RAS.RData')))
genomicFrequencies(model)
heatmapTransitionProbs(model) + ggtitle('Transition Probability')
heatmapCountCorrelation(model) + ggtitle('Read Count Correlation')
ENS <- useMart('ENSEMBL_MART_ENSEMBL') listDatasets(ENS) ENS <- useMart('ENSEMBL_MART_ENSEMBL', dataset = "hsapiens_gene_ensembl")
genes <- getBM(attributes = c('ensembl_gene_id', 'chromosome_name', 'start_position', 'end_position', 'strand', 'gene_biotype'), mart = ENS)
genes <- GRanges(seqnames = paste0('chr',genes$chromosome_name), ranges = IRanges(start = genes$start_position, end = genes$end_position), strand = genes$strand, biotype = genes$gene_biotype)
plotEnrichment(hmm = model, annotation = genes, bp.around.annotation = 15000, what = 'combinations') + ggtitle("Fold Enichment within Gene Body") + xlab('Distance from Gene Body in bp')
plotEnrichment(model, genes, region = 'start', what = 'combinations') + ggtitle("Fold Enichment Around TSS") + xlab('Distance from TSS in bp')
plotFoldEnrichment(model, genes, region = 'start') + facet_wrap(~ combination)
tss <- resize(genes, width = 3, fix = 'start') plotEnrichCountHeatmap(model, tss) + theme(strip.text.x = element_text(size = 6)) + ggtitle('Read counts around TSS')
biotype <- split(tss, tss$biotype) plotFoldEnrichHeatmap(model, annotations = biotype) + coord_flip() + ggtitle('Fold enrichment within Biotype')
I definitely have bivalent targets in this dataset as I have identified using other methods. Also, looking at genomic frequency there seems to be a high level of bivalent enrichment.
$frequency
[] [H3K4me3] [H3K27me3] [H3K4me3+H3K27me3]
0.681691040 0.066196097 0.005565173 0.246547690
Thanks again for your help!
Best, Chris
Hi Chris, I have checked the function and cannot find a bug. However, I have noticed the following things in your plots: H3K4me3-RAS-rep2 and H3K4me3-RAS-rep4 don't seem to have enrichment at TSS. Are these indeed replicates in condition RAS, or do they belong to a different condition? Also, you do have enrichment of bivalent domains at the TSS. However, the enrichment is quite weak. This is not an error, because what the plot shows is the ratio of observed over expected enrichment. In your case, you have a high abundance of bivalent domains, and they seem to be everywhere, not just at TSS, therefore the fold enrichment is quite weak.
Best, Aaron
Hi Aaron, The example is from older data and I have subsequently removed these samples. And yes, I knew I had a large amount of bivalent domains in my dataset, but it is interesting to know the majority are not localized within the TSS or gene body. I will try these functions with another dataset (i.e Active Enhancers) and let you know how it works.
Thanks again! Best, Chris
Dear Aaron,
In addition to an issue with binning, I am also having a problem regarding the plotFoldEnrichment function. I have attached both the code and associated plot for both counts and combinations. I am not sure why H3K4me3 is displayed properly on the count-based plot but not on the combination-based plot? Thanks again for you time and assistance!
plotFoldEnrichment(hmm = model, annotation = genes, bp.around.annotation = 15000, what = counts) + ggtitle("Fold Enichment within Gene Body") + xlab('Distance from Gene Body in bp')
plotFoldEnrichment(hmm = model, annotation = genes, bp.around.annotation = 15000, what = combinations) + ggtitle("Fold Enichment within Gene Body") + xlab('Distance from Gene Body in bp')