Closed MingLi-929 closed 1 year ago
after updating, the picture looks like
after the update, the picture for binning method will be like
peakHeatmap(files[[4]],TxDb = txdb,nbin = 700)
In ArchR, the co-accessibility function based on Cicero is deployed. What advanced function you can provide if you want to add this function in ChIPseeker?
Also, they can do trajectory analysis and provide pseudo-time heatmaps. I think our heatmap functions could also support the plots with their results while they have provided the labels for selected regions/genes. Consider adding the labels in our heatmap functions.
Update the function to plot the heatmap around the gene body regions
library(TxDb.Hsapiens.UCSC.hg19.knownGene) txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
peakHeatmap(peak = files[[4]], TxDb = txdb, upstream = NULL, downstream = NULL, by = "gene", type = "body", nbin = 800)
![test2](https://user-images.githubusercontent.com/78794151/194057060-e8342095-aa14-49cf-8975-eb0e0b6c721b.png)
peakHeatmap(peak = files[[4]], TxDb = txdb, upstream = rel(0.2), downstream = rel(0.2), by = "gene", type = "body", nbin = 800)
![test](https://user-images.githubusercontent.com/78794151/194057188-980128c9-b5a8-447c-a55d-0f4eb27e8dde.png)
peakHeatmap(peak = files[[4]], TxDb = txdb, upstream = 1000, downstream = 1000, by = "gene", type = "body", nbin = 800)
![test1](https://user-images.githubusercontent.com/78794151/194057471-9ebd6f90-6c59-4c4e-8714-bfe82bcb057a.png)
Can we plot heatmaps with different gene sets? like the plots with genesX and gene19 in deeptools).
txdb1 <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb2 <- unlist(fiveUTRsByTranscript(TxDb.Hsapiens.UCSC.hg19.knownGene))
txdb <- list(gene19 = txdb1, geneX = txdb2)
peakHeatmap_multiple_Sets(peak = files[[4]],
upstream = 1000,downstream = 1000,
by = c("gene19","geneX"),
type = "start_site",
TxDb = txdb,nbin = 800)
sample <- list(files[[4]],files[[5]])
peakHeatmap_multiple_Sets(peak = files[4:5],
upstream = 1000,
downstream = 1000,
by = c("gene19","geneX"),type = "start_site",
TxDb = txdb,nbin = 800)
the proportion of picture can be adjusted automactically
txdb1 <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb2 <- unlist(fiveUTRsByTranscript(TxDb.Hsapiens.UCSC.hg19.knownGene))[1:10000,]
txdb <- list(gene19 = txdb1, geneX = txdb2)
peakHeatmap_multiple_Sets(peak = files[[4]],
upstream = 1000,downstream = 1000,
by = c("gene19","geneX"),
type = "start_site",
TxDb = txdb,nbin = 800)
Can we plot heatmaps with different gene sets? like the plots with genesX and gene19 in deeptools).
peak_Profile_Heatmap
function is also created to meet the need of ploting profiling plot and heatmap together, just like deeptools output.
peak_Profile_Heatmap(peak = files[[4]],
upstream = 1000,
downstream = 1000,
by = "gene",
type = "start_site",
TxDb = txdb3,
nbin = 800)
caplot
class is created to debug output of aplot. The only thing it do is to change the align method.
Before changing, the coordinate of heatmap does not match the profiling picture,and the coordinate of heatmap have been changed
after changing the align method
for the need of patch picture i change from creating a new class to overwriting the aplot method
after overwriting the aplot method,
peak_Profile_Heatmap(peak = files[[4]],
upstream = 1000,
downstream = 1000,
by = c("gene19","geneX"),
type = "start_site",
TxDb = txdb,nbin = 800)
peak_Profile_Heatmap(peak = files[4:5],
upstream = 1000,
downstream = 1000,
by = c("gene19","geneX"),
type = "start_site",
TxDb = txdb,nbin = 800)
Also fix issue #200 ,export getAnnoStat
deal with problem in check()
add columns
to annotatePeak() to fix #193
The columns
can help to get the specific columns from database.
From the bug reported from #193, users can not get ENSEMBL even if pass in database
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(EnsDb.Hsapiens.v86)
files <- getSampleFiles()
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
Anno <- (files[[4]], tssRegion=c(-3000, 3000),
TxDb=TxDb.Hsapiens.UCSC.hg19.knownGene,
annoDb="EnsDb.Hsapiens.v86")
> Anno@anno
GRanges object with 1331 ranges and 13 metadata columns:
seqnames ranges strand | V4 V5
<Rle> <IRanges> <Rle> | <character> <numeric>
[1] chr1 815093-817883 * | MACS_peak_1 295.76
[2] chr1 1243288-1244338 * | MACS_peak_2 63.19
[3] chr1 2979977-2981228 * | MACS_peak_3 100.16
[4] chr1 3566182-3567876 * | MACS_peak_4 558.89
[5] chr1 3816546-3818111 * | MACS_peak_5 57.57
... ... ... ... . ... ...
[1327] chrX 135244783-135245821 * | MACS_peak_1327 55.54
[1328] chrX 139171964-139173506 * | MACS_peak_1328 270.19
[1329] chrX 139583954-139586126 * | MACS_peak_1329 918.73
[1330] chrX 139592002-139593238 * | MACS_peak_1330 210.88
[1331] chrY 13845134-13845777 * | MACS_peak_1331 58.39
annotation geneChr geneStart geneEnd geneLength
<character> <integer> <integer> <integer> <integer>
[1] Promoter (2-3kb) 1 803451 812182 8732
[2] Promoter (<=1kb) 1 1243994 1247057 3064
[3] Promoter (<=1kb) 1 2976181 2980350 4170
[4] Promoter (<=1kb) 1 3547331 3566671 19341
[5] Promoter (<=1kb) 1 3816968 3832011 15044
... ... ... ... ... ...
[1327] Intron (uc010nrz.2/2.. 23 135251455 135293518 42064
[1328] Promoter (<=1kb) 23 139173826 139175070 1245
[1329] Promoter (1-2kb) 23 139585152 139587225 2074
[1330] Distal Intergenic 23 139585152 139587225 2074
[1331] Distal Intergenic 24 14517915 14533389 15475
geneStrand geneId transcriptId distanceToTSS SYMBOL GENENAME
<integer> <character> <character> <numeric> <character> <character>
[1] 2 284593 uc001abt.4 -2911 FAM41C FAM41C
[2] 1 126789 uc001aed.3 0 PUSL1 PUSL1
[3] 2 440556 uc001aka.3 0 LINC00982 LINC00982
[4] 2 49856 uc001ako.3 0 WRAP73 WRAP73
[5] 1 100133612 uc001alg.3 0 LINC01134 LINC01134
... ... ... ... ... ... ...
[1327] 1 2273 uc004ezn.2 -5634 FHL1 FHL1
[1328] 1 389895 uc031tkm.1 -320 <NA> <NA>
[1329] 2 6658 uc004fbd.1 1099 SOX3 SOX3
[1330] 2 6658 uc004fbd.1 -4777 SOX3 SOX3
[1331] 2 352887 uc022cji.1 687612 <NA> <NA>
-------
seqinfo: 24 sequences from hg19 genome
after adding columns
, users can get the specific columns from database.
Anno1 <- annotatePeak(files[[4]], tssRegion=c(-3000, 3000),
TxDb=TxDb.Hsapiens.UCSC.hg19.knownGene,
annoDb="EnsDb.Hsapiens.v86",
columns=c("ENTREZID", "GENEID", "SYMBOL", "GENENAME"))
> Anno1@anno
GRanges object with 1331 ranges and 14 metadata columns:
seqnames ranges strand | V4 V5
<Rle> <IRanges> <Rle> | <character> <numeric>
[1] chr1 815093-817883 * | MACS_peak_1 295.76
[2] chr1 1243288-1244338 * | MACS_peak_2 63.19
[3] chr1 2979977-2981228 * | MACS_peak_3 100.16
[4] chr1 3566182-3567876 * | MACS_peak_4 558.89
[5] chr1 3816546-3818111 * | MACS_peak_5 57.57
... ... ... ... . ... ...
[1327] chrX 135244783-135245821 * | MACS_peak_1327 55.54
[1328] chrX 139171964-139173506 * | MACS_peak_1328 270.19
[1329] chrX 139583954-139586126 * | MACS_peak_1329 918.73
[1330] chrX 139592002-139593238 * | MACS_peak_1330 210.88
[1331] chrY 13845134-13845777 * | MACS_peak_1331 58.39
annotation geneChr geneStart geneEnd geneLength
<character> <integer> <integer> <integer> <integer>
[1] Promoter (2-3kb) 1 803451 812182 8732
[2] Promoter (<=1kb) 1 1243994 1247057 3064
[3] Promoter (<=1kb) 1 2976181 2980350 4170
[4] Promoter (<=1kb) 1 3547331 3566671 19341
[5] Promoter (<=1kb) 1 3816968 3832011 15044
... ... ... ... ... ...
[1327] Intron (uc010nrz.2/2.. 23 135251455 135293518 42064
[1328] Promoter (<=1kb) 23 139173826 139175070 1245
[1329] Promoter (1-2kb) 23 139585152 139587225 2074
[1330] Distal Intergenic 23 139585152 139587225 2074
[1331] Distal Intergenic 24 14517915 14533389 15475
geneStrand geneId transcriptId distanceToTSS GENEID
<integer> <character> <character> <numeric> <character>
[1] 2 284593 uc001abt.4 -2911 ENSG00000230368
[2] 1 126789 uc001aed.3 0 ENSG00000169972
[3] 2 440556 uc001aka.3 0 ENSG00000177133
[4] 2 49856 uc001ako.3 0 ENSG00000116213
[5] 1 100133612 uc001alg.3 0 ENSG00000236423
... ... ... ... ... ...
[1327] 1 2273 uc004ezn.2 -5634 ENSG00000022267
[1328] 1 389895 uc031tkm.1 -320 <NA>
[1329] 2 6658 uc004fbd.1 1099 ENSG00000134595
[1330] 2 6658 uc004fbd.1 -4777 ENSG00000134595
[1331] 2 352887 uc022cji.1 687612 <NA>
SYMBOL GENENAME
<character> <character>
[1] FAM41C FAM41C
[2] PUSL1 PUSL1
[3] LINC00982 LINC00982
[4] WRAP73 WRAP73
[5] LINC01134 LINC01134
... ... ...
[1327] FHL1 FHL1
[1328] <NA> <NA>
[1329] SOX3 SOX3
[1330] SOX3 SOX3
[1331] <NA> <NA>
-------
seqinfo: 24 sequences from hg19 genome
here we can see the ENSEMBL id
The figures are pretty weird at first glance. Can you provide a before-and-after comparison? We must confirm that the new implementation produces an identical image as the old one did.
this is the comparison of the heatmap of old version and new version
https://github.com/YuLab-SMU/ChIPseeker/pull/199#issuecomment-1272295000, quite close, but not identical as expected.
the aplot.R is removed, through aplot v=0.1.8
and disorder of the peak in heatmap is corrected
this commit fix bug #201, after updating,
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
files <- getSampleFiles()
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
peakAnno <- annotatePeak(files[[4]],
tssRegion=c(-3000, 3000),
TxDb=txdb, annoDb="org.Hs.eg.db")
plotAnnoPie(peakAnno)
nrow <- nrow(peakAnno@annoStat)
col <- c("#8dd3c7", "#ffffb3", "#bebada",
"#fb8072", "#80b1d3", "#fdb462",
"#b3de69", "#fccde5", "#d9d9d9",
"#bc80bd", "#ccebc5", "#ffed6f")
cols <- col[1:nrow]
plotAnnoPie(peakAnno,col = col[1:nrow])
This pr change the way of plotting heatmap from base system to ggplot system. It adds the function of gradient color by giving
palette
parameter, which is the palette of gradient color.There are two ways to plot the heatmap. The first is giving a created matrix to
tagHeatmap()
fucntion.The second is to use
peakHeatmap()
to do all things in one steps.If users wanna to speed up the process, a binning method can be used by specifing the
nbin
.Since this method use ggplot system, users can use ggplot method to change
a list of peaks can also be plotted.