ZW-xjtlu / exomePeak2

Peak calling and differential methylation for MeRIP-Seq
25 stars 5 forks

exomePeak2 run with three replicates produces abnormal padj/score values #17

Open jiansong8811 opened 2 years ago

jiansong8811 commented 2 years ago

Thank you very much for developing exomePeak2; it is a very useful package. While using it, I found that when I pass all three replicates in a single run, the results look strange: almost every peak has a padj of 0.998 and a score of 0.000180458. When I run the three replicates separately, in three runs, the results are normal. My code is below. I am quite confused and hope the author can help clear this up. Thank you very much 🙏🙏🙏

f1 = "CK1IP.sorted.bam"
f2 = "CK2IP.sorted.bam"
f3 = "CK3IP.sorted.bam"
IP_BAM = c(f1, f2, f3)
ff1 = "CK1_input.sorted.bam"
ff2 = "CK2_input.sorted.bam"
ff3 = "CK3_input.sorted.bam"
INPUT_BAM = c(ff1, ff2, ff3)
ft1 = "TL1_IP.sorted.bam"
ft2 = "TL2_IP.sorted.bam"
ft3 = "TL3_IP.sorted.bam"
TREATED_IP_BAM = c(ft1, ft2, ft3)
fft1 = "TL1_input.sorted.bam"
fft2 = "TL2_input.sorted.bam"
fft3 = "TL3_input.sorted.bam"
TREATED_INPUT_BAM = c(fft1, fft2, fft3)

exomePeak2(bam_ip = IP_BAM,
           bam_input = INPUT_BAM,
           bam_treated_input = TREATED_INPUT_BAM,
           bam_treated_ip = TREATED_IP_BAM,
           gff_dir = "IRGSP-1.0.gff")

pauram commented 2 years ago

I ran into exactly the same problem, and I traced it to the LFC shrinkage method that is applied.

When you don't use replicates in the analysis, exomePeak2 will choose Poisson as the Generalized Linear Model for fitting the data (glm_type argument) and there won't be any empirical Bayes shrinkage on log2FC (LFC_shrinkage argument). Thus, the DiffModLog2FC score will simply correspond to the difference between ModLog2FC_control score and ModLog2FC_treated score.

If you do use replicates, exomePeak2 chooses the DESeq2 GLM by default and applies apeglm as the LFC shrinkage method. LFC shrinkage is intended to reduce the FDR, and this choice affects the p-value, the adjusted p-value, and DiffModLog2FC.

I'm still trying to figure out whether this shrinkage method is inappropriate in this case or whether my m6A-seq data simply do not show significant differential methylation.
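One way to check whether shrinkage is responsible would be to rerun the replicated analysis with the no-replicate settings forced explicitly. The following is only a sketch under assumptions: it reuses the file names from the original post, and it assumes an exomePeak2 release older than 1.9.1 that accepts `glm_type = "Poisson"` and `LFC_shrinkage = "none"`, as I understand the linked manual page. The `run_args` list and the `can_run` guard are my own illustrative additions, so the snippet does nothing when the package or the files are missing.

```r
# File names are taken from the original post.
ip_bam            <- c("CK1IP.sorted.bam", "CK2IP.sorted.bam", "CK3IP.sorted.bam")
input_bam         <- c("CK1_input.sorted.bam", "CK2_input.sorted.bam", "CK3_input.sorted.bam")
treated_ip_bam    <- c("TL1_IP.sorted.bam", "TL2_IP.sorted.bam", "TL3_IP.sorted.bam")
treated_input_bam <- c("TL1_input.sorted.bam", "TL2_input.sorted.bam", "TL3_input.sorted.bam")

# Same call as in the original post, plus the two arguments discussed above.
run_args <- list(
  bam_ip            = ip_bam,
  bam_input         = input_bam,
  bam_treated_ip    = treated_ip_bam,
  bam_treated_input = treated_input_bam,
  gff_dir           = "IRGSP-1.0.gff",
  glm_type          = "Poisson",  # assumption: forces the model used without replicates
  LFC_shrinkage     = "none"      # assumption: disables empirical Bayes shrinkage
)

bam_files <- c(ip_bam, input_bam, treated_ip_bam, treated_input_bam)

# Only run when the package is installed, is an older release that still has
# these arguments, and all input files are present in the working directory.
can_run <- requireNamespace("exomePeak2", quietly = TRUE) &&
  utils::packageVersion("exomePeak2") < "1.9.1" &&
  all(file.exists(c(bam_files, run_args$gff_dir)))

if (can_run) {
  result <- do.call(exomePeak2::exomePeak2, run_args)
} else {
  message("exomePeak2 (< 1.9.1) or the input files are not available; nothing was run.")
}
```

If the padj and DiffModLog2FC values from this run look like those from the separate single-replicate runs, that would point to the DESeq2/apeglm defaults rather than to the data.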

Here you will find a list of the arguments and all the possible options: https://rdrr.io/bioc/exomePeak2/man/exomePeak2.html

Please note I'm just a user and these are my own deductions based on information I found online and tests I ran.

ZW-xjtlu commented 2 years ago

Hello, regarding this issue, I recommend using exomePeak2 version 1.9.1 or later. In the latest version, the statistical test is set to the Poisson GLM regardless of whether replicates are present.
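To check which version is installed before rerunning, something like the following could be used. It is a minimal sketch: `needs_update` is a hypothetical helper written for illustration, and the update command in the message assumes the package was installed through Bioconductor.

```r
# Minimum version suggested in the reply above.
min_version <- "1.9.1"

# Hypothetical helper: TRUE when `installed` (a package_version object)
# is older than `required`.
needs_update <- function(installed, required = min_version) {
  installed < package_version(required)
}

if (requireNamespace("exomePeak2", quietly = TRUE)) {
  installed <- utils::packageVersion("exomePeak2")
  if (needs_update(installed)) {
    message("exomePeak2 ", as.character(installed), " is older than ", min_version,
            "; consider updating, e.g. with BiocManager::install(\"exomePeak2\").")
  } else {
    message("exomePeak2 ", as.character(installed), " meets the suggested minimum.")
  }
} else {
  message("exomePeak2 is not installed.")
}
```

Version components are compared numerically, so 1.10.0 counts as newer than 1.9.1.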