ZW-xjtlu / exomePeak2

Peak calling and differential methylation for MeRIP-Seq
25 stars 5 forks source link

请教最新版exomePeak2 (1.7)报错 #16

Open ZehongW opened 2 years ago

ZehongW commented 2 years ago

作者您好,我最近用新版exomePeak2的时候报了这样的错: Import genomic features from the file as a GRanges object ... OK Prepare the 'metadata' data frame ... OK Make the TxDb object ... OK Extract bin features ... OK Count reads on bin features ... OK Identify background features ... OK Estimate sample sepecific size factors from the background ... OK Detect peaks with GLM ... OK Count reads on peaks ... OK Calculate offset matrix for peaks ... OK Detect differentially modified peaks with interactive GLM ... OK Error in .local(object, con, format, ...) : Scores must be non-NA numeric values Calls: exomePeak2 ... callGeneric -> eval -> eval -> export -> export -> .local

以下是我的代码(3个重复): library(exomePeak2)

setwd("/home/wuzh/shMETTL14") GENE_ANNO_GTF = '/home/wuzh/hisat2_index/hg19/hg19.gtf'

control_Input = c('siControl_1_input.bam','siControl_2_input.bam','siControl_3_input.bam') control_IP = c('siControl_1_IP.bam','siControl_2_IP.bam','siControl_3_IP.bam')

shmettl14_Input = c('siMETTL14_1_input.bam','siMETTL14_2_input.bam','siMETTL14_3_input.bam') shmettl14_IP = c('siMETTL14_1_IP.bam','siMETTL14_2_IP.bam','siMETTL14_3_IP.bam')

sep1 <- exomePeak2( bam_ip = control_IP, bam_input = control_Input, bam_ip_treated = shmettl14_IP, bam_input_treated = shmettl14_Input, gff = GENE_ANNO_GTF,

genome = "hg19",

              strandness = "1st_strand",
              parallel = 12,
              save_output = TRUE,
              save_dir = getwd(),
              experiment_name = "exomePeak2_updated_output_shmettl14")

之前用两个重复的时候好像可以跑通,挺疑惑的。希望您能帮我看一下,十分感激!

ZW-xjtlu commented 2 years ago

exomePeak2的使用者您好,这个BUG是源于保存BED文件时出现了值为NA的-log10 values,以至于后续步骤无法运行, 我将于今天之内修复这个问题并通知您。

On May 15, 2022, at 1:49 AM, ZehongW @.**@.>> wrote:

作者您好,我最近用新版exomePeak2的时候报了这样的错: Import genomic features from the file as a GRanges object ... OK Prepare the 'metadata' data frame ... OK Make the TxDb object ... OK Extract bin features ... OK Count reads on bin features ... OK Identify background features ... OK Estimate sample sepecific size factors from the background ... OK Detect peaks with GLM ... OK Count reads on peaks ... OK Calculate offset matrix for peaks ... OK Detect differentially modified peaks with interactive GLM ... OK Error in .local(object, con, format, ...) : Scores must be non-NA numeric values Calls: exomePeak2 ... callGeneric -> eval -> eval -> export -> export -> .local

以下是我的代码(3个重复): library(exomePeak2)

setwd("/home/wuzh/shMETTL14") GENE_ANNO_GTF = '/home/wuzh/hisat2_index/hg19/hg19.gtf'

control_Input = c('siControl_1_input.bam','siControl_2_input.bam','siControl_3_input.bam') control_IP = c('siControl_1_IP.bam','siControl_2_IP.bam','siControl_3_IP.bam')

shmettl14_Input = c('siMETTL14_1_input.bam','siMETTL14_2_input.bam','siMETTL14_3_input.bam') shmettl14_IP = c('siMETTL14_1_IP.bam','siMETTL14_2_IP.bam','siMETTL14_3_IP.bam')

sep1 <- exomePeak2( bam_ip = control_IP, bam_input = control_Input, bam_ip_treated = shmettl14_IP, bam_input_treated = shmettl14_Input, gff = GENE_ANNO_GTF,

genome = "hg19",

strandness = "1st_strand", parallel = 12, save_output = TRUE, save_dir = getwd(), experiment_name = "exomePeak2_updated_output_shmettl14") 之前用两个重复的时候好像可以跑通,挺疑惑的。希望您能帮我看一下,十分感激!

— Reply to this email directly, view it on GitHubhttps://github.com/ZW-xjtlu/exomePeak2/issues/16, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEU2CIKXETRODQKU56A36ATVJ7RUBANCNFSM5V535UKA. You are receiving this because you are subscribed to this thread.Message ID: @.***>

ZW-xjtlu commented 2 years ago

Hi ZehongW,

The issue has been fixed now, please try again.

On May 15, 2022, at 1:49 AM, ZehongW @.**@.>> wrote:

作者您好,我最近用新版exomePeak2的时候报了这样的错: Import genomic features from the file as a GRanges object ... OK Prepare the 'metadata' data frame ... OK Make the TxDb object ... OK Extract bin features ... OK Count reads on bin features ... OK Identify background features ... OK Estimate sample sepecific size factors from the background ... OK Detect peaks with GLM ... OK Count reads on peaks ... OK Calculate offset matrix for peaks ... OK Detect differentially modified peaks with interactive GLM ... OK Error in .local(object, con, format, ...) : Scores must be non-NA numeric values Calls: exomePeak2 ... callGeneric -> eval -> eval -> export -> export -> .local

以下是我的代码(3个重复): library(exomePeak2)

setwd("/home/wuzh/shMETTL14") GENE_ANNO_GTF = '/home/wuzh/hisat2_index/hg19/hg19.gtf'

control_Input = c('siControl_1_input.bam','siControl_2_input.bam','siControl_3_input.bam') control_IP = c('siControl_1_IP.bam','siControl_2_IP.bam','siControl_3_IP.bam')

shmettl14_Input = c('siMETTL14_1_input.bam','siMETTL14_2_input.bam','siMETTL14_3_input.bam') shmettl14_IP = c('siMETTL14_1_IP.bam','siMETTL14_2_IP.bam','siMETTL14_3_IP.bam')

sep1 <- exomePeak2( bam_ip = control_IP, bam_input = control_Input, bam_ip_treated = shmettl14_IP, bam_input_treated = shmettl14_Input, gff = GENE_ANNO_GTF,

genome = "hg19",

strandness = "1st_strand", parallel = 12, save_output = TRUE, save_dir = getwd(), experiment_name = "exomePeak2_updated_output_shmettl14") 之前用两个重复的时候好像可以跑通,挺疑惑的。希望您能帮我看一下,十分感激!

— Reply to this email directly, view it on GitHubhttps://github.com/ZW-xjtlu/exomePeak2/issues/16, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEU2CIKXETRODQKU56A36ATVJ7RUBANCNFSM5V535UKA. You are receiving this because you are subscribed to this thread.Message ID: @.***>

ZehongW commented 2 years ago

作者您好,无法输出结果的bug修复了,十分感谢。但是输出的结果IP组和input的结果似乎放反了,比如:

RPM.IP.Control RPM.input.Control RPM.IP.Treated RPM.input.Treated geneID diff.log2FC pvalue fdr score

1.3 4.21 0.22 4.63 ENSG00000000457.9 -3.14652 6.15E-05 0.000467 4.211304 导致后面差异的对数变化反了

ZW-xjtlu commented 2 years ago

Diff.log2FC是没法从RPM中直接推算出来的,因为实际peak的size factor的定义要复杂的多。 出现某一个peak中反过来是常见的,判断出没出大问题需要看全部peak上的correlation。

目前在我跑的例子中是没发现这种大问题的。

On May 18, 2022, at 7:14 PM, ZehongW @.**@.>> wrote:

作者您好,无法输出结果的bug修复了,十分感谢。但是输出的结果IP组和input的结果似乎放反了,比如:

RPM.IP.Control RPM.input.Control RPM.IP.Treated RPM.input.Treated geneID diff.log2FC pvalue fdr score

1.3 4.21 0.22 4.63 ENSG00000000457.9 -3.14652 6.15E-05 0.000467 4.211304 导致后面差异的对数变化反了

— Reply to this email directly, view it on GitHubhttps://github.com/ZW-xjtlu/exomePeak2/issues/16#issuecomment-1129879634, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEU2CILWRDUB2SPC27QKAE3VKTGKHANCNFSM5V535UKA. You are receiving this because you commented.Message ID: @.***>

ZehongW commented 2 years ago

感谢解答!

hlldhy commented 1 year ago

你好,关于我目前有一份数据为对照组和METTL3甲基化蛋白被敲除的MeRIP-seq数据样本,按道理讲,METTL3甲基化蛋白被敲除的MeRIP-seq数据样本中m6A的水平要比对照组要低,但是出来的结果却是METTL3甲基化蛋白被敲除的MeRIP-seq数据样本中m6A的水平要比对照组要高很多很多,因此我去查看了MACS2的差异peak的结果:METTL3甲基化蛋白被敲除的MeRIP-seq数据样本中m6A的水平要比对照组要低。这个该如何解释呢?

hlldhy commented 1 year ago

作者您好,无法输出结果的bug修复了,十分感谢。但是输出的结果IP组和input的结果似乎放反了,比如:

RPM.IP.Control RPM.input.Control RPM.IP.Treated RPM.input.Treated geneID diff.log2FC pvalue fdr score

1.3 4.21 0.22 4.63 ENSG00000000457.9 -3.14652 6.15E-05 0.000467 4.211304 导致后面差异的对数变化反了

这种情况我这里数据中有不少这种情况

hlldhy commented 1 year ago

chr | chromStart | chromEnd | name | strand | blockCount | blockSizes | blockStarts | RPM.IP.Control | RPM.input.Control | RPM.IP.Treated | RPM.input.Treated | geneID | diff.log2FC | pvalue | fdr | score

11 | chr12 | 125626656 | 125626831 | 11 | + | 1 | 175 | 0 | 4.65 | 5.62 | 2.83 | 6.86 | AACS | 1.104841855 | 5.86E-26 | 1.43E-25 | 25.23227465 我的结果中出现了这种情况,就是 diff.log2FC该为负值,但是给出的结果为正值,不知道是什么原因

ZehongW commented 1 year ago

chr | chromStart | chromEnd | name | strand | blockCount | blockSizes | blockStarts | RPM.IP.Control | RPM.input.Control | RPM.IP.Treated | RPM.input.Treated | geneID | diff.log2FC | pvalue | fdr | score

11 | chr12 | 125626656 | 125626831 | 11 | + | 1 | 175 | 0 | 4.65 | 5.62 | 2.83 | 6.86 | AACS | 1.104841855 | 5.86E-26 | 1.43E-25 | 25.23227465 我的结果中出现了这种情况,就是 diff.log2FC该为负值,但是给出的结果为正值,不知道是什么原因

应该是有问题的 我暂时没有用这个版本,等待后续bug修复

ZW-xjtlu commented 1 year ago

用户您好,一切应该以diff.log2FC为准,因为它是有normalize过GC content bias和IP efficiency changes的。 前面的IP与input的RPM只normalize过了sequencing depth,因此会有和diff.log2FC对不上的情况。

2023年3月14日 上午11:08,ZehongW @.***> 写道: