Closed hmyh1202 closed 2 years ago
Hi @hmyh1202, thanks for your interest! pacbio_cpg_tools is great and performs well. ccsmeth uses a different model architecture from primrose and pacbio_cpg_tools. However, we haven't yet trained a stable ccsmeth model. I will release the model ASAP.
Best, Peng
pacbio_cpg_tools outputs only about 30M CpG sites, whereas Bismark on 30x NGS data reports more than 40M CpG sites. How many CpG sites does your tool report in your tests? Thank you!
We also output only about 30M CpGs, and 30M is the number of all CpGs in the human genome when only forward-strand CpGs are counted. I think Bismark outputs more than 40M CpGs because it treats a CpG on the reverse strand as a different site from the CpG on the forward strand, and it outputs both whenever both are covered by reads. However, in mammals the methylation status of the cytosines on the two strands of a CpG is symmetric in most cases (both methylated or both unmethylated), so it is fine to output only forward-strand CpGs.
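To make the counting difference concrete, here is a minimal sketch of how per-strand CpG calls could be collapsed onto forward-strand sites. The record layout `(chrom, pos, strand, meth_count, unmeth_count)` with a 0-based cytosine position is an assumption made for illustration; it is not the actual output format of Bismark, pacbio_cpg_tools, or ccsmeth. The only fact it relies on is that the reverse-strand cytosine of a CpG sits one base downstream of the forward-strand cytosine.

```python
from collections import defaultdict


def merge_cpg_strands(records):
    """Collapse per-strand CpG counts onto forward-strand coordinates.

    records: iterable of (chrom, pos, strand, meth_count, unmeth_count),
    where pos is the 0-based position of the cytosine (hypothetical layout).
    Returns {(chrom, forward_pos): (meth_count, unmeth_count)}.
    """
    merged = defaultdict(lambda: [0, 0])
    for chrom, pos, strand, meth, unmeth in records:
        # The reverse-strand C pairs with the G of the CpG, i.e. forward C + 1.
        forward_pos = pos if strand == "+" else pos - 1
        site = merged[(chrom, forward_pos)]
        site[0] += meth
        site[1] += unmeth
    return {key: tuple(counts) for key, counts in merged.items()}


records = [
    ("chr1", 100, "+", 8, 2),  # forward-strand C of one CpG
    ("chr1", 101, "-", 6, 4),  # reverse-strand C of the same CpG
    ("chr1", 200, "+", 0, 5),  # a CpG covered on the forward strand only
]
merged = merge_cpg_strands(records)
print(len(records), "strand-specific records ->", len(merged), "CpG sites")
```

Note that merging like this hides hemi-methylated sites, so it is only appropriate when strand symmetry is an acceptable assumption.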
Thanks for your reply!
Yes, Bismark treats CpGs on the reverse strand as different from those on the forward strand, and it reports both forward and reverse cytosines, because hemi-methylation of CpG dinucleotides is also a point worth discussing.
Another question: how can I calculate the methylation level of a region based on the 30M sites? In general, a region's level is calculated either as total mC count / (total mC count + total unmethylated C count), or as number of mC sites / (number of all C sites in the reference). Which method should I choose, or is there another one?
Thank you.
From my point of view, since we can give a binary label (0/1) to each CpG in each read, I think the read-level ratio, total mC count / (total mC count + total unmethylated C count), is a common way to measure the methylation level of a region.
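As a rough illustration, the read-level ratio could be computed as below; the site-count ratio from the question is included for comparison. This is only a sketch under stated assumptions: the `(pos, meth_count, unmeth_count)` site layout, the half-open region `[start, end)`, and the `min_frac` cutoff used to call a site methylated are all hypothetical choices, not part of ccsmeth or any other tool.

```python
def region_methylation(sites, start, end):
    """Read-level ratio: total mC count / (total mC + total unmethylated C).

    sites: list of (pos, meth_count, unmeth_count) on one chromosome
    (hypothetical layout). Returns None if the region has no coverage.
    """
    meth = unmeth = 0
    for pos, m, u in sites:
        if start <= pos < end:
            meth += m
            unmeth += u
    total = meth + unmeth
    return meth / total if total else None


def mc_site_density(sites, start, end, n_ref_cpgs, min_frac=0.5):
    """Site-count ratio: methylated sites / all reference CpGs in the region.

    A site counts as methylated when its own methylated fraction is at least
    min_frac (an assumed cutoff). n_ref_cpgs must be supplied by the caller.
    """
    n_methylated = sum(
        1
        for pos, m, u in sites
        if start <= pos < end and (m + u) > 0 and m / (m + u) >= min_frac
    )
    return n_methylated / n_ref_cpgs if n_ref_cpgs else None


sites = [(10, 9, 1), (20, 1, 9), (30, 5, 5), (100, 10, 0)]
level = region_methylation(sites, 0, 50)
density = mc_site_density(sites, 0, 50, n_ref_cpgs=5)
print(level, density)
```

The two numbers answer different questions: the first weights every read observation equally, while the second depends on the chosen cutoff and penalizes reference CpGs that have no coverage.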
Fine, so the second method is an mC density measure. Thank you!
Hello,
What is the difference between PacBio CpG tools and your ccsmeth software? And where can I get the model to use with --model_file /path/to/ccsmeth/models/model.ckpt?
Thank you!