Dear Prof Liu,
Here are the features generated for training by the first-step script. I noticed that one feature, Offset_frag, is not mentioned in the original article, so it may not have been used. I have some questions: what does Offset_frag refer to, how does it differ from Dist_frag_end, and does it help model training? Thanks for your reply.
Original article: At each CpG in each fragment in the bam file (CpG point), we can obtain three features: the fragment’s length, the CpG’s distance to the center of that fragment, and the fragment coverage at that particular CpG position in the reference genome.
chr start end readName FragLen Frag_strand methy_stat Norm_Frag_cov baseQ Offset_frag Dist_frag_end methyPrior
chr22 10527860 10527862 SNL144:297:HYWH3BCXY:1:1111:15154:68116 154 - m 1.122016 35 1 1 NaN
chr22 10527872 10527874 SNL144:297:HYWH3BCXY:1:1111:15154:68116 154 - m 1.122016 40 13 13 NaN
chr22 10527878 10527880 SNL144:297:HYWH3BCXY:1:1111:15154:68116 154 - m 1.122016 40 19 19 NaN
chr22 10527939 10527941 SNL144:297:HYWH3BCXY:1:1111:15154:68116 154 - m 1.122016 39 80 74 NaN
chr22 10538977 10538979 SNL144:297:HYWH3BCXY:1:2115:3237:76894 166 - m 1.122016 39 5 5 NaN
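To make the question concrete, here is a quick check on the five rows above. It is only a sketch under my own assumption, not something stated in the article or the script: I am guessing that Offset_frag is the CpG's 1-based position counted from one end of the fragment, and that Dist_frag_end is the distance to whichever fragment end is closer, i.e. min(Offset_frag, FragLen - Offset_frag). The function name `nearest_end_distance` is hypothetical.

```python
# (FragLen, Offset_frag, Dist_frag_end) copied from the five rows above.
rows = [
    (154, 1, 1),
    (154, 13, 13),
    (154, 19, 19),
    (154, 80, 74),
    (166, 5, 5),
]


def nearest_end_distance(frag_len, offset):
    """Hypothesis only: distance from the CpG to the closer fragment end."""
    return min(offset, frag_len - offset)


def check(rows):
    """Return one boolean per row: does the hypothesis match Dist_frag_end?"""
    return [nearest_end_distance(f, o) == d for f, o, d in rows]


print(check(rows))
```

If this holds on the full file as well, the two columns would only differ for CpGs past the fragment midpoint (like the row with offset 80 and distance 74), but I would appreciate confirmation of the actual definitions.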