jordiabante / CpelAsm.jl

Julia module to perform haplotype allele-specific DNA methylation analysis.
https://jordiabante.github.io/CpelAsm.jl/dev/
MIT License

"NOPS" term in the <samplename>_phased_het.cpelasm.gff file #13

Closed. nmfad closed this issue 4 months ago

nmfad commented 4 months ago

Hi Jordi

Below is my understanding of the columns in the GFF file that contains the haplotypes being analyzed.

In the GFF file below, the columns are:

1) Chromosome
2) Blank
3) Phased heterozygous SNV from the phased VCF file
4) Haplotype start
5) Haplotype end
6) Blank
7) Blank
8) Blank
9) The number of CpGs and their respective positions within the haplotype. It also indicates whether a CpG site is heterozygous because an SNV falls on it, for example in the third entry below.

Can you confirm that I am interpreting this correctly?

X . NOPS-1/1 168974 169174 . . . N=6;CpGs=[168982, 168986, 168992, 169037, 169158, 169170]
X . 170530-1/2 170430 170559 . . . N=14;CpGs=[170430, 170440, 170455, 170459, 170462, 170468, 170474, 170479, 170503, 170506, 170525, 170536, 170552, 170559]
X . 170530-2/2 170559 170855 . . . N=15;CpGs=[170565, 170585, 170598, 170607, 170625, 170631, 170642, 170644, 170670, 170677, 170680, 170685, 170731, 170803, 170819];hetCpGg1=[170751]
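For anyone reading along, the column layout described above can be sketched as a small Python parser. This is only an illustration of my reading of the file, not part of CpelAsm.jl: the function name `parse_record` and the dictionary keys are made up here, and splitting the third column into an SNV label and a window index (e.g. `170530` and `1/2`) is an assumption about what that column encodes.

```python
import re

def parse_record(line):
    """Parse one line of the *_phased_het.cpelasm.gff file into a dict.

    Hypothetical helper; the field meanings follow the interpretation
    discussed in this issue, not an official specification.
    """
    # Split on whitespace at most 8 times so the attribute column,
    # which itself contains spaces, stays in one piece.
    fields = line.split(None, 8)
    chrom, label, start, end, attrs = (
        fields[0], fields[2], int(fields[3]), int(fields[4]), fields[8]
    )
    # Assumption: column 3 is "<SNV position or NOPS>-<window>/<total>".
    snv, _, window = label.rpartition("-")
    record = {"chrom": chrom, "snv": snv, "window": window,
              "start": start, "end": end}
    # Column 9 is a ';'-separated list of key=value pairs.
    for pair in attrs.split(";"):
        key, _, value = pair.partition("=")
        if value.startswith("["):
            record[key] = [int(x) for x in re.findall(r"\d+", value)]
        else:
            record[key] = int(value)
    return record

example = ("X . NOPS-1/1 168974 169174 . . . "
           "N=6;CpGs=[168982, 168986, 168992, 169037, 169158, 169170]")
rec = parse_record(example)
print(rec["snv"], rec["N"], rec["CpGs"])
```

A heterozygous CpG entry such as `hetCpGg1=[170751]` ends up as an extra key in the same dictionary, so records with and without it can be handled uniformly.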

Can you explain what the term NOPS means? My guess is that it stands for "No phased SNVs". If that is the case, how is the corresponding haplotype generated without an SNV being present?

As always, thanks very much for any insight you can provide. One of the reasons I am asking is that I would like to profile the number of SNVs associated with "N" haplotypes and the number of haplotypes associated with "N" SNVs.
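The kind of profiling I have in mind could look roughly like the Python sketch below. It assumes that the third column has the form `<SNV position or NOPS>-<window>/<total>` and that entries starting with `NOPS` carry no phased SNV; both are my reading of the file rather than documented behaviour, and the function name `windows_per_snv` is hypothetical.

```python
from collections import Counter

def windows_per_snv(records):
    """Count how many haplotype windows are attached to each SNV.

    `records` is an iterable of (chromosome, column-3 label) pairs.
    Entries whose label starts with "NOPS" are skipped, on the
    assumption that they contain no phased SNV.
    """
    counts = Counter()
    for chrom, label in records:
        # Drop the trailing "-<window>/<total>" part of the label.
        snv = label.rpartition("-")[0]
        if snv == "NOPS":
            continue
        counts[(chrom, snv)] += 1
    return counts

# The three entries shown above, reduced to columns 1 and 3.
records = [("X", "NOPS-1/1"), ("X", "170530-1/2"), ("X", "170530-2/2")]

per_snv = windows_per_snv(records)
# Histogram: how many SNVs are associated with k windows.
histogram = Counter(per_snv.values())
print(dict(per_snv))
print(dict(histogram))
```

For the three entries above this gives one SNV (`170530` on chromosome X) associated with two windows, with the `NOPS` window excluded.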

jordiabante commented 4 months ago

That's correct. The windows where there are no phased SNPs are the ones used to generate the null distribution which I alluded to in the previous issue. Hopefully this makes sense!

nmfad commented 4 months ago

Yes, that makes sense, thank you! That makes it clear: I can only use the haplotypes that include SNVs for further analysis. Thanks again!