ZhouQiangwei / MethHaplo

allele specific DNA methylation haplotype region
MIT License

Columns with .mr file #4

Open YuanTian1991 opened 2 years ago

YuanTian1991 commented 2 years ago

Hi:

Thanks for developing this nice tool. I want to use it for ASM detection. However, I am not sure what the columns in the .mr file mean. The README indicates that the last three columns are:

[image: screenshot from the README]

However, the example .mr file in the test folder seems to have more columns, and its first 7 columns do not entirely match what the README describes.

Best,
Tian

ZhouQiangwei commented 2 years ago

Sorry, there is a mistake in the instructions. The format should be: chr pos strand context methC coverage methlevel. Thank you for using the tool.
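For readers who want to check their own file against this 7-column layout, here is a minimal Python sketch. It assumes the fields are whitespace-separated and that any columns after the seventh can be ignored; neither point is confirmed in this thread. The names `MrRecord` and `parse_mr_line` are hypothetical helpers, not part of MethHaplo.

```python
from typing import NamedTuple


class MrRecord(NamedTuple):
    """One line of a .mr file, using the column order given above."""
    chrom: str
    pos: int
    strand: str
    context: str
    meth_c: int
    coverage: int
    meth_level: float


def parse_mr_line(line: str) -> MrRecord:
    # Assumption: fields are separated by tabs or spaces.
    fields = line.split()
    if len(fields) < 7:
        raise ValueError(f"expected at least 7 columns, got {len(fields)}")
    # Assumption: only the first 7 columns matter; extras are dropped.
    chrom, pos, strand, context, meth_c, coverage, meth_level = fields[:7]
    return MrRecord(chrom, int(pos), strand, context,
                    int(meth_c), int(coverage), float(meth_level))


if __name__ == "__main__":
    # Made-up example line, only to show the expected shape.
    print(parse_mr_line("chr1\t874240\t+\tCG\t3\t12\t0.25"))
```

Running this over every line of a .mr file is a quick way to catch rows with too few columns or non-numeric counts before passing the file to the tool.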

YuanTian1991 commented 2 years ago

So I just need the first 7 columns, right? The rest are not needed.

The second thing I noticed is that in the example file, every CpG has coverage above 5. Do I need to filter on coverage?


Please review my steps here with IMR90 WGBS data:

  1. Map the WGBS fastq files to a bam file (I use gemBS) and name it myData.bam. The reference genome hg38.fa was downloaded from here.
  2. Call CpG methylation (I use gemBS).
  3. Manually extract the 7 columns for all CpGs with coverage >= 5 into myData.mr, which holds the CpG methylation ratios.
  4. Run the command below:
methHaplo -M asm -a N -m myData.mr -b myData.bam -o myData.output -g hg38.fa

However, I did not get any ASM in the output. Ultimately, I want to make sure my result matches what your database shows here. Then I can use methHaplo in my other projects.
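Step 3 above (keeping CpGs with coverage >= 5 and writing the 7 columns) could be sketched in Python as follows. The input tuple layout is an assumption, since the gemBS output columns are not described in this thread. Writing methlevel as a 0 to 1 fraction with four decimals is also a guess. `to_mr_lines` is a hypothetical helper.

```python
from typing import Iterable, Iterator, Tuple

# Assumed input shape: (chrom, pos, strand, context, methC, coverage).
CpG = Tuple[str, int, str, str, int, int]


def to_mr_lines(records: Iterable[CpG], min_coverage: int = 5) -> Iterator[str]:
    """Yield tab-separated 7-column lines for sites with enough coverage."""
    for chrom, pos, strand, context, meth_c, coverage in records:
        # Skip low-coverage sites; also avoids dividing by zero.
        if coverage == 0 or coverage < min_coverage:
            continue
        # Assumption: methlevel is methC / coverage, written as a fraction.
        level = meth_c / coverage
        yield "\t".join([chrom, str(pos), strand, context,
                         str(meth_c), str(coverage), f"{level:.4f}"])


if __name__ == "__main__":
    # Made-up sites for illustration only.
    sites = [("chr1", 100, "+", "CG", 3, 12), ("chr1", 200, "-", "CG", 1, 4)]
    for line in to_mr_lines(sites):
        print(line)
```

Whether the tool expects a fraction or a percentage in the last column, and whether strand must match the bam alignment strand, would need to be checked against the example file in the test folder.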

YuanTian1991 commented 2 years ago

The result I get is like this:

 MyData.plus.txt
#chrom  coordinate1 coordinate2 { #MM   #MU #UM #UU }/{#MV  Ref|Var}    pvalue  adjust_pvalue
chr1    874240  874246  0   0   0   23  1.000000    1.000000
******
chr1    910681  910697  0   1   0   17  1.000000    1.000000
******
chr1    916363  916381  0   2   0   10  1.000000    1.000000
chr1    916381  916391  0   1   0   15  1.000000    1.000000
chr1    916391  916419  0   0   0   17  1.000000    1.000000
******
chr1    919585  919591  0   1   0   25  1.000000    1.000000
******
chr1    919812  919829  0   0   0   17  1.000000    1.000000
******
chr1    925987  926008  0   0   0   15  1.000000    1.000000
******
chr1    931816  931846  0   0   0   20  1.000000    1.000000
******
chr1    935421  935429  0   0   0   18  1.000000    1.000000
******
chr1    936350  936389  0   0   0   11  1.000000    1.000000
chr1    936389  936395  0   0   0   24  1.000000    1.000000
chr1    936395  936428  0   0   0   13  1.000000    1.000000
******
chr1    937786  937802  0   0   0   19  1.000000    1.000000
******
chr1    959829  959856  0   0   0   5   1.000000    1.000000

All p-values are 1, and it seems there are no MM reads at all...
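To check this over the whole file rather than by eye, a small Python sketch like the one below can tally the rows. It reads the columns as chrom, coordinate1, coordinate2, MM, MU, UM, UU, pvalue, adjust_pvalue, based only on the header line above, and skips the `******` separators. `summarize_asm_output` and the 0.05 cutoff are hypothetical choices, not part of MethHaplo.

```python
from typing import Dict, Iterable


def summarize_asm_output(lines: Iterable[str], alpha: float = 0.05) -> Dict[str, int]:
    """Count data rows, rows with MM > 0, and rows with adjusted p < alpha."""
    total = with_mm = significant = 0
    for raw in lines:
        line = raw.strip()
        # Skip blanks, the '#' header, and the '******' block separators.
        if not line or line.startswith("#") or line.startswith("*"):
            continue
        fields = line.split()
        if len(fields) != 9:
            # Not a 9-column data row (e.g. a file-name line); ignore it.
            continue
        mm = int(fields[3])
        adjusted_p = float(fields[8])
        total += 1
        if mm > 0:
            with_mm += 1
        if adjusted_p < alpha:
            significant += 1
    return {"rows": total, "rows_with_MM": with_mm, "significant": significant}


if __name__ == "__main__":
    # A few rows copied from the output shown above.
    sample = [
        "#chrom  coordinate1 coordinate2 { #MM   #MU #UM #UU }/{#MV  Ref|Var}    pvalue  adjust_pvalue",
        "chr1    874240  874246  0   0   0   23  1.000000    1.000000",
        "******",
        "chr1    910681  910697  0   1   0   17  1.000000    1.000000",
    ]
    print(summarize_asm_output(sample))
```

If `rows_with_MM` is zero across the entire file, the problem is more likely upstream (for example, how methylated reads are being recognised in the bam file) than in the statistical test itself.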