nloyfer / wgbs_tools

tools for working with Bisulfite Sequencing data while preserving reads intrinsic dependencies

Issue when using `beta_to_450k` #56

Open axxxxx08 opened 8 months ago

axxxxx08 commented 8 months ago

Hi, thanks for developing this useful tool!

However, I ran into a problem when converting my WGBS data to 850K (EPIC) array data. I tried to use `beta_to_450k` for this, but it failed with an error.

The command is:

~/software/wgbs_tools/wgbstools beta_to_450k -o test --EPIC ./PR10200180.sorted.beta

The warning message is:

Invalid input argument
Input file is None

Looking forward to your reply. Thanks!!

nloyfer commented 6 months ago

Hi, this error message comes from the file path validation method. My guess is that it can't find the dictionary file mapping CpG indices to Illumina EPIC IDs. This is a table that looks like this:

$ gunzip -c supplemental/hg19.ilmn2CpG.tsv.gz | head -5 | column -t
cg00000029  21697085  450
cg00000103  6692682   850
cg00000108  4760438   450
cg00000109  5813687   450
cg00000155  10606281  850
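
For anyone who wants to inspect this table programmatically, here is a minimal Python sketch that reads it into a dictionary. It is not part of wgbs_tools: the function name `load_ilmn2cpg` and the `arrays` parameter are hypothetical, and the column meanings (probe ID, CpG index, array tag such as 450 or 850) are inferred only from the sample rows above.

```python
import csv
import gzip


def load_ilmn2cpg(path, arrays=None):
    """Read a gzipped, tab-separated ilmn2CpG table.

    Hypothetical helper, not a wgbs_tools API. Assumes three columns per
    row: Illumina probe ID, CpG index, array tag (e.g. "450" or "850").
    Returns {probe_id: (cpg_index, array_tag)}; if `arrays` is given,
    only rows whose tag is in that collection are kept.
    """
    mapping = {}
    with gzip.open(path, "rt", newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if len(row) < 3:
                continue  # skip blank or malformed lines
            probe_id, cpg_index, tag = row[0], int(row[1]), row[2]
            if arrays is None or tag in arrays:
                mapping[probe_id] = (cpg_index, tag)
    return mapping
```

For example, `load_ilmn2cpg("supplemental/hg19.ilmn2CpG.tsv.gz", arrays={"850"})` would keep only the rows tagged 850.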

Can you check whether these files are present? Which reference genome are you using? If it's hg19, see if this file exists: ~/software/wgbs_tools/supplemental/hg19.ilmn2CpG.tsv.gz. If it's hg38, check for: ~/software/wgbs_tools/supplemental/hg38.ilmn2CpG.tsv.gz

Your reference directory should contain a symbolic link to this file, e.g.

$ ls -l references/hg19/ilmn2CpG.tsv.gz
lrwxrwxrwx 1 nloyfer compbio 39 Aug  3  2021 references/hg19/ilmn2CpG.tsv.gz -> ../../supplemental/hg19.ilmn2CpG.tsv.gz
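
The checks described above can be sketched in Python. This is only an illustration under the path layout quoted in this thread (`supplemental/<genome>.ilmn2CpG.tsv.gz` and `references/<genome>/ilmn2CpG.tsv.gz`); the function `check_ilmn_dict` is a hypothetical helper, not something wgbs_tools provides.

```python
from pathlib import Path


def check_ilmn_dict(wgbs_root, genome):
    """Report whether the Illumina dictionary file and its link exist.

    Hypothetical helper; the layout is assumed from the paths above.
    """
    root = Path(wgbs_root).expanduser()
    target = root / "supplemental" / f"{genome}.ilmn2CpG.tsv.gz"
    link = root / "references" / genome / "ilmn2CpG.tsv.gz"
    return {
        # the table shipped in supplemental/
        "supplemental_file_exists": target.is_file(),
        # something (symlink or file) sits in the reference directory
        "reference_link_present": link.is_symlink() or link.exists(),
        # exists() follows symlinks, so a dangling link yields False
        "reference_link_resolves": link.exists(),
    }
```

Calling `check_ilmn_dict("~/software/wgbs_tools", "hg19")` returns three booleans; any `False` value points to the missing piece.
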

axxxxx08 commented 4 months ago

Thank you for your reply. I discovered that the issue was caused by my reference genome, which is based on CHM13. wgbs_tools is a great tool, and I would like to know whether it supports adding other reference genomes. Is it possible to incorporate an alternative reference genome such as CHM13?