Closed by pelutz 3 years ago
Thank you! This is a good catch. On March 26th, 2020 (about 11 months ago now), we released a new version containing a bug fix that improved CpG calling in bins with certain types of reads. The sample data in the repo was last updated 15 months ago, so it was generated before this fix and will differ from current output.
Thank you for pointing this out. I will update this sample data and add it to the repo. The results you are getting when you run the data should be correct. The sample data is just old.
Sample data has been updated on the master branch. If you find another discrepancy please feel free to comment here or open a new issue.
Great, thanks for the fast response! I can confirm that we now have exactly identical results, for both the coverage and clustering functions.
Describe the bug
I installed CluBCpG in a Conda environment on a Linux server, and was able to run test_Module.py successfully:
But when I apply the clubcpg-coverage command to the A_test.chr19.bam file, I get a different output (222 lines) from the one available on GitHub (https://github.com/waterlandlab/CluBCpG/tree/master/SampleData/COVERAGE/CompleteBins.A_test.chr19.bam.chr19.csv, which has 562 lines), with some missing bins and different numbers of reads or even CpGs for some bins:
chr19_3079700,2,3
chr19_3079800,13,2
chr19_3080000,2,8
chr19_3080100,16,1
chr19_3080200,5,8
chr19_3080300,16,1
chr19_3080400,24,1
chr19_3080500,5,1
chr19_3080800,4,1
chr19_3081300,12,1
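For anyone who wants to quantify this kind of discrepancy rather than eyeball it, here is a rough sketch of a comparison script. It is not part of CluBCpG; the function names are made up for illustration, and it assumes only what the rows above show: each line is a bin ID followed by two integer columns, with no assumption about which column holds reads and which holds CpGs.

```python
import csv


def load_bins(lines):
    """Parse rows shaped like 'chr19_3079700,2,3' into {bin_id: (int, int)}.

    `lines` can be an open file or any iterable of strings.
    The meaning of the two numeric columns is not assumed here.
    """
    bins = {}
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        try:
            bins[row[0]] = (int(row[1]), int(row[2]))
        except ValueError:
            continue  # skip a header row, if the file happens to have one
    return bins


def compare_bins(mine, reference):
    """Return (missing, extra, changed) between two load_bins() results."""
    missing = sorted(set(reference) - set(mine))  # only in the reference
    extra = sorted(set(mine) - set(reference))    # only in my output
    changed = {
        bin_id: (mine[bin_id], reference[bin_id])
        for bin_id in sorted(set(mine) & set(reference))
        if mine[bin_id] != reference[bin_id]
    }
    return missing, extra, changed
```

To use it, open the locally generated CSV and the one from the repo, pass each file handle to `load_bins`, and hand both results to `compare_bins`. The three return values list bins absent from the local output, bins present only in the local output, and bins whose values differ.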
I see 2 possible explanations:
1) CluBCpG does not interact properly with samtools in my installation. Does test_Module.py evaluate this interaction?
2) The SampleData and COVERAGE files on GitHub do not match?
Thanks in advance for your help, PE
To Reproduce
clubcpg-coverage -a /b/home/path/CluBCpG/SampleData/A_test.chr19.bam -o /b/home/path/tests/ --bin_size 100 -chr chr19 --read1_5 0 --read1_3 0 --read2_5 0 --read2_3 0