nmfad closed 4 months ago
Hi Numrah,
The steps look fine to me in general. Is there any information that cpelasm prints out? I think your point about the HET SNVs will only affect the alignment rate in step 3, but shouldn't have an effect otherwise...
Are you only passing a single sample at a time? That is, b1 and b2 should be BAM files with haplotype 1 and 2, respectively, for the same sample. It's a little suspicious that there is no prefix in the output file names...
Also, are you able to run the toy example just fine and see the output?
Thanks,
Hi Jordi,
Thanks so much for responding back. Answering your questions below.
Is there any information that cpelasm prints out? - Yes, it prints out information in the format below. Towards the end it looks something like this:
2024-05-27 19:46:51: No null statistics computed in 999-th attempt ...
2024-05-27 19:46:51: No null statistics computed in 1000-th attempt ...
2024-05-27 19:46:51: No null statistics computed in 1001-th attempt ...
2024-05-27 19:46:51: Exceeded 1000 attempts in comp_tnull ...
2024-05-27 19:46:51: Done with null statistics ...
2024-05-27 19:46:51: Computing p-values in heterozygous loci ...
2024-05-27 19:46:51: Computing p-values Tmml ...
2024-05-27 19:46:51: Computing p-values Tnme ...
2024-05-27 19:46:51: Computing p-values Tpdm ...
2024-05-27 19:46:51: Done with p-values ...
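For anyone hitting the same symptom, here is a minimal Python sketch for scanning a saved copy of the cpelasm output. It counts the failed null-statistics attempts and checks whether the run gave up. The two patterns are copied from the log lines quoted above. The function name `summarize_log` and the idea of saving the output to text are my own assumptions and not part of CpelAsm.

```python
import re

# Patterns copied from the log lines quoted above; all other messages are ignored.
NO_NULL = re.compile(r"No null statistics computed in (\d+)-th attempt")
EXCEEDED = re.compile(r"Exceeded (\d+) attempts in comp_tnull")


def summarize_log(text):
    """Summarize how the null-statistics step went (hypothetical helper)."""
    failed = [int(m.group(1)) for m in NO_NULL.finditer(text)]
    exceeded = EXCEEDED.search(text)
    return {
        "failed_attempts": len(failed),
        "last_failed_attempt": max(failed) if failed else None,
        "gave_up": exceeded is not None,
        "attempt_limit": int(exceeded.group(1)) if exceeded else None,
    }


# The tail of the log quoted in this thread.
sample = """\
2024-05-27 19:46:51: No null statistics computed in 999-th attempt ...
2024-05-27 19:46:51: No null statistics computed in 1000-th attempt ...
2024-05-27 19:46:51: No null statistics computed in 1001-th attempt ...
2024-05-27 19:46:51: Exceeded 1000 attempts in comp_tnull ...
2024-05-27 19:46:51: Done with null statistics ...
"""
summary = summarize_log(sample)
print(summary)
```

If `gave_up` is true, the null distribution was never built, which would be consistent with the p-value files not being written.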
Are you only passing a single sample at a time? That is, b1 and b2 should be BAM files with haplotype 1 and 2, respectively, for the same sample. It's a little suspicious that there is no prefix in the output file names...
Also, are you able to run the toy example just fine and see the output? - Yes, I was able to run the toy example successfully and see the outputs listed below. No issues there. I am seeing significant p-values generated in the *_pvals.bedgraph files as well.
example_het.cpelasm.gff
example_hom.cpelasm.gff
example_mml1.bedGraph
example_mml2.bedGraph
example_nme1.bedGraph
example_nme2.bedGraph
example_pdm.bedGraph
example_tmml_null.bedGraph
example_tmml_pvals.bedGraph
example_tnme_null.bedGraph
example_tnme_pvals.bedGraph
example_tpdm_null.bedGraph
example_tpdm_pvals.bedGraph
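The toy-example listing can double as a checklist for a real run. The Python sketch below reports which of those files are missing or empty for a given output directory and prefix. The suffixes come from the listing above. Treating them as the expected set for every run is an assumption, and `check_outputs` is a hypothetical helper that is not part of CpelAsm. The demo writes made-up placeholder content into a temporary directory.

```python
from pathlib import Path
import tempfile

# Suffixes taken from the toy-example listing in this thread (assumed to be
# the expected outputs of any run; this is not documented behaviour).
EXPECTED_SUFFIXES = [
    "_het.cpelasm.gff", "_hom.cpelasm.gff",
    "_mml1.bedGraph", "_mml2.bedGraph",
    "_nme1.bedGraph", "_nme2.bedGraph", "_pdm.bedGraph",
    "_tmml_null.bedGraph", "_tmml_pvals.bedGraph",
    "_tnme_null.bedGraph", "_tnme_pvals.bedGraph",
    "_tpdm_null.bedGraph", "_tpdm_pvals.bedGraph",
]


def check_outputs(out_dir, prefix):
    """Classify each expected output file as 'missing', 'empty' or 'ok'."""
    status = {}
    for suffix in EXPECTED_SUFFIXES:
        path = Path(out_dir) / f"{prefix}{suffix}"
        if not path.exists():
            status[path.name] = "missing"
        elif path.stat().st_size == 0:
            status[path.name] = "empty"
        else:
            status[path.name] = "ok"
    return status


# Demo on a throwaway directory with placeholder (made-up) content.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example_pdm.bedGraph").write_text("chrX\t0\t10\t0.5\n")
    (Path(tmp) / "example_tpdm_null.bedGraph").write_text("")
    demo = check_outputs(tmp, "example")

for name, state in demo.items():
    if state != "ok":
        print(f"{state:8s}{name}")
```

Pointing `check_outputs` at the real output directory with the sample's prefix gives a quick view of which files were never written and which were written but left empty.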
One of the samples I am testing CPEL on is actually validated to be skewed for XCI, so I would expect to see significant p-values for MML and PDM at least. But the *_pvals.bedgraph files don't get generated and the following 3 files are empty
Hi Jordi,
A follow-up question for you. I am looking at the toy example directory within CPEL. Inside the fasta folder, .../CpelAsm/uD5CJ/test/fasta, there are 3 directories:
1) n-masked - which I understand is the N-masked reference generated using the SNPsplit_genome_preparation steps.
2) allele1 - There is an allele1 fasta with the bisulphite genome from Bismark
3) allele2 - There is an allele2 fasta with the bisulphite genome from Bismark
Can you please explain at what step you are aligning allele1 and allele2 reads independently with Bismark? I was under the impression that the Bismark step is only run using the N-masked reference to align all WGBS reads within a sample, and that SNPsplit is then used to split the resulting Bismark-generated BAM into allele1.bam and allele2.bam. Are you aligning the allele1.bam and allele2.bam generated by SNPsplit again using Bismark? Can you please explain how the files under the following directories are generated:
.../CpelAsm/uD5CJ/test/fasta/allele1
.../CpelAsm/uD5CJ/test/fasta/allele2
.../CpelAsm/uD5CJ/test/bam/example.a1.bam
.../CpelAsm/uD5CJ/test/bam/example.a2.bam
Maybe I am not seeing significant p-values because there is something I am doing incorrectly in the process, or I may have misunderstood the methods section of your paper. If you can explain the above, that would be very helpful!
Hi Jordi,
Apologies for bothering you. I was able to figure out the following 2 things, and that seems to have partially solved some of my issues (no p-value files getting generated and empty test-statistic bedGraph files). Explaining them below in case someone else here needs it.
1) The run_analysis(b1,b2,b1,vcf,fa,out;win_exp=10,n_null=1000,cov_ths=5) step mentioned in the readme instructions is missing the unassigned_bam argument and a few other arguments, which I was able to find on the "Natcomms" branch. The call that worked and was eventually able to generate p-values is as follows:
run_analysis(bam1, bam2, bam_unassigned, vcf, fa, out; g_max=g_max, cov_ths=cov_ths, win_exp=win_exp, trim=trim, n_null=1000, n_max=25)
I set g_max = 500, cov_ths = 8, win_exp = 100 and trim = (5,5,5,5). I wanted to let you know so that anyone else using the instructions on the main page doesn't run into the same issues; perhaps the instructions on the main page can be updated to include unassigned_bam.
2) I noticed that when running the script outside of the Julia compiler on the shell command line as a julia
Hi Jordi,
I am testing samples that have been validated to show skewed XCI by clinical X-skew testing, as well as monoallelic expression by DNA & RNA sequencing.
I am now testing the same samples for allele-specific methylation on the X chromosome using CPEL. Although I was able to get CPEL to run smoothly with no errors, I am not seeing any significant p-values in the following files