mquinodo / OFF-PEAK

CNV detection tool for WES data

Applying OFF-PEAK to targeted methylation sequencing data #6

Open radinagg opened 2 months ago

radinagg commented 2 months ago

Hi there,

I would like to apply OFF-PEAK to some targeted EM-seq data (obtained using the Twist Methylome Panel). However, since the tool was designed specifically for exome sequencing data, all target regions in my dataset that fall outside of annotated exons get filtered out. Are there any specific modifications to the scripts that you would recommend that would allow me to skip this exon filtering step?

Thanks in advance!

mquinodo commented 2 months ago

Dear Radina,

Normally, since you provide the BED file with the covered regions, they should not be filtered out. Did you use the --targetsBED option with the regions targeted by your methylome panel?

Best, Mathieu

radinagg commented 2 months ago

Hi Mathieu,

Thanks for your reply! Here's how I ran the first two steps of OFF-PEAK:

1) Processing of target intervals (Twist Methylome panel)

bash 01_targets-processing.sh --genome hg38 --targets /home/RefFiles/covered_targets_Twist_Methylome_hg38.sorted.bed --name twist-methylome-panel --ref /home/RefFiles/hg38_noAltHla_UCSC.fa

I think this is the point at which non-exonic target regions get filtered out. You can see this in the screenshot below which shows my original target regions (covered_targets_Twist_Methylome_hg38) in comparison to the output of 01_targets-processing.sh (twist-methylome-panel).

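[Screenshot: original target regions (covered_targets_Twist_Methylome_hg38) compared with the output of 01_targets-processing.sh (twist-methylome-panel)]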
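A comparison like the one in the screenshot can also be quantified. The following is a minimal sketch, not part of OFF-PEAK, that lists the original targets with no overlap at all in a second BED file. It assumes both files are plain BED with 0-based, half-open coordinates; the function names and the toy intervals are hypothetical and only illustrate the idea.

```python
from bisect import bisect_right
from collections import defaultdict


def read_bed(lines):
    """Parse BED lines into {chrom: sorted list of (start, end)}; extra columns are ignored."""
    by_chrom = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))
    for intervals in by_chrom.values():
        intervals.sort()
    return by_chrom


def merge(intervals):
    """Merge sorted half-open intervals that overlap or touch."""
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def targets_without_overlap(original, processed):
    """Return (chrom, start, end) for original targets that overlap no processed interval."""
    missing = []
    for chrom, targets in original.items():
        merged = merge(processed.get(chrom, []))
        starts = [s for s, _ in merged]
        for start, end in targets:
            # Index of the last merged interval that starts before this target ends.
            i = bisect_right(starts, end - 1) - 1
            if i < 0 or merged[i][1] <= start:
                missing.append((chrom, start, end))
    return missing


# Toy data standing in for the original panel BED and the processed BED.
original = read_bed(["chr1\t100\t200", "chr1\t500\t600", "chr2\t10\t20"])
processed = read_bed(["chr1\t150\t250", "chr2\t20\t30"])
missing = targets_without_overlap(original, processed)
print(missing)  # [('chr1', 500, 600), ('chr2', 10, 20)]
```

With real files, `read_bed(open(path))` can replace the toy lists. If bedtools is available, `bedtools intersect -v -a original.bed -b processed.bed` reports the same kind of non-overlapping entries.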

2) Estimation of coverage for on- and off-target regions

bash 02_coverage-count.sh --listBAM bam_list.txt --mosdepth mosdepth --work outputs_dir --targetsBED data/twist-methylome-panel.bed

mquinodo commented 2 months ago

Hi Radina,

Thank you for the details. The fact that these targets are merged into an "off-target region" does not mean that they are discarded. The off-target region will still be used to investigate whether any CNVs affect it.

To analyze your data the way you want, I guess each target in your BED file would need to be treated as an on-target region by OFF-PEAK. This is unfortunately not possible at the moment and would require more development.

I am curious to understand why you would use methylation data to find genomic deletions or duplications. If this turns out to be a common use case, I could consider adapting OFF-PEAK for such data.

Best, Mathieu

radinagg commented 2 months ago

Hi Mathieu,

Thanks for your reply. The reason why it's a bit problematic to have these target regions merged with the off-target ones is that this means we can't obtain a realistic estimate of the coverage within our original target / off-target regions. Let me know if you have any alternative suggestions on how to do that. I understand that the downstream analysis and the CNV detection would still work, though.
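One possible workaround, sketched here outside of OFF-PEAK, is to compute mean depth for each original target directly from per-base depth runs. The sketch assumes depth is available as (chrom, start, end, depth) runs, the column layout of mosdepth's per-base BED output, and it counts uncovered bases as depth 0. The function name and the toy values are hypothetical, and the nested scan is deliberately naive, so a real panel would need an indexed lookup.

```python
def mean_depth_per_target(targets, depth_runs):
    """Mean depth for each (chrom, start, end) target.

    depth_runs holds (chrom, start, end, depth) tuples with half-open coordinates.
    Bases not covered by any run contribute depth 0.
    """
    runs_by_chrom = {}
    for chrom, start, end, depth in depth_runs:
        runs_by_chrom.setdefault(chrom, []).append((start, end, depth))

    result = {}
    for chrom, t_start, t_end in targets:
        total = 0
        for r_start, r_end, depth in runs_by_chrom.get(chrom, []):
            # Number of bases shared by the target and this depth run.
            overlap = min(t_end, r_end) - max(t_start, r_start)
            if overlap > 0:
                total += overlap * depth
        length = t_end - t_start
        result[(chrom, t_start, t_end)] = total / length if length > 0 else 0.0
    return result


# Toy data: two targets and three depth runs on chr1.
targets = [("chr1", 100, 200), ("chr1", 500, 600)]
depth_runs = [("chr1", 0, 150, 10), ("chr1", 150, 300, 30), ("chr1", 300, 1000, 2)]
means = mean_depth_per_target(targets, depth_runs)
print(means)  # {('chr1', 100, 200): 20.0, ('chr1', 500, 600): 2.0}
```

mosdepth itself can also report per-region means when run with its `--by` option and the original panel BED, which may be simpler than post-processing per-base output.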

We are using methylation data for practical reasons: it carries different layers of information, including DNA methylation and structural variant information. There's recent evidence supporting the reliability of CNV calls obtained from EM-seq; see this study as an example: "EM-sequencing is suitable to detect methylation, SNVs and CNVs from single sequencing with low-input DNA".

Best, Radina

mquinodo commented 2 months ago

Hi Radina, I contacted you by email so that I can send you test scripts for that. Best, Mathieu