nioo-knaw / epiGBS2

This is the epiGBS2 snakemake pipeline as published in a preprint version.
MIT License

Looking for coverage data #16

Open Hypecoum opened 2 years ago

Hypecoum commented 2 years ago

Dear developers,

Thanks for a great software package to analyse epiGBS data efficiently.

I have recently been running the pipeline on one of my datasets and would like to calculate the read coverage for each assembled fragment, so that I can filter the assembled loci downstream. Could you please indicate where in the output I can find this information?

My first guess was the "alignment" directory; however, I could not find any documentation on the contents of this output directory. Could you please also explain to me what is in the .bam files in this directory?

I have been running the pipeline in reference mode.

Many thanks, Yannick Woudstra

MaartenPostuma commented 2 years ago

Hi Yannick,

The easiest way would be to look at the .vcf file that is output after SNP calling, or to load the methylation calling data with an R package such as methylKit. Both give you the coverage for each SNP / methylation site.
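As a rough illustration of the .vcf route, here is a minimal Python sketch that pulls the per-site depth out of VCF records. It assumes the records carry the standard `DP` key in the INFO column, which I have not checked against the epiGBS2 output; the function name `site_depths` and the example records are made up for illustration.

```python
def site_depths(vcf_lines):
    """Yield (chrom, pos, depth) for every VCF record that has an INFO/DP value.

    Assumption: depth is stored under the reserved INFO key "DP".
    Records without it are skipped rather than guessed at.
    """
    for line in vcf_lines:
        # Skip meta-information, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        for entry in info.split(";"):
            key, _, value = entry.partition("=")
            if key == "DP" and value.isdigit():
                yield chrom, pos, int(value)
                break


if __name__ == "__main__":
    # Hypothetical records, only to show the call pattern.
    example = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "chr1\t101\t.\tC\tT\t50\tPASS\tDP=23;AF=0.5\n",
        "chr1\t250\t.\tG\tA\t40\tPASS\tAF=0.1\n",
    ]
    for chrom, pos, depth in site_depths(example):
        print(chrom, pos, depth)
```

For a real file you would pass an open file handle instead of the list, for example `site_depths(open("YOUR_FILE.vcf"))`.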

The .bam files hold the read alignments in the binary BAM format, which is relatively complicated (see https://samtools.github.io/hts-specs/SAMv1.pdf for the format specification). However, the program samtools can be used to extract all sorts of information from these files.

Furthermore, in reference mode fragments do not get assembled. Instead, the reads are mapped directly onto the reference genome, so the pipeline will only output the coverage at each position on the reference genome.
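To turn per-position coverage into something closer to "covered fragments", one option is to merge adjacent covered positions into intervals. The sketch below assumes the three-column, tab-separated text that `samtools depth` prints for a single BAM file (reference name, 1-based position, depth); the function names and the `min_depth` threshold are hypothetical choices, not part of the pipeline.

```python
def _close(current):
    """Convert the running [chrom, start, end, depth_sum] into a result tuple."""
    chrom, start, end, depth_sum = current
    return (chrom, start, end, depth_sum / (end - start + 1))


def covered_intervals(depth_lines, min_depth=1):
    """Merge adjacent positions with depth >= min_depth into intervals.

    Returns a list of (chrom, start, end, mean_depth) tuples.
    Assumption: input lines look like "chrom<TAB>pos<TAB>depth" and are
    sorted by reference name and position.
    """
    intervals = []
    current = None  # [chrom, start, end, depth_sum]
    for line in depth_lines:
        if not line.strip():
            continue
        chrom, pos, depth = line.rstrip("\n").split("\t")[:3]
        pos, depth = int(pos), int(depth)
        if depth < min_depth:
            continue
        if current and current[0] == chrom and pos == current[2] + 1:
            # Position directly extends the open interval.
            current[2] = pos
            current[3] += depth
        else:
            if current:
                intervals.append(_close(current))
            current = [chrom, pos, pos, depth]
    if current:
        intervals.append(_close(current))
    return intervals
```

Each resulting tuple can then be filtered on its mean depth or its length, whichever suits the downstream analysis.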

Hope this helps, Maarten

Hypecoum commented 2 years ago

Dear Maarten,

Thanks so much for your helpful answer. I will certainly check the methylation calling data in R as you suggested.

The output you mentioned in your last comment about reference mode is exactly the data that I require: the read coverage for each part of the genome that is covered by the epiGBS experiment. Could you please tell me where I can find this information?

Many thanks again, Yannick

MaartenPostuma commented 2 years ago

Hi Yannick,

You can calculate it with samtools:

    samtools coverage YOUR_BAM_FILE.bam

Greetings, Maarten
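For downstream filtering, the table printed by `samtools coverage` can be read into Python. This is only a sketch: it assumes samtools is on the PATH and that the default tabular output is used, with a header line starting with `#rname` and columns such as `numreads`, `covbases`, `coverage` and `meandepth`. The helper names and the depth threshold of 10 are hypothetical.

```python
import csv
import io
import subprocess


def run_samtools_coverage(bam_path):
    """Run `samtools coverage` on one BAM file and return its text output."""
    result = subprocess.run(
        ["samtools", "coverage", bam_path],
        capture_output=True,
        text=True,
        check=True,  # raise if samtools exits with an error
    )
    return result.stdout


def parse_coverage(table_text):
    """Parse the tab-separated coverage table into a list of dicts."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            "rname": row["#rname"],
            "numreads": int(row["numreads"]),
            "covbases": int(row["covbases"]),
            "coverage": float(row["coverage"]),    # percent of bases covered
            "meandepth": float(row["meandepth"]),  # mean depth over the sequence
        })
    return rows


def well_covered(rows, min_mean_depth=10.0):
    """Return the reference names whose mean depth reaches the threshold."""
    return [r["rname"] for r in rows if r["meandepth"] >= min_mean_depth]


if __name__ == "__main__":
    table = run_samtools_coverage("YOUR_BAM_FILE.bam")
    print(well_covered(parse_coverage(table)))
```

Note that this reports one row per reference sequence, so for a whole-genome reference the values are averages over each chromosome or scaffold.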