Illumina / ExpansionHunter

A tool for estimating repeat sizes

Filtering reads by number of spanning reads #142

Open dalmiaa opened 3 years ago

dalmiaa commented 3 years ago

Hi there! I was wondering whether EH can filter its output by the number of spanning reads that support the reported repeat count. For example, I am only looking at intermediate repeat sizes (under 40) of a XXX repeat element, which means a single read can span the full RE. Does EH have a feature for requiring a minimum number of spanning reads before a result is reported?

egor-dolzhenko commented 3 years ago

Sorry for the late reply! Yes, the number of spanning reads is reported in the VCF file. So you could filter the VCF file with a tool like awk or a Python script. Please feel free to send me an email if I can assist with this.
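
For anyone landing here later, below is a minimal Python sketch of that kind of filter. It is not part of EH. It assumes the per-allele spanning-read counts are stored in a FORMAT field named `ADSP` as slash-separated integers (for example `22/22`), so please check the `##FORMAT` header lines of your own VCF before relying on it. The function names and the threshold of 5 used in the usage note are hypothetical and only for illustration.

```python
def min_spanning_reads(format_keys, sample_values, key="ADSP"):
    """Return the smallest per-allele spanning-read count, or None if unavailable.

    Assumption: `key` names a FORMAT field holding slash-separated integers,
    one per allele (e.g. "22/22"). Verify this against your VCF header.
    """
    fields = dict(zip(format_keys.split(":"), sample_values.split(":")))
    raw = fields.get(key)
    if raw is None:
        return None
    # Skip missing values written as "." or left empty.
    counts = [int(c) for c in raw.split("/") if c not in ("", ".")]
    return min(counts) if counts else None


def filter_vcf_lines(lines, min_reads, key="ADSP"):
    """Yield header lines unchanged, plus records whose first sample passes.

    A record passes only if every allele has at least `min_reads`
    spanning reads; records without the field are dropped.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            # No FORMAT/sample columns, so there is nothing to filter on.
            continue
        support = min_spanning_reads(cols[8], cols[9], key)
        if support is not None and support >= min_reads:
            yield line
```

To use it, open the VCF as a text file, pass the file object to `filter_vcf_lines` together with a threshold such as 5, and write the yielded lines to a new file. Taking the minimum across alleles is a deliberately strict choice; if one well-supported allele is enough for your purposes, `max` could be swapped in instead.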

Also note that you can use our new tool REViewer to visualize reads overlapping repeats of interest.