vpc-ccg / sedef

Identification of segmental duplications in the genome
MIT License

any pipeline to filter out my SD outcome #14

Closed jxlabWzZ closed 4 years ago

jxlabWzZ commented 4 years ago

Hi, this is a nice tool for calling SDs, but I cannot find a quality-control pipeline in the README. I called SDs on a reference genome (Sus scrofa) using the SEDEF pipeline with default parameters and got nearly 9 million SD regions in final.bed. Can these calls be trusted? If not, could you suggest a set of parameter thresholds for further filtering? Many thanks.

inumanag commented 4 years ago

Hi @jxlabWzZ

Is your genome repeat masked? Too many calls usually means that the genome is not repeat masked, and SEDEF ends up reporting low complexity repeats as SDs.
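
One quick way to check this is to count how many bases in the FASTA are lowercase. This is only a minimal sketch: it assumes the genome uses the common soft-masking convention (repeats written in lowercase), and `masked_fraction` is a hypothetical helper, not part of SEDEF.

```python
import io


def masked_fraction(fasta_handle):
    """Return (lowercase_bases, total_bases) over all sequence lines.

    Assumes soft-masking: repeat-masked bases are lowercase.
    Header lines (starting with '>') are skipped.
    """
    lower = 0
    total = 0
    for line in fasta_handle:
        if line.startswith(">"):
            continue
        seq = line.strip()
        total += len(seq)
        lower += sum(1 for base in seq if base.islower())
    return lower, total


# Tiny in-memory FASTA for illustration; for a real genome, pass an
# open file handle instead (e.g. open("genome.fa"), a hypothetical path).
example = io.StringIO(">chr1\nACGTacgtNN\n>chr2\nacgtACGTAC\n")
lower, total = masked_fraction(example)
print(f"{lower}/{total} bases soft-masked ({lower / total:.1%})")
```

If the lowercase fraction comes out at or near zero, the genome is probably not soft-masked, which would be consistent with the very large number of calls.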

inumanag commented 4 years ago

Closing due to inactivity; if you are still experiencing these problems, please let me know.