TobyBaril / EarlGrey

Earl Grey: A fully automated TE curation and annotation pipeline

Multiple sample consensus library/polymorphic insertion annotation #24

Closed: swomics closed this issue 1 year ago

swomics commented 2 years ago

Hi,

One of the features I found useful in EDTA was the script for combining consensus repeat libraries across multi-sample datasets. Is there a way to combine Earl Grey results from assemblies of different individuals or, better yet, to use a pangenome graph or VCF file as the input?

I produced a fairly simple wrapper script for annotating a VCF with a multi-sample consensus repeat library: https://github.com/swomics/VCF_TE_annotate. It just uses RepeatMasker to annotate the inserted sequences in a "pangenome" VCF (i.e. variants called from assembly vs assembly alignments). Maybe this could be tweaked to become a module?
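
To illustrate the idea, here is a minimal sketch of the first step, pulling inserted sequences out of a VCF as FASTA so they can be masked against a repeat library. It is not the code from VCF_TE_annotate: the function names, the `min_len` threshold and the FASTA naming scheme are all assumptions made for the example, and it only handles insertions whose ALT allele is written out as sequence (symbolic alleles such as `<INS>` are skipped).

```python
from typing import Iterable, Iterator, Tuple


def insertion_records(
    vcf_lines: Iterable[str], min_len: int = 50
) -> Iterator[Tuple[str, str]]:
    """Yield (fasta_id, inserted_sequence) for explicit insertion alleles.

    Hypothetical helper: min_len and the ID format are illustrative choices.
    """
    for line in vcf_lines:
        # Skip blank lines and VCF header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, _id, ref, alts = fields[:5]
        for i, alt in enumerate(alts.split(","), start=1):
            # Skip symbolic (<INS>), breakend, spanning (*) and missing (.) alleles.
            if not alt or set(alt.upper()) - set("ACGTN"):
                continue
            # Keep only alleles that add at least min_len bases.
            if len(alt) - len(ref) < min_len:
                continue
            # Drop the leading bases shared with REF (the anchor base).
            shared = 0
            while shared < len(ref) and alt[shared].upper() == ref[shared].upper():
                shared += 1
            yield f"{chrom}_{pos}_alt{i}", alt[shared:]


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format (id, sequence) pairs as FASTA text with wrapped sequence lines."""
    out = []
    for name, seq in records:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + ("\n" if out else "")


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tACGTACGTAC\t.\t.\t.",  # 9 bp insertion
        "chr1\t200\t.\tA\tG\t.\t.\t.",           # SNP, ignored
        "chr2\t300\t.\tT\t<INS>\t.\t.\t.",       # symbolic allele, ignored
    ]
    print(to_fasta(insertion_records(example, min_len=5)), end="")
```

The resulting FASTA could then be given to RepeatMasker, with the combined consensus library supplied through its `-lib` option, and the hits mapped back to the VCF records via the sequence IDs.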

Cheers, Sam

TobyBaril commented 1 year ago

This is outside the current scope, as Earl Grey's main role is to improve discovery and curation of novel TE families. Archiving this issue and adding it to the potential queue for future development.