Closed GRGong closed 1 year ago
Hi, I answered this question recently on a Slack channel (perhaps it was you?). Here is what I wrote:
I recommend using a serial approach. For instance, run a de novo tool on the first species, mask
the second species with the resulting library, run de novo on the masked second species, combine
the libraries, and iterate over the remaining species. This would avoid the need to employ a
clustering method, certainly avoid biasing the seed alignments with identical copies, and provide
insight into species specificity for each family.
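The serial approach above could be sketched roughly as follows. This is only an illustration that builds the command lines without running them. It assumes RepeatModeler (with `BuildDatabase`) as the de novo tool and RepeatMasker for the masking step; the combined library name `combined-families.fa`, the function name, and the input file names are hypothetical. It also assumes RepeatModeler writes `<db>-families.fa` to the working directory and RepeatMasker writes `<genome>.masked` next to its input, so check both against your installed versions.

```python
import shlex
from pathlib import Path


def plan_serial_library(genomes, threads=4, combined="combined-families.fa"):
    """Build (but do not run) the commands for a serial de novo TE library.

    `combined` is a hypothetical file name for the growing library.
    """
    commands = []
    for index, genome in enumerate(genomes):
        db_name = Path(genome).stem
        target = genome
        if index > 0:
            # Mask this species with every family found so far, so the de novo
            # step only sees sequence not already explained by the library.
            commands.append(
                ["RepeatMasker", "-lib", combined, "-pa", str(threads), genome]
            )
            # Assumed output name; RepeatMasker writes no .masked file
            # when nothing was masked.
            target = genome + ".masked"
        commands.append(["BuildDatabase", "-name", db_name, target])
        # Older RepeatModeler releases use -pa instead of -threads.
        commands.append(
            ["RepeatModeler", "-database", db_name, "-threads", str(threads)]
        )
        # Append the newly discovered families to the combined library.
        commands.append(
            ["sh", "-c", f"cat {db_name}-families.fa >> {combined}"]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical input files, one per species.
    for cmd in plan_serial_library(["speciesA.fa", "speciesB.fa", "speciesC.fa"]):
        print(shlex.join(cmd))
```

Because each species contributes only the families that the earlier libraries did not mask, the per-species `*-families.fa` files also give a first indication of which families are species specific.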
What do you want to know?
I currently have genomes of multiple closely related fish species. I want to create one non-redundant TE library for unified annotation of these genomes, like a "pangenome-lib". Should I directly merge the genome FASTA files of these species into a single file and process it with RepeatModeler, or is there another method I should use? Looking forward to your reply.
Helpful context
RepeatClassifier
program. Yes.