Dfam-consortium / RepeatMasker

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.

identify and mask repeats in a non-reference genome #267

Closed Anees-caas closed 2 months ago

Anees-caas commented 3 months ago

Hi, I want to identify and mask repeats in a non-reference genome, Prunus persica. First, the dfam38_full.0.h5.gz file is very large and takes several minutes to download. Second, I can't configure RepeatMasker without the Dfam libraries. Since my genome is not available in Dfam, can I skip this step? Third, I also want to identify novel repeats. How can I retain the repeats that do not match my reference genome instead of discarding them?

rmhubley commented 2 months ago

In the next release of RepeatMasker I will be distributing a minimal database that doesn't include any TE families. This will allow users to use RepeatMasker with custom libraries without having to download Dfam right away.
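
For readers wondering what a custom-library run looks like, here is a minimal sketch in Python that only assembles the command line. It assumes RepeatMasker is already configured and on the PATH. The file names (`prunus_persica.fa`, `custom_repeats.fa`), the output directory, and the helper `build_repeatmasker_cmd` are hypothetical. The flags used (`-lib`, `-pa`, `-xsmall`, `-gff`, `-dir`) are standard RepeatMasker options, but whether configuration succeeds without the full Dfam download depends on the release described above.

```python
import shlex


def build_repeatmasker_cmd(genome, custom_lib, out_dir="rm_out", threads=4):
    """Assemble an argv list for a RepeatMasker run against a custom library.

    All paths are placeholders chosen for illustration, not taken from the thread.
    """
    return [
        "RepeatMasker",
        "-lib", str(custom_lib),   # use the user-supplied FASTA library of repeats
        "-pa", str(threads),       # number of parallel search jobs
        "-xsmall",                 # soft-mask: repeats in lowercase rather than N
        "-gff",                    # also write annotations as GFF
        "-dir", str(out_dir),      # directory for output files
        str(genome),               # input genome FASTA goes last
    ]


if __name__ == "__main__":
    cmd = build_repeatmasker_cmd("prunus_persica.fa", "custom_repeats.fa")
    # Print the command for inspection; pass `cmd` to subprocess.run(cmd, check=True)
    # to actually launch it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces. The custom library itself would typically come from a de novo repeat-discovery step run on the genome, which is also how repeats absent from existing databases would be captured.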