Open zuodabin opened 1 year ago
Yes, the input genomes must be masked with RepeatMasker. This is annoying, and something we hope to provide better automation for next year.
If there's no library for your species, then RepeatModeler would make sense. WindowMasker may also help. Which species are you working on?
Thanks for your timely reply. My species needs a repeat library built with RepeatModeler, which is not difficult. However, after running RepeatMasker, the repeat bases (A/T/C/G) in the generated masked.fa are replaced by N. Will that have any impact? What do you recommend?
The sequence must be softmasked (i.e., set to lowercase), not replaced by N.
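To check which kind of masking a file has, a minimal sketch like the following may help. It is not part of Cactus; the function name `masking_summary` and the example sequence are made up for illustration. It counts uppercase bases, lowercase (softmasked) bases, and N bases (hardmasked or gaps).

```python
def masking_summary(fasta_text):
    """Count uppercase, lowercase (softmasked) and N/n bases in FASTA text."""
    counts = {"upper": 0, "soft": 0, "n": 0}
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # skip header lines
        for base in line.strip():
            if base in "Nn":
                counts["n"] += 1      # hardmasked or gap
            elif base.islower():
                counts["soft"] += 1   # softmasked
            else:
                counts["upper"] += 1  # unmasked
    return counts


# Hypothetical example: 4 unmasked, 4 softmasked, 4 hardmasked bases
example = ">chr1\nACGTacgtNNNN\n"
print(masking_summary(example))
```

A file where the repeats show up under "n" rather than "soft" has been hardmasked, and the original bases cannot be recovered from it, so the softmasking has to be redone from the unmasked fasta.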
Cactus comes with a tool to softmask fastas using BED regions: cactus_fasta_softmask_intervals.py
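For illustration only, this is roughly what softmasking from BED regions means. It is a sketch, not the code of cactus_fasta_softmask_intervals.py; the function name `softmask` and the example data are hypothetical. It assumes standard BED coordinates (0-based start, exclusive end).

```python
def softmask(seqs, bed_lines):
    """Lowercase BED intervals (0-based, half-open) in a dict of name -> sequence."""
    masked = {name: list(seq) for name, seq in seqs.items()}
    for line in bed_lines:
        # skip blank lines and common BED header lines
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        if chrom not in masked:
            continue  # interval on a sequence we do not have
        bases = masked[chrom]
        for i in range(int(start), min(int(end), len(bases))):
            bases[i] = bases[i].lower()
    return {name: "".join(bases) for name, bases in masked.items()}


# Hypothetical example: softmask positions 2..5 of chr1
print(softmask({"chr1": "ACGTACGTAC"}, ["chr1\t2\t6"]))
```

The bases stay in place and only change case, which is the difference from replacing them with N.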
You can also use the ".out" file from RepeatMasker to softmask a 2bit sequence file with twoBitMask -type=.out.
You can convert between fasta and 2bit with faToTwoBit (https://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/faToTwoBit) and twoBitToFa (https://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/twoBitToFa).
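Put together, the 2bit route could look like the sketch below. The file names are placeholders, and the argument order is my reading of each tool's usage message (run a tool with no arguments to confirm). The helper `softmask_commands` is hypothetical and only builds the command lines; it does not run anything.

```python
import shlex


def softmask_commands(fasta, rm_out, masked_fasta):
    """Build the three UCSC commands: fasta -> 2bit -> masked 2bit -> fasta."""
    two_bit = fasta + ".2bit"
    masked_two_bit = fasta + ".masked.2bit"
    return [
        # convert the unmasked fasta to 2bit
        ["faToTwoBit", fasta, two_bit],
        # softmask using the RepeatMasker .out file
        ["twoBitMask", "-type=.out", two_bit, rm_out, masked_two_bit],
        # convert back to a softmasked fasta
        ["twoBitToFa", masked_two_bit, masked_fasta],
    ]


# Placeholder file names, assumed for this example
for cmd in softmask_commands("genome.fa", "genome.fa.out", "genome.softmasked.fa"):
    print(shlex.join(cmd))
```

To actually run the steps, each list can be passed to `subprocess.run(cmd, check=True)` once the three binaries are on your PATH.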
Thanks a lot!!
Hi, thanks for your software. I want to align 3 genomes, but they have not been masked with RepeatMasker. Do I need to mask the genomes first? Can I use RepeatModeler and RepeatMasker to mask them? And I end up using the .fa.masked files, right?