Closed chelisa closed 1 year ago
You cannot provide a *.h5 file to RepeatMasker using the "-lib" option. Currently "-lib" only supports FASTA- and HMM-formatted files. Are you simply trying to run RepeatMasker with the default distributed set of libraries? I am not sure which organism "pelo_genomic" represents, but the typical command line (using the distributed libraries) would look something like this:
./RepeatMasker -species "my species" pelo_genomic.fa
If you have a custom library of repeats (in FASTA or HMM format), then you would use "-lib" instead of "-species":
./RepeatMasker -lib mylib.fa pelo_genomic.fa
Be careful with "-nolow": it doesn't simply exclude simple repeats from the output, it skips the searches for tandem/simple repeats entirely. When tandem/simple repeats are not searched in a competitive fashion with the TE sequences, false-positive matches to TE sequences increase greatly. For example, a poly-A tandem sequence might be incorrectly labeled as a LINE or SINE sequence in primates. For users who simply don't want to see simple/tandem annotations in the output, I recommend just filtering them out afterwards (e.g., grep -v "Simple_repeat" results.fa.out > filtered_results.out).
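As a rough sketch of that post-filtering step, the snippet below builds a tiny stand-in for a RepeatMasker .out file and drops the simple-repeat rows by matching on the class/family column rather than anywhere in the line. The file names (sample.fa.out, filtered.fa.out) and the three annotation rows are invented for illustration, and the snippet assumes the standard 15-column .out layout in which repeat class/family is the 11th whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical sample rows in the usual .out column order (values invented):
# score div del ins query begin end (left) strand repeat class/family begin end (left) ID
cat > sample.fa.out <<'EOF'
  239 10.0 0.0 0.0 chr1 100 150 (900) + (A)n Simple_repeat 1 51 (0) 1
 1200 12.5 1.0 0.5 chr1 300 600 (400) + L1PA4 LINE/L1 5800 6100 (50) 2
  850  9.8 0.3 0.0 chr1 650 940 (60) C AluY SINE/Alu (10) 301 12 3
EOF

# Keep every row whose class/family field (assumed to be column 11)
# is not exactly "Simple_repeat"; other rows pass through unchanged.
awk '$11 != "Simple_repeat"' sample.fa.out > filtered.fa.out

cat filtered.fa.out
```

Matching on the column avoids accidentally dropping a row that merely contains the text "Simple_repeat" elsewhere, at the cost of depending on the column position. The plain grep -v form is shorter and is usually fine for a quick look.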
Let me know if you still have any questions.
but, the repeatmasker directory has a Library subdirectory?