Closed: TobyBaril closed this issue 1 year ago
Where did you get the TRF binary you are using (e.g. downloaded, compiled, etc.)? Which version of TRF is it? Do you happen to know whether the sequence identifiers in this one assembly are particularly long (e.g. see this issue with TRF: https://github.com/rmhubley/RepeatMasker/issues/192)?
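For anyone wanting to check identifier lengths quickly, here is a minimal sketch. The function name `longest_identifier` is made up for illustration, and it assumes the identifier is the text between `>` and the first whitespace on a header line. It does not know what length TRF tolerates; it only reports the longest identifier so you can judge for yourself.

```python
def longest_identifier(fasta_path):
    """Return the longest sequence identifier found in a FASTA file.

    Assumption: the identifier is the text after '>' up to the first
    whitespace character on each header line.
    """
    longest = ""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                ident = parts[0] if parts else ""
                if len(ident) > len(longest):
                    longest = ident
    return longest
```

Calling `len(longest_identifier("assembly.fa"))` (with your own file name in place of the placeholder `assembly.fa`) gives the length to compare against the linked issue.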
It was trf409.linux64 downloaded from https://github.com/Benson-Genomics-Lab/TRF/releases
I think the linked issue will solve my problem, as this assembly has a long file name. The header names are all >ctg_n, with n running from 1 to 22. I'll change the name of the FASTA file, retest, and see if this solves the problem.
I can confirm that shortening the name of the FASTA file resulted in a successful run of RepeatMasker (incl. TRF) - Thanks for pointing me to that issue!
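If renaming the original file is inconvenient, one possible variation on this workaround is to point a short-named symlink at it. This is only a sketch: `short_alias` and the default alias `genome.fa` are hypothetical names, and it assumes RepeatMasker and TRF use the name they are given rather than the link target, which was not tested in this thread. If that assumption does not hold, copying or renaming the file (as done above) is the confirmed fix.

```python
import tempfile
from pathlib import Path


def short_alias(fasta_path, workdir=None, alias="genome.fa"):
    """Create a short-named symlink to fasta_path and return the link path.

    workdir and alias are illustrative defaults, not RepeatMasker settings.
    """
    source = Path(fasta_path).resolve()
    # Use the given directory, or fall back to a fresh temporary one.
    target_dir = Path(workdir) if workdir else Path(tempfile.mkdtemp(prefix="rm_"))
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / alias
    # Replace a stale link or file left over from a previous run.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(source)
    return link
```

RepeatMasker would then be run on the returned path instead of the long-named original.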
Describe the issue
When running RepeatMasker on a specific genome and library, TRF fails with the message:
Reproduction steps
Unfortunately, I cannot share the genome assembly, but on inspection it looks like a totally normal FASTA and it has been successfully analysed by other bioinformatics tools.
NOTE: This does not occur if I run TRF directly on the genome assembly, only when TRF is run on the batches of the genome generated by RepeatMasker.
Installation method: manual installation from repeatmasker.org
RepeatMasker version: 4.1.4
Additional context
This only occurs with this genome; I have successfully run RepeatMasker on all 21 other genomes in my set.