Dfam-consortium / RepeatMasker

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.

*** buffer overflow detected ***: trf terminated sh: line 1: 3142675 Aborted (core dumped) #202

Closed by TobyBaril 1 year ago

TobyBaril commented 1 year ago

Describe the issue

When running RepeatMasker on a specific genome and library, trf fails with the message:

*** buffer overflow detected ***: trf terminated
sh: line 1: 3142675 Aborted                 (core dumped)

Reproduction steps

Unfortunately, I cannot share the genome assembly, but on inspection it looks like a totally normal FASTA and it has been successfully analysed by other bioinformatics tools.

NOTE: This does not occur if I run TRF directly on the genome assembly, only when TRF is run on the batches of the genome generated by RepeatMasker.

RepeatMasker -lib customLibrary.fasta -cutoff 400 -norna -lcambig -s -a -dir . $genome
RepeatMasker -species eukarya -cutoff 400 -norna -lcambig -s -a -dir . $genome

Installation method: manual installation from repeatmasker.org

RepeatMasker version: 4.1.4

Additional context

This only occurs with this genome; I have successfully run RepeatMasker on all 21 other genomes in my set.

rmhubley commented 1 year ago

Where did you get the TRF binary you are using (e.g., downloaded or compiled yourself)? Which version of TRF is it? Do you happen to know whether the sequence identifiers in this one assembly are particularly long (e.g., see this issue with TRF: https://github.com/rmhubley/RepeatMasker/issues/192)?

TobyBaril commented 1 year ago

It was trf409.linux64 downloaded from https://github.com/Benson-Genomics-Lab/TRF/releases

I think the linked issue will solve my problem, as this assembly has a long file name. The sequence headers are all of the form >ctg_n, with n from 1 to 22, so they are short. I'll rename the FASTA file, retest, and see whether that solves the problem.
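
To compare the two candidate causes mentioned above (long file name versus long sequence identifiers), a quick check along the following lines may help. This is only a sketch: `fasta_name_lengths` is a hypothetical helper name, and the thread does not state what length actually triggers the TRF failure, so the numbers are only useful for comparing this assembly against ones that ran cleanly.

```shell
#!/bin/sh
# Hypothetical helper (not part of RepeatMasker or TRF): print the length of
# the FASTA file name and the length of the longest sequence header.
fasta_name_lengths() {
    fasta="$1"
    name=$(basename "$fasta")
    # Length of the file name itself, without the directory part
    echo "file_name_length ${#name}"
    # Longest header line, not counting the leading '>'
    awk '/^>/ { n = length($0) - 1; if (n > max) max = n }
         END  { print "longest_header_length", max + 0 }' "$fasta"
}
```

Calling `fasta_name_lengths "$genome"` on the failing assembly and on one of the assemblies that worked should show which of the two values stands out.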

TobyBaril commented 1 year ago

I can confirm that shortening the name of the FASTA file resulted in a successful run of RepeatMasker (incl. TRF) - Thanks for pointing me to that issue!
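
For anyone who would rather not rename the original assembly, a short-named symbolic link might serve the same purpose. This is a sketch under assumptions: `short_alias` and the default name `genome.fa` are hypothetical, and only renaming the file was confirmed to work in this thread, so whether RepeatMasker treats a symlink the same way is untested here.

```shell
#!/bin/sh
# Hypothetical helper: create a short-named symlink to a FASTA file in the
# current directory and print the link name.
short_alias() {
    src="$1"                      # original FASTA, possibly with a long name
    alias_name="${2:-genome.fa}"  # short name to use instead (assumed default)
    # Resolve the source directory to an absolute path so the link stays valid
    src_dir=$(cd "$(dirname "$src")" && pwd)
    ln -sf "$src_dir/$(basename "$src")" "$alias_name"
    echo "$alias_name"
}

# Example usage with the options from the original report, left commented out
# so the snippet runs without RepeatMasker installed:
# genome=$(short_alias "$genome" asm.fa)
# RepeatMasker -lib customLibrary.fasta -cutoff 400 -norna -lcambig -s -a -dir . "$genome"
```

If the symlink does not avoid the error, copying or renaming the file to the short name is the approach that was reported to work above.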