Could not open *.translation file for reading!

Describe the issue

When I use RepeatModeler for de novo repeat discovery, it reports that it could not open a *.translation file for reading. That file was generated in the BuildDatabase step.

I ran the same steps on the Arabidopsis thaliana genome (TAIR10.1 from NCBI) and got no issues.

The genome of the species I used is about 10 Gb, and I suspect that its size may be the problem.
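To put a number on the size, the total base count of the input FASTA can be computed directly and compared with the "Bases" line that RepeatModeler prints (13492429495 in the log output of this report). This is a minimal sketch; the function name `fasta_total_bases` is hypothetical, and it assumes a plain, uncompressed FASTA file.

```shell
# Sum the lengths of all sequence lines in a FASTA file.
# Header lines (starting with '>') are skipped; spaces, tabs and
# carriage returns are stripped before counting.
fasta_total_bases() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); total += length($0) }
         END   { printf "%.0f\n", total }' "$1"
}

# Example (not executed here):
#   fasta_total_bases sample.fa
```

Printing with `%.0f` rather than `%d` is a deliberate choice here, since some awk implementations truncate `%d` to 32 bits, which a multi-gigabase total would exceed.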

Reproduction steps

The commands I used for discovery are:

BuildDatabase -name lka sample.fa
nohup RepeatModeler --threads 30 -database lka &

The genome assembly I used is Larix kaempferi.
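Because the error names lka.translation without a path, one quick check is whether that file exists and is readable in the directory from which RepeatModeler was launched. The sketch below assumes BuildDatabase wrote its output into the current directory; the function name `check_translation_file` is hypothetical and not part of RepeatModeler.

```shell
# Report whether <name>.translation is present and readable.
# Arguments: database name (e.g. lka), directory (default: current directory).
check_translation_file() {
    db_name=$1
    dir=${2:-.}
    file="$dir/$db_name.translation"
    if [ -r "$file" ]; then
        echo "ok: $file is readable"
    elif [ -e "$file" ]; then
        echo "unreadable: $file exists but cannot be read by uid $(id -u)"
        return 1
    else
        echo "missing: $file not found"
        return 1
    fi
}

# Example (not executed here):
#   check_translation_file lka
```

Running it once right after BuildDatabase and once from the directory where RepeatModeler is started would show whether the two steps see the same files.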

Log output

RepeatModeler Version 2.0.5
===========================
Using output directory = /mnt/annot/repeatm/RM_40.ThuJul41128262024
Search Engine = rmblast 2.14.1+
Threads = 40
Dependencies: TRF 4.09, RECON 1.08, RepeatScout 1.0.6, RepeatMasker 4.1.6
LTR Structural Analysis: Enabled ( GenomeTools 1.6.4, LTR_Retriever v2.9.0,
                                   Ninja , MAFFT 7.471,
                                   CD-HIT 4.8.1 )
Random Number Seed: 1720092502
Database = lka .
  - Sequences = 4655
  - Bases = 13492429495
  - N50 = 15986365
  - Contig Histogram:
  Size(bp)                                                        Count
  -----------------------------------------------------------------------
  78119697-83699528 |                                                   [ 3 ]
  72539866-78119696 |                                                   [ 1 ]
  66960035-72539865 |                                                   [ 2 ]
  61380204-66960034 |                                                   [ 1 ]
  55800373-61380203 |                                                   [ 6 ]
  50220542-55800372 |                                                   [ 6 ]
  44640711-50220541 |                                                   [ 5 ]
  39060881-44640711 |                                                   [ 14 ]
  33481050-39060880 |                                                   [ 14 ]
  27901219-33481049 |                                                   [ 28 ]
  22321388-27901218 |                                                   [ 52 ]
  16741557-22321387 |*                                                  [ 99 ]
  11161726-16741556 |*                                                  [ 151 ]
  5581895-11161725  |***                                                [ 304 ]
  2065-5581895      |************************************************** [ 3969 ]

Storage Throughput = excellent ( 1483.92 MB/s )

Ready to start the sampling process.
INFO: The runtime of RepeatModeler heavily depends on the quality of the assembly
      and the repetitive content of the sequences.  It is not imperative
      that RepeatModeler completes all rounds in order to obtain useful
      results.  At the completion of each round, the files ( consensi.fa, and
      families.stk ) found in:
      /mnt/annot/repeatm/RM_40.ThuJul41128262024/ 
      will contain all results produced thus far. These files may be 
      manually copied and run through RepeatClassifier should the program
      be terminated early.

RepeatModeler Round # 1
========================
Searching for Repeats
 -- Sampling from the database...
   - Gathering up to 40000000 bp
   - Final Sample Size = 40007056 bp ( 40007056 non ambiguous )
   - Num Contigs Represented = 595
   - Sequence extraction : 00:00:03 (hh:mm:ss) Elapsed Time
 -- Running RepeatScout on the sequences...
   - RepeatScout: Running build_lmer_table ( l = 14 )..
   - RepeatScout: Running RepeatScout.. : 2119 raw families identified
   - RepeatScout: Running filtering stage.. 1982 families remaining
   - RepeatScout: 00:03:40 (hh:mm:ss) Elapsed Time
   - Large Satellite Filtering.. : 12 found in 00:00:08 (hh:mm:ss) Elapsed Time
   - Collecting repeat instances...: 00:02:08 (hh:mm:ss) Elapsed Time
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!
Could not open lka.translation file for reading!

Environment (please include as much of the following information as you can find out):

docker

How did you install RepeatModeler? e.g. manual installation from repeatmasker.org, bioconda, the Dfam TE Tools container, or as part of another bioinformatics tool?

I used the TETools Docker image of RepeatModeler, which is maintained by Dfam-consortium. I downloaded it with docker pull, using the latest tag.
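For anyone trying to reproduce this, the following is a sketch of how both steps could be run in the container with the same host directory mounted, so that files written by BuildDatabase are visible to RepeatModeler. The image name `dfam/tetools:latest`, the `/work` mount point, and the wrapper name `tetools_run` are assumptions and are not taken from this report; the exact docker invocation I used may differ.

```shell
# Run a command inside the TE Tools container with the current host
# directory bind-mounted at /work (assumed) and used as the working directory.
tetools_run() {
    docker run --rm \
        --mount type=bind,source="$PWD",target=/work \
        --workdir /work \
        dfam/tetools:latest "$@"
}

# Example (not executed here), using the commands from the reproduction steps:
#   tetools_run BuildDatabase -name lka sample.fa
#   tetools_run RepeatModeler --threads 30 -database lka
```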

Which version of RepeatModeler do you have? The output of RepeatModeler without any options will be a help page with the version of the program displayed at the top.

No database indicated

/opt/RepeatModeler/RepeatModeler - 2.0.5
NAME
    RepeatModeler - Model repetitive DNA

SYNOPSIS
      RepeatModeler [-options] -database <XDF Database>

Operating system and version. The output of uname -a and lsb_release -a can be used to find this.

Linux cell-lab 6.8.0-36-generic #36-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 10 10:49:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Dfam-consortium / RepeatModeler

Could not open *.translation file for reading! #248