I am running RepeatModeler on 15 genomes of different sizes (1 Gb to 3.8 Gb), and for 3 of them I get a segmentation fault (core dumped), reported in the ltrharvest.log file. In a previous issue, it was proposed to split the genome into smaller parts so that it can be analyzed by RepeatModeler and ltrharvest, but in my case it is not the biggest genomes that are failing (1.6 Gb, 1.9 Gb and 4.8 Gb). For the last one, genome size may be the cause, but the LTR analysis completed on genomes larger than the other two. So I am not sure that splitting would solve the problem. Moreover, in another similar issue I saw that splitting and merging is not the best idea because of redundancies.
Can anyone help me?
Reproduction steps
Steps to reproduce the behavior, including the command lines given to the program
nohup RepeatModeler -database db -pa N -LTRStruct >& output.log 2>&1 &
LTR Structural Analysis
Running LtrHarvest...LtrPipeline: GenomeTools failed to run ltrharvest. Error code: 9109504
LtrPipeline: Ltrharvest returned an unexpected result line:
Segmentation fault (core dumped)
LTRPipeline: No results returned from LTR structural finder ( LtrHarvest ).
LTRPipeline Time: 396:12:17 (hh:mm:ss) Elapsed Time
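A side note on the error code above, offered as my own reading of the number rather than anything confirmed by the RepeatModeler or GenomeTools documentation: 9109504 is exactly 139 × 65536, and 139 is the exit status a shell conventionally reports (128 + 11) when a process is killed by signal 11, which is SIGSEGV on Linux. That would be consistent with the "Segmentation fault (core dumped)" line. The 16-bit shift is an assumption inferred from the arithmetic, and the function names `decode_status` and `signal_number` are hypothetical helpers for this sketch only.

```shell
#!/usr/bin/env bash

# Assumption: the reported error code holds the child's exit status
# shifted left by 16 bits (inferred from the arithmetic, not documented).
decode_status() {
    local code=$1
    echo $(( code >> 16 ))
}

# Shell convention: an exit status above 128 means the process was
# terminated by signal number (status - 128). Prints 0 otherwise.
signal_number() {
    local status=$1
    if (( status > 128 )); then
        echo $(( status - 128 ))
    else
        echo 0
    fi
}

status=$(decode_status 9109504)
sig=$(signal_number "$status")
echo "exit status: $status, signal: $sig"   # prints "exit status: 139, signal: 11"
```

If this reading is right, the code carries no information beyond the segmentation fault itself, so it does not help distinguish a memory limit from a bug triggered by particular input sequences.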
Environment (please include as much of the following information as you can find out):
How did you install RepeatModeler? bioconda
Which version of RepeatModeler do you have? RepeatModeler - 2.0.1
Which version of RepeatMasker is this RepeatModeler installation using? RepeatMasker - 4.1.2-p1
Have you installed RepBase RepeatMasker Edition for RepeatMasker, or the full Dfam database? No, I am using a clade-specific Repbase version as a library for RepeatMasker.
Operating system and version. The output of uname -a and lsb_release -a can be used to find this.
Linux 4.15.0-158-generic #166-Ubuntu SMP Fri Sep 17 19:37:52 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Log output
RepeatModeler Version 2.0.1
Search Engine = rmblast 2.10.0+
Dependencies: TRF 4.09, RECON , RepeatScout 1.0.6, RepeatMasker 4.1.2
LTR Structural Analysis: Enabled ( GenomeTools 1.6.1, LTR_Retriever , Ninja 0.95-cluster_only, MAFFT 7.475, CD-HIT 4.8.1 )
Random Number Seed: 1625566314
Additional context
Thank you for your help!