Closed GunzIvan28 closed 4 years ago
Hi! My first guess is that the RepeatModeler output file was not generated. Can you check?
Hey, yes, it fails after RepeatModeler round 3, in a folder labelled 'RM_26613....'. How can I get a successful run? The .fna was generated by annotating my genome with Prokka.
What do you use as input? Is it a genome in FASTA format?
Hi Andrew, ERR987781.zip
That is the link to the file I used. According to the instructions, the pipeline requires a .fna file, which I obtained from annotation. However, I also have an assembly file in FASTA format from de novo assembly. Could using the latter resolve the issue?
Yes, you should use the genomic assembly as input.
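For anyone unsure which of their files is the genomic assembly, a quick sanity check can help. The snippet below is only a minimal sketch and is not part of MGERT: `summarize_fasta` is a hypothetical helper that counts sequences and bases in a FASTA file (plain or gzip-compressed) and reports whether the residues look like nucleotides.

```python
import gzip
from pathlib import Path

# IUPAC nucleotide codes plus the gap character.
NUCLEOTIDE_CODES = set("ACGTUNRYKMSWBDHV-")


def summarize_fasta(path):
    """Return (number_of_sequences, total_residues, looks_like_nucleotides)."""
    path = Path(path)
    # Open gzip-compressed files transparently; plain text otherwise.
    opener = gzip.open if path.suffix == ".gz" else open
    n_seqs = 0
    total = 0
    residues = set()
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                n_seqs += 1  # each header line starts a new record
            else:
                total += len(line)
                residues.update(line.upper())
    # Any residue outside the nucleotide alphabet suggests a protein file.
    looks_nucleotide = total > 0 and residues <= NUCLEOTIDE_CODES
    return n_seqs, total, looks_nucleotide
```

Calling `summarize_fasta("ERR987781.fa")` (file name assumed from the log further down) should give a sequence count and total length comparable to what BuildDatabase reports when it reads the input.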
Hey, I have tried re-running it with a FASTA file from the assembly and it gives a new error. Below is a snippet from the traceback:
run RepeatModeler on 8 CPUs. Command:
/home/ivan/miniconda3/envs/rMAP-1.0/bin/RepeatModeler -engine ncbi -pa 8 -database ERR987781.fa.db > RepMod.out
Missing /home/ivan/miniconda3/envs/rMAP-1.0/share/RepeatMasker/Libraries/RepeatMasker.lib.nsq!
Please rerun the configure program in the RepeatModeler directory
before running this script.
RepeatModeler finished
Traceback (most recent call last):
File "miniconda3/envs/rMAP-1.0/config-files/MGERT.py", line 1732, in
I can share the assembly file too, so you can run it and see what the issue could be.
Maybe some specific directions to look into: "Missing /home/ivan/miniconda3/envs/rMAP-1.0/share/RepeatMasker/Libraries/RepeatMasker.lib.nsq!" is an error I have no idea how to overcome.
"Please rerun the configure program in the RepeatModeler directory before running this script." Why do I have to re-run configure when everything was set up successfully initially?
Apparently there are some problems with the RepeatMasker installation. BTW, have you tried running MGERT in test mode after installation and configuration?
Below is the output from the test run:
Run MGERT on small dataset, it may take a while...
MGERT will create a directory for test run in /home/ivan
Correspondence table is found and added to the config...
A list of smp files has been compiled.
Database name - CD
Local Conserved Domain Database is made and added to the config.
1/5. Starting RepeatModeler pipeline on 8 CPUs
Building RepeatModeler database. Command:
/home/ivan/miniconda3/envs/rMAP-1.0/bin/BuildDatabase -name test_scaffold.fasta.db -engine ncbi ref.fa
Building database test_scaffold.fasta.db:
Reading ref.fa...
Number of sequences (bp) added to database: 1 ( 4176476 bp )
run RepeatModeler on 8 CPUs. Command:
/home/ivan/miniconda3/envs/rMAP-1.0/bin/RepeatModeler -engine ncbi -pa 8 -database test_scaffold.fasta.db > RepMod.out
Missing /home/ivan/miniconda3/envs/rMAP-1.0/share/RepeatMasker/Libraries/RepeatMasker.lib.nsq!
Please rerun the configure program in the RepeatModeler directory
before running this script.
RepeatModeler finished
Traceback (most recent call last):
File "/usr/local/bin/MGERT.py", line 1668, in
I believe this tool is very solid software; it just needs a lot of tweaking from installation through to this step. Kindly help me work it out, as I am not so good at fixing Python bugs. Also, the instructions should clearly state that a de novo assembly FASTA file is required; ".fna" is somewhat misleading.
Well, the test failed because of the RepeatMasker/RepeatModeler installation: RepeatModeler complains about a missing library. It's not a Python bug. Have you run the RepeatModeler configuration script? If not, you have to do that; if you have, try running it again.
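To confirm whether the library RepeatModeler complains about is really absent, one can look for the BLAST database files next to `RepeatMasker.lib`. This is a minimal sketch, assuming a single-volume nucleotide BLAST database whose core files end in `.nhr`, `.nin` and `.nsq`; `missing_blast_db_files` is a hypothetical helper, and the path is simply the one from the error message above.

```python
from pathlib import Path

# Core files of a single-volume nucleotide BLAST database.
BLAST_NUCL_EXTENSIONS = (".nhr", ".nin", ".nsq")


def missing_blast_db_files(db_prefix):
    """Return the expected BLAST database files that do not exist.

    db_prefix is the database path without the BLAST extension,
    e.g. .../share/RepeatMasker/Libraries/RepeatMasker.lib
    """
    prefix = str(db_prefix)
    return [
        prefix + ext
        for ext in BLAST_NUCL_EXTENSIONS
        if not Path(prefix + ext).is_file()
    ]


if __name__ == "__main__":
    # Path copied from the error message in this thread; adjust it to your environment.
    lib = "/home/ivan/miniconda3/envs/rMAP-1.0/share/RepeatMasker/Libraries/RepeatMasker.lib"
    missing = missing_blast_db_files(lib)
    if missing:
        print("Missing BLAST database files:")
        for name in missing:
            print("  " + name)
    else:
        print("All expected BLAST database files are present.")
```

If files are reported missing, that is consistent with the message asking to rerun the configure program; the check itself does not repair anything.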
Hi Andrew, I have run the pipeline, but I get the traceback below for all my samples:
Traceback (most recent call last):
File "../../miniconda3/envs/rMAP-1.0/config-files/MGERT.py", line 1732, in
l=args.min_length, e=args.e_value, c=args.start_codon, strnd=args.strand, g=args.genetic_code, le=args.left_end, re=args.right_end, rm_tab=args.rm_table)
File "../../miniconda3/envs/rMAP-1.0/config-files/MGERT.py", line 1398, in pipe
rmodeler(genome_file, threads)
File "../../miniconda3/envs/rMAP-1.0/config-files/MGERT.py", line 1239, in rmodeler
repmod_outfile = glob.glob("RM*/consensi.fa.classified")[0]
IndexError: list index out of range
I am running it on .fna.gz files, for both strands, and would like to obtain the output files. How can I overcome this?
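The `IndexError` in this traceback means that `glob.glob("RM*/consensi.fa.classified")` returned an empty list, i.e. no RepeatModeler result file exists in the working directory. The snippet below is a minimal sketch of a guarded lookup that fails with a more descriptive message; `find_repeatmodeler_output` is a hypothetical helper, not MGERT's actual code, and only the glob pattern and the `RepMod.out` file name are taken from the output in this thread.

```python
import glob
import os


def find_repeatmodeler_output(workdir=".", pattern="RM*/consensi.fa.classified"):
    """Return the path of the RepeatModeler consensus file, or raise a clear error."""
    matches = glob.glob(os.path.join(workdir, pattern))
    if not matches:
        # Same situation that produces "IndexError: list index out of range".
        raise FileNotFoundError(
            f"No file matching {pattern!r} in {os.path.abspath(workdir)}; "
            "RepeatModeler probably did not finish. Check RepMod.out."
        )
    # If several RM_* run directories exist, use the most recently modified result.
    return max(matches, key=os.path.getmtime)
```

An empty result here points back to the RepeatModeler run itself (for example the missing library reported earlier) rather than to the Python code.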