This is what I get when running reasonaTE with -tool all:
alvaro@mutant32:~:conda activate transposon_annotation_tools_env
(transposon_annotation_tools_env) alvaro@mutant32:~:reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool all
7 ['scanHead', '-g', '/home/alvaro/workspace/testProject/sequence.fasta', '-bs', '0', '-o', '/home/alvaro/workspace/testProject/helitronScanner/scanHead.txt'] @@@
scanHead in >seq1 [Mon Apr 25 16:24:12 CEST 2022] [1:793080] [Mon Apr 25 16:24:12 CEST 2022]
scanHead in >seq2 [Mon Apr 25 16:24:15 CEST 2022] [1:504300] [Mon Apr 25 16:24:15 CEST 2022]
scanTail in >seq1 [Mon Apr 25 16:24:17 CEST 2022] [1:793080] [Mon Apr 25 16:24:17 CEST 2022]
scanTail in >seq2 [Mon Apr 25 16:24:21 CEST 2022] [1:504300] [Mon Apr 25 16:24:21 CEST 2022]
7 ['scanHead', '-g', '/home/alvaro/workspace/testProject/sequence_rc.fasta', '-bs', '0', '-o', '/home/alvaro/workspace/testProject/helitronScanner_rc/scanHead.txt'] @@@
scanHead in >seq1 [Mon Apr 25 16:24:24 CEST 2022] [1:793080] [Mon Apr 25 16:24:24 CEST 2022]
scanHead in >seq2 [Mon Apr 25 16:24:28 CEST 2022] [1:504300] [Mon Apr 25 16:24:28 CEST 2022]
scanTail in >seq1 [Mon Apr 25 16:24:30 CEST 2022] [1:793080] [Mon Apr 25 16:24:30 CEST 2022]
scanTail in >seq2 [Mon Apr 25 16:24:34 CEST 2022] [1:504300] [Mon Apr 25 16:24:34 CEST 2022]
sh: 1: gt: not found
sh: 1: gt: not found
/home/alvaro/anaconda3/envs/transposon_annotation_tools_env/bin/transposon_annotation_tools_mitefinderii/miteFinder_linux_x64 -pattern_scoring /home/alvaro/anaconda3/envs/transposon_annotation_tools_env/bin/transposon_annotation_tools_mitefinderii/pattern_scoring.txt -input /home/alvaro/workspace/testProject/sequence.fasta -output /home/alvaro/workspace/testProject/mitefind/result.txt
##############
/home/alvaro/anaconda3/envs/transposon_annotation_tools_env/bin/transposon_annotation_tools_mitefinderii/miteFinder_linux_x64 -pattern_scoring /home/alvaro/anaconda3/envs/transposon_annotation_tools_env/bin/transposon_annotation_tools_mitefinderii/pattern_scoring.txt -input /home/alvaro/workspace/testProject/sequence_rc.fasta -output /home/alvaro/workspace/testProject/mitefind_rc/result.txt
##############
sh: 1: mitetracker: not found
sh: 1: mitetracker: not found
Traceback (most recent call last):
File "/home/alvaro/anaconda3/envs/transposon_annotation_tools_env/share/TransposonAnnotator_reasonaTE/TransposonAnnotator.py", line 86, in
Dear Álvaro, first of all thank you very much for your interest in our software.
To help you better, it would be great if you could run reasonaTE with individual tools and share the console output with us. For example:
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool mitetracker
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool must
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool sinescan
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool transposonPSI
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool NCBICDD1000
Could you also please answer the following questions?
Looking forward to your answers so that we can best help and support you in using reasonaTE. Best, Kevin Riehl
Dear Kevin, thanks for your response and your willingness to help.
reasonaTE Test Console Output.txt
I think it may be relevant that during the installation of the failing tools (proteinncbicdd1000, transposonpsicli, mitetracker, mustv2, sinescan) this error is shown: Installation Glibc error.txt
Dear Álvaro, thanks for the detailed answer.
Indeed, I agree: the issue seems to be that the software could not be installed properly on your Linux system (Ubuntu).
One of the challenges when working with Conda and Mamba is that the underlying package management systems and servers change dynamically, so they probably do not behave the same way for you as they did for me.
Could you please create a new conda environment, install only the transposon annotation tools (with conda if possible), and run the tools on the genomes separately in another folder? If this works, I will explain how to copy their results into your reasonaTE project folder so that you can proceed.
Best and good luck, Kevin
Dear Kevin,
I circumvented all the issues by uninstalling and reinstalling everything. For any other user who runs into similar problems:
I added bioconda, conda-forge and derkevinriehl as default channels for conda
conda config --add channels new_channel
instead of specifying the channel in the installing command.
I used the plain conda option, as well.
However, RepeatModeler did not work correctly, so I had to install it from the original source and run it separately.
Therefore, I would be really grateful if you could explain to me how to copy the results of RepeatModeler2 into my reasonaTE folder to proceed.
Thanks a bunch!
Dear Álvaro, great to hear back from you :-)!
I am happy to hear that reinstalling did the trick. Even though we tried our best when building the packages, the package management system (conda) and its servers are constantly changing, which sometimes causes these kinds of issues.
To your question: in general, you can follow the folder structure of the sample project of reasonaTE. The folder "repeatmodel" should contain the following files (the output of RepeatModeler): https://github.com/DerKevinRiehl/transposon_annotation_reasonaTE/tree/main/workspace/testProject/repeatmodel
As you can see in the code of reasonaTE (line 1213): https://github.com/DerKevinRiehl/transposon_annotation_reasonaTE/blob/cc04a2db30c98f21981eb2d90710887be726bfcc/Code/AnnotationParser.py#L1208 the only file from the RepeatModeler output that matters to reasonaTE is "sequence_index-families.stk".
So all you need to do is copy your *-families.stk file to the folder "repeatmodel" in your project folder.
Then you can check if reasonaTE can find the annotations by:
reasonaTE -mode checkAnnotations -projectFolder workspace -projectName testProject
Afterwards you can proceed with the parsing step.
reasonaTE -mode parseAnnotations -projectFolder workspace -projectName testProject
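The copy-and-check steps above can also be scripted. The following is only a minimal sketch, assuming the layout of the sample project (projectFolder/projectName/repeatmodel). The helper names copy_families_file and reasonate_command are hypothetical and not part of reasonaTE, and the file is copied under its original name.

```python
import shutil
from pathlib import Path


def copy_families_file(stk_file, project_folder, project_name):
    """Copy a RepeatModeler *-families.stk file into <project>/repeatmodel."""
    # Assumed layout (taken from the sample project): <projectFolder>/<projectName>/repeatmodel
    target_dir = Path(project_folder) / project_name / "repeatmodel"
    target_dir.mkdir(parents=True, exist_ok=True)
    # The file name is kept unchanged; rename it if your setup expects another name
    return Path(shutil.copy(stk_file, target_dir))


def reasonate_command(mode, project_folder, project_name):
    """Build the reasonaTE call for a given mode as an argument list."""
    return ["reasonaTE", "-mode", mode,
            "-projectFolder", str(project_folder),
            "-projectName", project_name]
```

Passing reasonate_command("checkAnnotations", "workspace", "testProject") to subprocess.run would then start the check from Python instead of the shell.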
Please let me know if you face any more issues, I am happy to help, and sure that you are almost there :-) with successfully using our tool.
Best, Kevin
Dear Kevin,
Thanks for your quick reply and help.
I have indeed found the file (merely called families.stk) amongst the other result files from RepeatModeler. After copying it to the folder, it checks the annotation as complete, but the parsing gives the following error:
Parse repeatModeler...
Traceback (most recent call last):
  File "/home/alvaro/anaconda3/envs/transposon_annotation_tools_env/share/TransposonAnnotator_reasonaTE/TransposonAnnotator.py", line 114, in <module>
    parseAvailableResults(projectFolderPath)
  File "/home/alvaro/anaconda3/envs/transposon_annotation_tools_env/share/TransposonAnnotator_reasonaTE/AnnotationParser.py", line 1346, in parseAvailableResults
    parseRepeatModeler(pathResDir, fastaFile, targetGFFFile, targetGFFrepe, targetFastaFile)
  File "/home/alvaro/anaconda3/envs/transposon_annotation_tools_env/share/TransposonAnnotator_reasonaTE/AnnotationParser.py", line 1255, in parseRepeatModeler
    seqTypeLabelA = seqType.split(";")[1]
IndexError: list index out of range
I did run RepeatModeler2 with -LTRStruct activated. Should I not do that when intending to run the results through the pipeline?
Thanks :)
Dear Álvaro, it seems you hit the exact same problem as someone before you. The issue is that RepeatModeler sometimes produces empty lines in the Stockholm file, even though this violates the file standard. I wrote a small script that you can use to "clean" your Stockholm files, as described in this thread: https://github.com/DerKevinRiehl/TransposonUltimate/issues/3#issuecomment-1117307262
Please let me know if this did the trick. Best, Kevin
Dear Kevin,
The program returns an empty .stk file as an output. It seems to delete everything.
Best, Alvaro
Dear Álvaro, could you please share your original STK file? It seems your version of RepeatModeler has yet another output format than the one I dealt with in the GitHub issue mentioned before.
I will have a look at your STK file and adapt the script.
Thanks, Best regards, Kevin
Dear Kevin,
I hereby attach the families file. But the problem might not be your program but my RepeatModeler.
This file is not the
The file attached (and the one I was attempting to use) is the families.stk file inside the RM_
Thanks, Kind regards, Álvaro
Dear Álvaro, after checking the error message again, I found the reason for this behavior. The Stockholm file does not contain the repeat type. Normally RepeatModeler should write "Unknown", but in your case it simply didn't add any information to the file.
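To illustrate why a missing repeat type leads to the IndexError in the traceback: splitting a string that contains no ";" yields a single-element list, so index 1 does not exist. The sketch below assumes that seqType holds the text of the repeat type field. How reasonaTE actually extracts it is not shown in this thread, and the helper second_label and the value "ClassA;FamilyB" are hypothetical.

```python
def second_label(seq_type, default="Unknown"):
    """Return the part after the first ';', or a default when it is missing."""
    parts = seq_type.split(";")
    return parts[1] if len(parts) > 1 else default


# A type with two parts works with plain indexing as well
print(second_label("ClassA;FamilyB"))

# An empty type has no second part, so plain indexing fails
try:
    "".split(";")[1]
except IndexError as err:
    print("IndexError:", err)

# The tolerant helper falls back to the default instead
print(second_label(""))
```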
Therefore I wrote a new corrector, as you can find below. Please use this one, and tell me if it did the trick. I attached the updated corrector.py as well as your updated stk file families.stk.
Please run the corrector as follows:
python corrector.py FROM_FILE.stk TO_FILE.stk
Here is the code of the corrector:
# Author: Kevin Riehl for Transposon Ultimate Problems with RepeatModeler Outputs (C) 2022
# This code loads annotation outputs from RepeatModeler in Stockholm format that miss the repeat type, and adds it,
# as these missing values cause errors in the downstream pipeline of reasonaTE
# Usage: python corrector.py FROM_FILE.stk TO_FILE.stk
import sys

# get arguments
arguments = sys.argv
print(arguments)
if len(arguments) == 3:
    from_file = arguments[1]
    to_file = arguments[2]
    # read file, erase empty lines and add the missing repeat type
    with open(from_file, "r") as f1, open(to_file, "w") as f2:
        for ctr, line in enumerate(f1):
            if line.strip("\n") == "":
                # report the (0-based) index of each dropped empty line
                print(ctr)
                continue
            if line.startswith("# STOCKHOLM"):
                # add the repeat type directly after every alignment header
                line = line + "#=GF TP Unknown;Unknown\n"
            f2.write(line)
else:
    print("ERROR! Expected two arguments: from_file and to_file!")
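Before running the corrector, it can be useful to check whether a file is affected at all. The following is a minimal sketch, assuming that every alignment starts with a "# STOCKHOLM" header and that the repeat type is stored in a "#=GF TP" line, as in the corrector above. The function name alignments_missing_type is hypothetical.

```python
def alignments_missing_type(lines):
    """Return the 0-based indices of alignments that have no '#=GF TP' line."""
    missing = []
    index = -1          # index of the alignment currently being read
    has_type = True     # whether the current alignment has a TP line
    for line in lines:
        if line.startswith("# STOCKHOLM"):
            # a new alignment starts; record the previous one if it lacked a type
            if index >= 0 and not has_type:
                missing.append(index)
            index += 1
            has_type = False
        elif line.startswith("#=GF TP"):
            has_type = True
    # record the last alignment as well
    if index >= 0 and not has_type:
        missing.append(index)
    return missing
```

For a file on disk, the function can be called with the open file handle, for example alignments_missing_type(open("families.stk")), and an empty list means that every alignment carries a repeat type.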
Please let me know if this did the trick for you. Best regards, Kevin
Dear Kevin, the updated corrector does indeed do the trick. In the meantime, I figured out why I was not obtaining the right outputs. All is working now with RepeatModeler and RepeatMasker.
Thanks a lot for your interest and help. Very much appreciated!!!
Hello,
When I run reasonaTE with the provided test sequence, or with my own sequences, I always get these results. I expect ltrPred not to be complete because I have not installed it yet, but the rest should be installed correctly (installation using conda and mamba).
Any clue on what could possibly be wrong? Any help will be much appreciated.
(transposon_annotation_tools_env) alvaro@alive:~:reasonaTE -mode checkAnnotations -projectFolder workspace -projectName testProject
Checking helitronScanner ... completed
Checking ltrHarvest ... completed
Checking ltrPred ... not completed
Checking mitefind ... completed
Checking mitetracker ... not completed
Checking must ... not completed
Checking repeatmodel ... not completed
Checking repMasker ... not completed
Checking sinefind ... completed
Checking sinescan ... not completed
Checking tirvish ... completed
Checking transposonPSI ... not completed
Checking NCBICDD1000 ... not completed
Kind regards,
Álvaro
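For longer tool lists, the checkAnnotations output can be filtered automatically. The following is a minimal sketch, assuming the "Checking <tool> ... completed / not completed" format shown in the output above. The function name incomplete_tools is hypothetical and not part of reasonaTE.

```python
import re


def incomplete_tools(check_output):
    """Return the tool names that are reported as 'not completed'."""
    # "not completed" is listed first so that it is matched as a whole
    pattern = re.compile(r"Checking (\S+) \.\.\. (not completed|completed)")
    return [tool for tool, status in pattern.findall(check_output)
            if status == "not completed"]
```

The returned names could then be passed one by one to the -tool option in order to re-run only the tools that did not finish.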