ISonets closed 3 years ago
I successfully converted GFF3 output into a tab-delimited TXT file using anvi-script-augustus-output-to-external-gene-calls
Are you sure you used anvi-script-augustus-output-to-external-gene-calls?
Yes, I used this script to obtain the TXT file from the GFF3 output.
Yes, this was due to a legitimate bug in the dbops module. It is now fixed in the main branch.
If you are following the active repository, you can run git pull and everything should work for your data.
Thanks for the test case.
(If you want to try this solution, you will also need to re-run anvi-script-augustus-output-to-external-gene-calls before re-running anvi-gen-contigs-database.)
Thanks a lot! I will try to rerun my analysis ASAP.
So, I confirm: the scripts and contigs-db generation work as they should!
BUT there is one important note about anvi-script-augustus-output-to-external-gene-calls (at least in v.3.4.0). When it finishes, the script leaves some descriptive text in the aa_sequence column of each gene call (example in the screenshot):
If you try to use this file to generate your contigs-db, errors will occur because this descriptive text is extraneous and adds extra characters. Anvi'o treats the text as part of the amino acid sequence, so the "sequence" is much longer than it should be. You have to remove these characters using sed (or any editor):
sed -i 's/].*//g' [replace this block with your txt file name]
# This data starts from "]Evidence of... (example on screenshot)". To remove it, use this command.
Results are:
After removing it, everything works just fine! Many thanks for such a quick fix!
P.S. @meren, can you please add this info to the help page for the script? I think it could be handy.
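For anyone who would rather not edit the file in place with sed, here is a minimal Python sketch of the same cleanup, restricted to the aa_sequence column. It assumes the file is TAB-delimited with a header row that contains aa_sequence, and that the unwanted text always starts at the first "]" as in the sed command above. The function name and the file names in the usage note are placeholders, not part of anvi'o.

```python
def strip_trailing_annotation(lines, column="aa_sequence"):
    """Yield TAB-delimited rows with `column` cut at the first ']'.

    Assumption: the first line is a header that names `column`.
    Rows are yielded without a trailing newline.
    """
    lines = iter(lines)
    header = next(lines).rstrip("\n").split("\t")
    idx = header.index(column)  # raises ValueError if the column is missing
    yield "\t".join(header)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        # Keep only the part before the first ']' (the whole value if none).
        fields[idx] = fields[idx].split("]", 1)[0]
        yield "\t".join(fields)
```

To use it on a real file, something like `open("calls.txt")` can be passed as `lines` and each yielded row written to a new file followed by a newline. Unlike the sed one-liner, this leaves the other columns untouched even if they happen to contain "]".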
@ISonets, I thought I fixed it in anvi-script-augustus-output-to-external-gene-calls via 82c672081565741d6cb6013c80255ee04bbdbb06 yesterday. Now I'm wondering if the fix wasn't comprehensive enough. Are you using the same file you sent me in issues.zip when you observe the additional information added to the AA sequences?
Running this command on the files you sent me yesterday,
anvi-script-augustus-output-to-external-gene-calls -i CL01_copy.gff -o ext.txt
results in this file, which doesn't have that excess text after the AA sequences.
Oops, a misunderstanding. I sent you edited files (but didn't mention it; I am very sorry for this). The process was:
augustus-output-to-external-gene-calls script => file with this additional info
@ISonets, can you please send me the GFF file you get from the very first step?
Sure, I will send you all the source files ASAP.
Hmm, I think this is not your fault. I checked your commit about augustus (82c7620), and I don't have these changes. I have an explanation: I use anvi'o on a server over SSH, and I don't have root on this server. So I'm unable to clone the git repository because libcurl3 isn't installed (and I'm unable to install it). Instead, I manually edited dbops.py and the augustus script as described in your 2 commits addressing this issue. It's a quick-and-dirty solution, I know, but I can't find any other way to do my task.
I see. Then you should be able to copy the program anvi-script-augustus-output-to-external-gene-calls from the main repository into your home directory, like this,
wget https://raw.githubusercontent.com/merenlab/anvio/master/sandbox/anvi-script-augustus-output-to-external-gene-calls
and run it the following way:
python anvi-script-augustus-output-to-external-gene-calls -i XXX -o XXX
Short description of the problem
One major problem when dealing with AUGUSTUS gene calls (more details below)
anvi'o version
System info
The system is Ubuntu 16.04 LTS running on a server.
Detailed description of the issue
Greetings! I am trying to use anvi'o for my Dekkera bruxellensis pangenomics project. I need to create a contigs-db, and to do so, I made gene predictions using AUGUSTUS v.3.4.0 (as I understand it, Prodigal isn't suitable because it's a prokaryotic gene finder). I successfully converted the GFF3 output into a tab-delimited TXT file using anvi-script-augustus-output-to-external-gene-calls. When I tried to create the contigs-db, I got this error message:
I tried to fix it by simply removing 1 aa from this gene call (I just deleted the last aa in the call). I know this is the wrong way to do it, and I tried to figure out what's wrong with my file; maybe it was just 1 mistake? But there is more! In total there were 29 similar messages. After removing 29 aa from specific lines of my file, anvi'o started doing its job, but then...
Something is definitely wrong, either with my files, or with anvi'o.
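A rough way to spot affected gene calls before running anvi-gen-contigs-database is to compare each amino acid sequence with the span of its coordinates. The sketch below is not anvi'o's own validation. It assumes the TXT file has a header with gene_callers_id, start, stop, and aa_sequence columns, and that stop - start gives the gene span in nucleotides. The function name is hypothetical. A call is flagged when it has more amino acids than the span could hold in codons.

```python
def find_overlong_calls(lines):
    """Return (gene_callers_id, aa_length, max_codons) for suspicious rows.

    Assumptions: TAB-delimited input whose header names the columns
    gene_callers_id, start, stop, and aa_sequence.
    """
    lines = iter(lines)
    header = next(lines).rstrip("\n").split("\t")
    col = {name: i for i, name in enumerate(header)}
    flagged = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        start = int(fields[col["start"]])
        stop = int(fields[col["stop"]])
        aa_length = len(fields[col["aa_sequence"]])
        # Upper bound: a span of N nucleotides holds at most N // 3 codons.
        max_codons = (stop - start) // 3
        if aa_length > max_codons:
            flagged.append((fields[col["gene_callers_id"]], aa_length, max_codons))
    return flagged
```

Because eukaryotic calls can contain introns, a sequence shorter than the span is expected; only sequences longer than the span are reported, which is the situation where extra text has ended up in aa_sequence.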
Files to reproduce
I have files for you to play with. In the archive you can find the FASTA file, the GFF3 file, and the TXT file after conversion. I hope a fix will come soon. issues.zip