genomeannotation / annie

annie = ANNotation Information Extractor
MIT License

Annie unable to process #7

Open emmannaemeka opened 6 years ago

emmannaemeka commented 6 years ago

@bruab @tedsta @Woods26 @smg283 @Juke34 I am having a problem running Annie, and I don't know what is wrong with the syntax:

python annie.py -b /Users/emmannaemeka/Desktop/Gpm/maker/data/mucuna/muc1_genome_snap2.maker.output/muc1_snap2.2.blastp --gff /Users/emmannaemeka/Desktop/Gpm/maker/data/mucuna/muc1_genome_snap2.maker.output/muc1_genome_snap2.all.gff -db /Users/emmannaemeka/db/blast/uniprot_sprot.fasta -o muc1

Traceback (most recent call last):
  File "annie.py", line 122, in <module>
    main(sys.argv)
  File "annie.py", line 78, in main
    annotations.extend(read_sprot(blast_file, gff_file, fasta_file))
  File "/Users/emmannaemeka/Desktop/genomeannotation-annie-4bb3980/src/sprot.py", line 9, in read_sprot
    gff_info = get_gff_info(gff_file)
  File "/Users/emmannaemeka/Desktop/genomeannotation-annie-4bb3980/src/sprot.py", line 80, in get_gff_info
    key, val = split[0], split[1]
IndexError: list index out of range

erikenbody commented 6 years ago

I received the same error message. For me, the issue was in parsing lines in my .gff (that originated from a Maker run) that had a semi-colon at the end of the line. This semi-colon had been created (apparently erroneously) by the Maker script map_gff_ids.

I simply removed the trailing semi-colon from each line of my gff with sed, like so:

sed 's/;$//' ORIGINAL_WITH_TRAILING_SEMIS.gff > TRAILING_SEMIS_REMOVED.gff
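To illustrate why a trailing semi-colon can trigger this IndexError, here is a minimal Python sketch. It is not Annie's actual code: only the line `key, val = split[0], split[1]` comes from the traceback, and the helper `parse_attributes` and the sample attribute string are hypothetical. Splitting the attribute column on `;` leaves an empty final field, and splitting that empty field on `=` yields a one-element list.

```python
def parse_attributes(column9):
    """Parse a GFF3 attribute column into a dict, tolerating empty fields.

    Hypothetical helper for illustration; not part of Annie.
    """
    attributes = {}
    for field in column9.strip().split(";"):
        if not field:
            # Empty field, e.g. the one produced by a trailing ';'
            continue
        key, sep, val = field.partition("=")
        if not sep:
            # Field without '=', skip instead of raising
            continue
        attributes[key] = val
    return attributes


# Made-up attribute column ending in a semi-colon
attrs = "ID=gene1-RA;Parent=gene1;"

print(attrs.split(";"))  # ['ID=gene1-RA', 'Parent=gene1', '']

# The empty last field reproduces the error from the traceback
split = "".split("=")
try:
    key, val = split[0], split[1]
except IndexError as err:
    print("IndexError:", err)  # IndexError: list index out of range

print(parse_attributes(attrs))  # {'ID': 'gene1-RA', 'Parent': 'gene1'}
```

Skipping empty fields in the parser has the same effect as stripping the semi-colons with sed, without modifying the file.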

emmannaemeka commented 6 years ago

@erikenbody Thanks for the tip. I stripped the trailing semicolons from my .gff as you did, but when I ran Annie it returned the following:

CR513_031089-RA not in gff. Skipping... CR513_038690-RA not in gff. Skipping... CR513_056208-RA not in gff. Skipping... CR513_035954-RA not in gff. Skipping... CR513_000871-RA not in gff. Skipping... CR513_031580-RA not in gff. Skipping... CR513_043484-RA not in gff. Skipping... CR513_047162-RA not in gff. Skipping... CR513_001912-RA not in gff. Skipping... CR513_051108-RA not in gff. Skipping... CR513_054095-RA not in gff. Skipping... CR513_023079-RA not in gff. Skipping... CR513_039854-RA not in gff. Skipping... CR513_014800-RA not in gff. Skipping... CR513_060796-RA not in gff. Skipping... CR513_060775-RA not in gff. Skipping... CR513_036753-RA not in gff. Skipping... CR513_034492-RA not in gff. Skipping... CR513_009451-RA not in gff. Skipping... CR513_005349-RA not in gff. Skipping... CR513_035516-RA not in gff. Skipping... CR513_008588-RA not in gff. Skipping... CR513_049549-RA not in gff. Skipping... CR513_005366-RA not in gff. Skipping... CR513_056407-RA not in gff. Skipping... CR513_007331-RA not in gff. Skipping... CR513_003640-RA not in gff. Skipping... CR513_061538-RA not in gff. Skipping... CR513_052887-RA not in gff. Skipping... CR513_007161-RA not in gff. Skipping... CR513_022282-RA not in gff. Skipping... CR513_022758-RA not in gff. Skipping... CR513_037748-RA not in gff. Skipping... CR513_000863-RA not in gff. Skipping... CR513_029922-RA not in gff. Skipping... CR513_034556-RA not in gff. Skipping... CR513_023318-RA not in gff. Skipping... CR513_032313-RA not in gff. Skipping... CR513_030941-RA not in gff. Skipping... CR513_051554-RA not in gff. Skipping... CR513_049945-RA not in gff. Skipping... CR513_033889-RA not in gff. Skipping... CR513_060399-RA not in gff. Skipping... CR513_059241-RA not in gff. Skipping... CR513_015668-RA not in gff. Skipping... CR513_019489-RA not in gff. Skipping... CR513_000891-RA not in gff. Skipping... CR513_009975-RA not in gff. Skipping... CR513_052791-RA not in gff. Skipping... CR513_030939-RA not in gff. Skipping... 
CR513_049641-RA not in gff. Skipping... CR513_019480-RA not in gff. Skipping... CR513_045824-RA not in gff. Skipping... CR513_045788-RA not in gff. Skipping... CR513_008113-RA not in gff. Skipping... CR513_056041-RA not in gff. Skipping... CR513_036188-RA not in gff. Skipping... CR513_002989-RA not in gff. Skipping... CR513_010367-RA not in gff. Skipping... CR513_042104-RA not in gff. Skipping... CR513_032715-RA not in gff. Skipping... CR513_011979-RA not in gff. Skipping... CR513_040685-RA not in gff. Skipping... CR513_032312-RA not in gff. Skipping... CR513_013914-RA not in gff. Skipping... CR513_050572-RA not in gff. Skipping... CR513_040000-RA not in gff. Skipping... CR513_005838-RA not in gff. Skipping... CR513_058137-RA not in gff. Skipping... CR513_038615-RA not in gff. Skipping... CR513_054283-RA not in gff. Skipping... CR513_003641-RA not in gff. Skipping... CR513_056406-RA not in gff. Skipping... CR513_061536-RA not in gff. Skipping... CR513_056821-RA not in gff. Skipping... CR513_037365-RA not in gff. Skipping... CR513_054917-RA not in gff. Skipping... CR513_022283-RA not in gff. Skipping... CR513_044227-RA not in gff. Skipping... CR513_054313-RA not in gff. Skipping... CR513_041416-RA not in gff. Skipping... CR513_008590-RA not in gff. Skipping... CR513_033279-RA not in gff. Skipping... CR513_005956-RA not in gff. Skipping... CR513_026691-RA not in gff. Skipping... CR513_062788-RA not in gff. Skipping... CR513_053932-RA not in gff. Skipping... CR513_027424-RA not in gff. Skipping... CR513_053566-RA not in gff. Skipping... CR513_011251-RA not in gff. Skipping... CR513_049638-RA not in gff. Skipping... CR513_000519-RA not in gff. Skipping... CR513_005957-RA not in gff. Skipping... CR513_046940-RA not in gff. Skipping... CR513_039535-RA not in gff. Skipping... CR513_056914-RA not in gff. Skipping... CR513_035949-RA not in gff. Skipping... CR513_013395-RA not in gff. Skipping... CR513_010073-RA not in gff. Skipping... CR513_053536-RA not in gff. Skipping... 
CR513_060870-RA not in gff. Skipping... CR513_018447-RA not in gff. Skipping... CR513_060821-RA not in gff. Skipping... CR513_058036-RA not in gff. Skipping... CR513_027234-RA not in gff. Skipping... CR513_056337-RA not in gff. Skipping... CR513_051825-RA not in gff. Skipping... CR513_037530-RA not in gff. Skipping... CR513_049548-RA not in gff. Skipping... CR513_031308-RA not in gff. Skipping... CR513_051618-RA not in gff. Skipping... CR513_045163-RA not in gff. Skipping... CR513_000874-RA not in gff. Skipping... CR513_061449-RA not in gff. Skipping... CR513_013913-RA not in gff. Skipping... CR513_042075-RA not in gff. Skipping... CR513_058719-RA not in gff. Skipping... CR513_025295-RA not in gff. Skipping... CR513_050589-RA not in gff. Skipping... CR513_032314-RA not in gff. Skipping... CR513_009444-RA not in gff. Skipping... CR513_018579-RA not in gff. Skipping... CR513_025631-RA not in gff. Skipping... CR513_054058-RA not in gff. Skipping... CR513_030347-RA not in gff. Skipping... CR513_041558-RA not in gff. Skipping... CR513_009450-RA not in gff. Skipping... CR513_052078-RA not in gff. Skipping... CR513_051107-RA not in gff. Skipping... CR513_063172-RA not in gff. Skipping... CR513_045741-RA not in gff. Skipping... CR513_049231-RA not in gff. Skipping... CR513_059135-RA not in gff. Skipping... CR513_004595-RA not in gff. Skipping... CR513_020648-RA not in gff. Skipping... CR513_014654-RA not in gff. Skipping... CR513_019488-RA not in gff. Skipping... CR513_014575-RA not in gff. Skipping... CR513_020649-RA not in gff. Skipping... CR513_014492-RA not in gff. Skipping... CR513_024347-RA not in gff. Skipping... CR513_025056-RA not in gff. Skipping... CR513_002906-RA not in gff. Skipping... CR513_063021-RA not in gff. Skipping... CR513_000398-RA not in gff. Skipping... CR513_000880-RA not in gff. Skipping... CR513_019794-RA not in gff. Skipping... CR513_000934-RA not in gff. Skipping... CR513_021286-RA not in gff. Skipping... CR513_001911-RA not in gff. Skipping... 
CR513_023996-RA not in gff. Skipping... CR513_024346-RA not in gff. Skipping... CR513_030135-RA not in gff. Skipping... CR513_022757-RA not in gff. Skipping... CR513_040923-RA not in gff. Skipping... CR513_033835-RA not in gff. Skipping... CR513_032311-RA not in gff. Skipping... CR513_052942-RA not in gff. Skipping... CR513_053406-RA not in gff. Skipping... CR513_048457-RA not in gff. Skipping... CR513_058112-RA not in gff. Skipping... CR513_000445-RA not in gff. Skipping... CR513_003643-RA not in gff. Skipping... CR513_030397-RA not in gff. Skipping... CR513_049640-RA not in gff. Skipping... CR513_048113-RA not in gff. Skipping... CR513_003404-RA not in gff. Skipping... CR513_006312-RA not in gff. Skipping... CR513_005602-RA not in gff. Skipping... CR513_030942-RA not in gff. Skipping... CR513_058718-RA not in gff. Skipping... CR513_062390-RA not in gff. Skipping... CR513_061554-RA not in gff. Skipping... CR513_055079-RA not in gff. Skipping... CR513_030943-RA not in gff. Skipping... CR513_042676-RA not in gff. Skipping... CR513_005968-RA not in gff. Skipping... CR513_030100-RA not in gff. Skipping... CR513_025812-RA not in gff. Skipping... CR513_029245-RA not in gff. Skipping... CR513_041474-RA not in gff. Skipping... CR513_028450-RA not in gff. Skipping... CR513_046207-RA not in gff. Skipping... CR513_005568-RA not in gff. Skipping... CR513_018864-RA not in gff. Skipping... CR513_002518-RA not in gff. Skipping... CR513_057116-RA not in gff. Skipping... CR513_059786-RA not in gff. Skipping... CR513_054749-RA not in gff. Skipping... CR513_041011-RA not in gff. Skipping... CR513_022337-RA not in gff. Skipping... CR513_025442-RA not in gff. Skipping... CR513_056049-RA not in gff. Skipping... CR513_013040-RA not in gff. Skipping... CR513_036690-RA not in gff. Skipping... CR513_047164-RA not in gff. Skipping... CR513_056184-RA not in gff. Skipping... CR513_029268-RA not in gff. Skipping... CR513_063165-RA not in gff. Skipping... CR513_045407-RA not in gff. Skipping... 
CR513_051544-RA not in gff. Skipping... CR513_056182-RA not in gff. Skipping... CR513_041439-RA not in gff. Skipping... CR513_002412-RA not in gff. Skipping... CR513_015477-RA not in gff. Skipping... CR513_000448-RA not in gff. Skipping... CR513_037696-RA not in gff. Skipping... CR513_047240-RA not in gff. Skipping... CR513_053257-RA not in gff. Skipping... CR513_032208-RA not in gff. Skipping... CR513_029269-RA not in gff. Skipping... CR513_000864-RA not in gff. Skipping... CR513_011980-RA not in gff. Skipping... CR513_019795-RA not in gff. Skipping... CR513_016853-RA not in gff. Skipping... CR513_000434-RA not in gff. Skipping... CR513_045768-RA not in gff. Skipping... CR513_049991-RA not in gff. Skipping... CR513_007567-RA not in gff. Skipping... CR513_016282-RA not in gff. Skipping... CR513_025669-RA not in gff. Skipping... CR513_051590-RA not in gff. Skipping... CR513_024349-RA not in gff. Skipping... CR513_063177-RA not in gff. Skipping... CR513_028118-RA not in gff. Skipping... CR513_018732-RA not in gff. Skipping... CR513_032628-RA not in gff. Skipping... CR513_036754-RA not in gff. Skipping... CR513_023321-RA not in gff. Skipping... CR513_058018-RA not in gff. Skipping... CR513_034551-RA not in gff. Skipping... CR513_046793-RA not in gff. Skipping... CR513_012976-RA not in gff. Skipping... CR513_016567-RA not in gff. Skipping... CR513_052058-RA not in gff. Skipping... CR513_053036-RA not in gff. Skipping... CR513_003595-RA not in gff. Skipping... CR513_022537-RA not in gff. Skipping... CR513_002591-RA not in gff. Skipping... CR513_000872-RA not in gff. Skipping... CR513_011385-RA not in gff. Skipping... CR513_059951-RA not in gff. Skipping... CR513_010574-RA not in gff. Skipping... CR513_017436-RA not in gff. Skipping... CR513_051324-RA not in gff. Skipping... CR513_033462-RA not in gff. Skipping... CR513_031339-RA not in gff. Skipping... CR513_011420-RA not in gff. Skipping... CR513_007417-RA not in gff. Skipping... CR513_060362-RA not in gff. Skipping... 
CR513_006948-RA not in gff. Skipping... CR513_005369-RA not in gff. Skipping... CR513_063162-RA not in gff. Skipping... CR513_033700-RA not in gff. Skipping... CR513_023953-RA not in gff. Skipping... CR513_045728-RA not in gff. Skipping... CR513_000763-RA not in gff. Skipping... CR513_009404-RA not in gff. Skipping... CR513_052827-RA not in gff. Skipping... CR513_023382-RA not in gff. Skipping... CR513_001893-RA not in gff. Skipping... CR513_024110-RA not in gff. Skipping... CR513_003639-RA not in gff. Skipping... CR513_060363-RA not in gff. Skipping... CR513_021512-RA not in gff. Skipping... CR513_038904-RA not in gff. Skipping... CR513_003025-RA not in gff. Skipping... CR513_016607-RA not in gff. Skipping... CR513_000866-RA not in gff. Skipping... CR513_002094-RA not in gff. Skipping... CR513_012974-RA not in gff. Skipping... CR513_007162-RA not in gff. Skipping... CR513_056177-RA not in gff. Skipping... CR513_003226-RA not in gff. Skipping... CR513_006645-RA not in gff. Skipping... CR513_063170-RA not in gff. Skipping... CR513_034917-RA not in gff. Skipping... CR513_010563-RA not in gff. Skipping... CR513_030991-RA not in gff. Skipping... CR513_050539-RA not in gff. Skipping... CR513_060869-RA not in gff. Skipping... CR513_060220-RA not in gff. Skipping... CR513_000779-RA not in gff. Skipping... CR513_034426-RA not in gff. Skipping... CR513_052813-RA not in gff. Skipping... CR513_035115-RA not in gff. Skipping... CR513_024475-RA not in gff. Skipping... CR513_014353-RA not in gff. Skipping... CR513_057537-RA not in gff. Skipping... CR513_043338-RA not in gff. Skipping... CR513_063176-RA not in gff. Skipping... CR513_054930-RA not in gff. Skipping... CR513_054934-RA not in gff. Skipping... CR513_025296-RA not in gff. Skipping... CR513_057234-RA not in gff. Skipping... CR513_030453-RA not in gff. Skipping... CR513_040559-RA not in gff. Skipping... CR513_008674-RA not in gff. Skipping... CR513_031579-RA not in gff. Skipping... CR513_000409-RA not in gff. Skipping... 
CR513_050325-RA not in gff. Skipping... CR513_057642-RA not in gff. Skipping... CR513_053037-RA not in gff. Skipping... CR513_063167-RA not in gff. Skipping... CR513_001040-RA not in gff. Skipping... CR513_026063-RA not in gff. Skipping... CR513_031582-RA not in gff. Skipping... CR513_017650-RA not in gff. Skipping... CR513_058198-RA not in gff. Skipping... CR513_049113-RA not in gff. Skipping... CR513_021325-RA not in gff. Skipping... CR513_018550-RA not in gff. Skipping... CR513_044600-RA not in gff. Skipping... CR513_000875-RA not in gff. Skipping... CR513_002589-RA not in gff. Skipping... CR513_060185-RA not in gff. Skipping... CR513_050379-RA not in gff. Skipping... CR513_008585-RA not in gff. Skipping... CR513_009911-RA not in gff. Skipping... CR513_005831-RA not in gff. Skipping... CR513_042105-RA not in gff. Skipping... CR513_061448-RA not in gff. Skipping... CR513_061543-RA not in gff. Skipping... CR513_002110-RA not in gff. Skipping... CR513_036679-RA not in gff. Skipping... CR513_012987-RA not in gff. Skipping... CR513_002111-RA not in gff. Skipping... CR513_030687-RA not in gff. Skipping... CR513_005365-RA not in gff. Skipping... CR513_011662-RA not in gff. Skipping... CR513_003591-RA not in gff. Skipping... CR513_063178-RA not in gff. Skipping... CR513_040288-RA not in gff. Skipping... CR513_025075-RA not in gff. Skipping... CR513_032528-RA not in gff. Skipping... CR513_050499-RA not in gff. Skipping... CR513_012975-RA not in gff. Skipping... CR513_028813-RA not in gff. Skipping... CR513_043866-RA not in gff. Skipping... CR513_008159-RA not in gff. Skipping... CR513_055073-RA not in gff. Skipping... CR513_027400-RA not in gff. Skipping... CR513_063171-RA not in gff. Skipping... CR513_021116-RA not in gff. Skipping... CR513_058083-RA not in gff. Skipping... CR513_059911-RA not in gff. Skipping... CR513_036680-RA not in gff. Skipping... CR513_004799-RA not in gff. Skipping... CR513_008651-RA not in gff. Skipping... CR513_000162-RA not in gff. Skipping... 
CR513_032689-RA not in gff. Skipping... CR513_039920-RA not in gff. Skipping... CR513_043045-RA not in gff. Skipping... CR513_023320-RA not in gff. Skipping...

I have attached the .gff and the blast output. Can you take a look at it?

Mucuna_combined_TRAILING_SEMIS_REMOVED_2.gff.zip muc1_snap2.blastp.zip

erikenbody commented 6 years ago

It looks like you renamed the features in the .gff (e.g. CR513_023320-RA), but used the original Maker output IDs (e.g. maker-contig_8151-snap-gene-0.6) when generating the blast report. So you could re-run blast with the renamed gff, or just run Annie with the original .gff that wasn't renamed (i.e. doesn't have the alias added). You probably want to just re-run blast with the new names so that you can integrate the results into the .gff later as needed (through Maker's helper scripts, for example).
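A quick way to confirm this kind of ID mismatch before running Annie is to compare the query IDs in the blast report against the mRNA IDs in the .gff. The sketch below is not part of Annie and rests on assumptions: that the blast report is tab-separated with the query ID in the first column, and that the relevant IDs are the `ID=` attributes of `mRNA` features. The function names and sample records are hypothetical.

```python
def blast_query_ids(blast_lines):
    """Collect query IDs from tabular BLAST lines (assumed: first column)."""
    ids = set()
    for line in blast_lines:
        line = line.strip()
        if line and not line.startswith("#"):
            ids.add(line.split("\t")[0])
    return ids


def gff_mrna_ids(gff_lines):
    """Collect the ID attribute of every mRNA feature in GFF3 lines."""
    ids = set()
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        # Comments, directives and malformed lines lack 9 columns
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        for field in cols[8].split(";"):
            key, sep, val = field.partition("=")
            if sep and key.strip() == "ID":
                ids.add(val.strip())
    return ids


def missing_from_gff(blast_lines, gff_lines):
    """Return blast query IDs that have no matching mRNA ID in the GFF."""
    return sorted(blast_query_ids(blast_lines) - gff_mrna_ids(gff_lines))


# Made-up records: the blast report and the GFF use different naming schemes
blast = ["newname_0001-RA\tsp|X|Y\t90.0\n", "newname_0002-RA\tsp|X|Z\t85.0\n"]
gff = [
    "##gff-version 3\n",
    "contig_1\tmaker\tmRNA\t1\t100\t.\t+\t.\tID=oldname-mRNA-1;Parent=oldname\n",
    "contig_1\tmaker\tmRNA\t200\t300\t.\t+\t.\tID=newname_0002-RA;Parent=g2\n",
]
print(missing_from_gff(blast, gff))  # ['newname_0001-RA']
```

With real data, the two lists can be replaced by open file handles, e.g. `missing_from_gff(open(blast_path), open(gff_path))`. If nearly every blast ID is reported as missing, the two files were most likely produced with different naming schemes.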