sestaton / HMMER2GO

Annotate DNA sequences for Gene Ontology terms
MIT License

map2gaf problem #26

Open Estevelag opened 3 years ago

Estevelag commented 3 years ago

Hello Evan. Thank you for creating this software; it has been really helpful. I'm having an issue with the map2gaf command and I don't know why. The GAF file is empty, and the log produced while it was created is:

Use of uninitialized value $go_mappings[4] in split at /usr/local/share/perl/5.26.1/HMMER2GO/Command/map2gaf.pm line 93, <$in> line 38952.

(This message repeats for every input line number.)

The command line I input is: hmmer2go map2gaf -i genes_orfs_Pfam-A_GO_GOterm_mapping.tsv -o genes_orfs_Pfam-A_GO_GOterm_mapping.gaf -s 'Cirriformia mooreii'

I have the GO file and the Pfam2go file in my workspace because I'm doing this locally; for some reason I can't download files from the internet.

Thank you for any help you could give me.

Esteban Velásquez Agudelo

sestaton commented 3 years ago

Hello Esteban,

If you are able to get this GO file you can pass it to the command as an option:

hmmer2go map2gaf -i genes_orfs_Pfam-A_GO_GOterm_mapping.tsv -o genes_orfs_Pfam-A_GO_GOterm_mapping.gaf -s 'Cirriformia mooreii' -g go.obo

That would be one part. The error suggests that the input file may be empty or malformed. Can you confirm that genes_orfs_Pfam-A_GO_GOterm_mapping.tsv in the command above is non-empty? If there is data in the file, I will have to take a look at the format. If it is empty, we will have to back up a step and see what might have gone wrong in a previous step.
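A quick way to answer both questions (is the file empty, and what shape are its rows) is to count the tab-separated fields per line. The following is a minimal sketch, not part of HMMER2GO; the helper names `column_counts` and `describe` are hypothetical, and it assumes the file is tab-delimited UTF-8 text.

```python
from collections import Counter


def column_counts(lines):
    """Count how many non-blank lines have each number of tab-separated fields."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if line:  # ignore blank lines
            counts[len(line.split("\t"))] += 1
    return counts


def describe(path):
    """Print a short report for one file (hypothetical helper, not a HMMER2GO command)."""
    with open(path, encoding="utf-8") as handle:
        counts = column_counts(handle)
    if not counts:
        print(f"{path}: empty")
    for n_fields, n_lines in sorted(counts.items()):
        print(f"{path}: {n_lines} line(s) with {n_fields} tab-separated field(s)")
```

Calling `describe("genes_orfs_Pfam-A_GO_GOterm_mapping.tsv")` would report either an empty file or the field counts found, which is usually enough to tell whether the rows have the shape a downstream step expects.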

Thanks, Evan

robpade commented 3 years ago

I have the same issue as Esteban. I got the GO file you suggested and passed it with the -g option. There was no improvement. Looking at my genes_orfs_Pfam-A_GO_GOterm_mapping.tsv file, output from "hmmer2go mapterms", each row looks valid, with the first column having an ID such as "JSWQ01000006.1" and the second column having one or more comma-separated GO terms such as "GO:0022857,GO:0055085,GO:0016020".

Like Esteban, I get an empty GAF file and a comparable error for each entry in genes_orfs_Pfam-A_GO_GOterm_mapping.tsv.
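One possible reading of the warning: it names `$go_mappings[4]`, the fifth element of an array. Assuming (this is not confirmed from the map2gaf.pm source) that the array comes from splitting each input line on tabs, a two-column row like the one described above has no fifth field, so there is nothing to split. A small Python analogue of that situation, with a hypothetical helper `field_at`:

```python
def field_at(line, index, sep="\t"):
    """Return the field at `index`, or None when the line has too few fields."""
    fields = line.rstrip("\r\n").split(sep)
    return fields[index] if index < len(fields) else None


# Two-column row in the shape described above: ID, then comma-separated GO terms.
two_column = "JSWQ01000006.1\tGO:0022857,GO:0055085,GO:0016020"

print(field_at(two_column, 1))  # the GO term list
print(field_at(two_column, 4))  # None: there is no fifth field
```

If that assumption holds, a valid-looking two-column file would still trigger the warning on every line, which matches the behavior reported here.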

hunterkwalt commented 2 years ago

I am having this same issue. I am using the -g option with the GO file you suggested and my mapping files look correct. I am getting the same error and an empty GAF file. Running it on Docker.

Thanks, Hunter

sestaton commented 2 years ago

Please try the latest version (v0.18.1), which should resolve this issue. I will leave this issue open for now.

hunterkwalt commented 2 years ago

This did not fix my issue. I installed v0.18.1 on my system and got the same error:

Use of uninitialized value $go_mappings[4] in split at /usr/local/share/perl/5.26.1/HMMER2GO/Command/map2gaf.pm line 93, <$in> line 1.

Above is the error and it shows up for every line.

Thanks, Hunter

sestaton commented 2 years ago

@hunterkwalt, can you share a sample of the input file or reproduce it with public data?

I would like to take a look further to understand what is causing the issues.

hunterkwalt commented 2 years ago

XP_014255174.1 PF00012 HSP70 GO:ATP binding GO:0005524 Hsp70 protein
XP_014255174.1 PF00012 HSP70 GO:ATPase activity GO:0016887 Hsp70 protein
XP_014253279.1 PF00041 fn3 GO:protein binding GO:0005515 Fibronectin type III domain
XP_014253277.1 PF00041 fn3 GO:protein binding GO:0005515 Fibronectin type III domain
XP_014253278.1 PF00041 fn3 GO:protein binding GO:0005515 Fibronectin type III domain
XP_014254093.1 PF00067 p450 GO:iron ion binding GO:0005506 Cytochrome P450
XP_014254094.1 PF00067 p450 GO:iron ion binding GO:0005506 Cytochrome P450
XP_014254092.1 PF00067 p450 GO:iron ion binding GO:0005506 Cytochrome P450

This is a sample of my input file after mapping GO terms.

Also, I don't know if this has anything to do with it, but I already had translated sequences so I did not run the getorf step in the beginning of the analysis.

Thanks, Hunter

sestaton commented 2 years ago

@hunterkwalt, please show the commands you used to generate the file and what specific command caused the error.

I ran a full example with Arabidopsis, and also used the example you pasted. In both cases I got the expected output with no error messages. The only change I made to your example was to convert the text from the comment back to the tab-delimited format the program generates. There should not be any differences between the Docker and standard install, but I will test that as well.

I will follow up to see what is causing the errors and get a fix uploaded as soon as we identify the issue.

leorippel commented 2 years ago

I was having the same issue. I tried a bunch of different things and finally found the cause, at least in my case:

In your tutorial, the example input is genes_orfs_Pfam-A_GO_GOterm_mapping.tsv

hmmer2go map2gaf -i **genes_orfs_Pfam-A_GO_GOterm_mapping.tsv** -o genes_orfs_Pfam-A_GO_GOterm_mapping.gaf -s 'Helianthus annuus'

When I run map2gaf with genes_orfs_Pfam-A_GO.tsv as the input instead, it works fine.

!gaf-version: 2.1
! File generated by HMMER2GO (v0.18.2): https://github.com/sestaton/HMMER2GO
! Date generated on: 20220504
! Generated from GO ontology format version: 1.2
! Generated from GO ontology data version: 2022-03-22
!===========================================================================
Pfam LOC100806630 LOC100806630 GO:0004930 IEA F G protein-coupled receptor activity |gene taxon:3847 20220504 Pfam
Pfam LOC100802104 LOC100802104 GO:0004930 IEA F G protein-coupled receptor activity |gene taxon:3847 20220504 Pfam
Pfam LOC100798776 LOC100798776 GO:0004930 IEA F G protein-coupled receptor activity |gene taxon:3847 20220504 Pfam
Pfam LOC100801421 LOC100801421 GO:0004930 IEA F G protein-coupled receptor activity |gene taxon:3847 20220504 Pfam

Maybe you should correct that in your tutorial:

"This last command will create two output files:

genes_orfs_Pfam-A_GO.tsv, and genes_orfs_Pfam-A_GO_GOterm_mapping.tsv"

OR AM I GETTING A WRONG gaf FILE????
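For anyone wanting to sanity-check a GAF file like the one above, here is a minimal sketch. The GAF 2.1 specification defines 17 tab-separated columns per annotation line and uses `!` for header/comment lines; whether HMMER2GO's output has exactly 17 fields is not shown in this thread, so the width is left as a parameter. The function names `split_gaf` and `check_gaf` are hypothetical.

```python
def split_gaf(lines):
    """Separate GAF header/comment lines (starting with '!') from annotation records."""
    header, records = [], []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("!"):
            header.append(line)
        else:
            records.append(line.split("\t"))
    return header, records


def check_gaf(lines, expected_fields=17):
    """Return (declared version or None, record count, records with unexpected width)."""
    header, records = split_gaf(lines)
    version = None
    for line in header:
        if line.startswith("!gaf-version:"):
            version = line.split(":", 1)[1].strip()
            break
    odd = [rec for rec in records if len(rec) != expected_fields]
    return version, len(records), odd
```

With a file handle opened on the .gaf output, `check_gaf(handle)` returns the declared version, the number of annotation records, and any records whose field count differs from the expected width. An empty record list would correspond to the "empty GAF file" symptom reported earlier in the thread.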

sestaton commented 2 years ago

Thank you for the great comment, @leorippel! Sorry for your troubles. The Demo page is up to date, but it seems the Tutorial is not.

I will combine and update the wiki pages.

sestaton commented 2 years ago

The filename has been updated on the tutorial page, so the documentation should now be in sync with the latest version.