lmrodriguezr / enveomics

Scripts and libraries for Environmental Genomics
http://enve-omics.ce.gatech.edu/enveomics

ogs.annotate.rb problem #23

Open blancaverag opened 7 years ago

blancaverag commented 7 years ago

ogs.annotate.rb does not annotate my OG matrix, but I do not know what the problem is. I am attaching my input files (OGs_matrix.txt, annotation.txt) and output file (OGs_matrix_annotated.txt). I am running the program as:

$> ogs.annotate.rb --in OGs_matrix.txt --out OGs_matrix_annotated.txt -a annotation.txt
Reading pre-computed OGs in 'OGs_matrix.txt'.
 Loaded OGs: 2171.
Warning: Cannot find 1859 genes from annotation in OG collection.
Saving annotated OGs into 'OGs_matrix_annotated.txt'.
Done.

The names of the genes are the same in OGs_matrix.txt and annotation.txt, and annotation.txt is tab delimited as required. What may I have missed?

Thank you!

annotation.txt OGs_matrix.txt OGs_matrix_annotated.txt

lmrodriguezr commented 7 years ago

Hello @blancaverag,

I'm sorry about this; I realize now that the documentation for this script is incomplete. The issue is that the annotation file must be named after the genome, so the script knows in which genome to look for those gene names. In your case, the annotation is for the SSL50 genome, so you should rename the annotation file to SSL50.txt:

$> mv annotation.txt SSL50.txt
$> ogs.annotate.rb --in OGs_matrix.txt --out OGs_matrix_annotated.txt -a SSL50.txt
Reading pre-computed OGs in 'OGs_matrix.txt'.
 Loaded OGs: 2171.
Warning: Cannot find 33 genes from SSL50 in OG collection.
Saving annotated OGs into 'OGs_matrix_annotated.txt'.
Done.

And this produces an output with annotated OGs based on SSL50. I'll leave this issue open for a while until I improve the documentation for the script, but you should be able to run it like this now.
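To illustrate why the file name matters, here is a minimal Ruby sketch. It is not the script's actual code. It assumes, based on the warnings above changing from "annotation" to "SSL50", that the genome name is the annotation file name without its extension. It also assumes the OG matrix is tab-delimited, with genome names in the header row, comma-separated gene names in each cell, and "-" for absent genes. The helpers `genome_name_from` and `missing_genes` and the toy gene names are hypothetical.

```ruby
# Hypothetical helpers for illustration only; not part of enveomics.

# Assumption: the genome name is the file name without directory or extension.
def genome_name_from(path)
  File.basename(path, ".*")
end

# Return the entries of `genes` that are absent from the column of `genome`
# in an OG matrix given as an array of tab-delimited lines.
# Assumptions: the first line holds genome names, each cell holds
# comma-separated gene names, and "-" marks a genome without genes in the OG.
def missing_genes(matrix_lines, genome, genes)
  header = matrix_lines.first.chomp.split("\t")
  col = header.index(genome)
  return genes.dup if col.nil? # unknown genome: no gene can be matched

  present = matrix_lines.drop(1).flat_map do |line|
    cell = line.chomp.split("\t")[col].to_s
    cell == "-" ? [] : cell.split(",")
  end
  genes - present
end

# Toy data standing in for OGs_matrix.txt and the annotated gene names.
matrix = [
  "SSL50\tOther\n",
  "geneA,geneB\tx1\n",
  "-\tx2\n",
  "geneC\t-\n"
]
genes = %w[geneA geneC geneZ]

# With the original file name, the inferred genome is not in the header,
# so every annotated gene is reported as missing.
p missing_genes(matrix, genome_name_from("annotation.txt"), genes)

# After renaming, only genes truly absent from the SSL50 column remain.
p missing_genes(matrix, genome_name_from("SSL50.txt"), genes)
```

Under these assumptions, the first call reports all three genes as missing and the second reports only `geneZ`. This mirrors the drop from 1859 to 33 unmatched genes seen above.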

Thanks for the report!

blancaverag commented 7 years ago

Hello @lmrodriguezr!

Thank you for your fast answer, and for the useful scripts! It works now.