Closed SilasK closed 5 years ago
Hi @SilasK,
That is correct. The original plan was simply to parse the long-format output from assign_genome_properties.pl, and this is what the code can currently do.
However, I noticed that the long-format file only contains results for lower-level genome properties such as Systems and Pathways, not for all types of genome properties in the tree (e.g. Categories). See the diagram below:
Since my visualization software uses all levels of genome properties, I had to write code to do my own assignments for higher-level properties.
My assignment code can be found in this file: https://github.com/Micromeda/pygenprop/blob/master/pygenprop/results.py
Specifically, the following functions:
These functions could potentially be used to assign genome property results from InterProScan output.
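As a rough illustration of what assigning higher-level properties involves, here is a minimal sketch. It is not the code in results.py: the combination rule (YES if every child is YES, PARTIAL if at least one child is YES or PARTIAL, otherwise NO), the tree layout, and all property names are assumptions made only for this example.

```python
def combine(child_results):
    """Combine child assignments into a parent assignment (assumed rule, see above)."""
    if child_results and all(result == "YES" for result in child_results):
        return "YES"
    if any(result in ("YES", "PARTIAL") for result in child_results):
        return "PARTIAL"
    return "NO"


def assign_tree(property_id, children, known, assignments):
    """Recursively assign a property from its children, reusing known results."""
    if property_id in known:
        # Lower-level result already present in the long-format output.
        assignments[property_id] = known[property_id]
    else:
        child_results = [
            assign_tree(child, children, known, assignments)
            for child in children.get(property_id, [])
        ]
        assignments[property_id] = combine(child_results)
    return assignments[property_id]


# Hypothetical tree: parent identifier -> child identifiers.
children = {
    "category_root": ["category_a", "pathway_3"],
    "category_a": ["pathway_1", "pathway_2"],
}
# Hypothetical lower-level results, as they might be parsed from the long-format file.
known = {"pathway_1": "YES", "pathway_2": "NO", "pathway_3": "YES"}

assignments = {}
assign_tree("category_root", children, known, assignments)
```

With these made-up inputs, both category_a and category_root come out as PARTIAL, because each has at least one child that is not YES.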
To make assignments directly from InterProScan results, one would need to do the following. Note: this is based on my basic understanding of the Perl code in assign_genome_properties.pl. I still need to reverse engineer it further to understand it better.
Parse the InterProScan.tsv file and extract the column with InterPro identifiers.
[...] assign_genome_properties.pl
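The extraction step mentioned above could look roughly like the sketch below. It assumes the standard InterProScan 5 TSV layout, where the InterPro accession is the 12th column and is only present when InterPro lookup was enabled. The function name and the example rows are made up for illustration and are not part of pygenprop.

```python
import csv
import io

# Column 12 (index 11) of InterProScan 5 TSV output holds the InterPro accession.
INTERPRO_COLUMN = 11


def extract_interpro_ids(handle):
    """Return the set of InterPro accessions found in an InterProScan TSV handle."""
    identifiers = set()
    # QUOTE_NONE keeps quote characters in descriptions from confusing the parser.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) > INTERPRO_COLUMN:
            accession = row[INTERPRO_COLUMN].strip()
            # Rows without an InterPro match have an empty or "-" field.
            if accession and accession != "-":
                identifiers.add(accession)
    return identifiers


# Made-up rows; the second has no InterPro columns at all.
example = (
    "prot1\tmd5\t100\tPfam\tPF00001\tdesc\t1\t50\t1e-5\tT\t11-12-2018\tIPR000001\tdesc\n"
    "prot2\tmd5\t200\tTIGRFAM\tTIGR00001\tdesc\t1\t90\t1e-9\tT\t11-12-2018\n"
)
interpro_ids = extract_interpro_ids(io.StringIO(example))
```

For a real file, the same function can be called with `open("InterProScan.tsv", newline="")` in place of the `io.StringIO` handle.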
@SilasK Prototype code is here: https://github.com/Micromeda/pygenprop/blob/assign_from_interpro_scan/prototype_assign_from_interproscan.ipynb
Looks like there are some anomalies. I am investigating.
Great, I will look at it. I found out that assign_genome_properties.pl uses only the member database annotations (Pfam, TIGRFAM, ...) and apparently not the InterPro IDs.
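If the member database annotations are what matters, they can be collected from the same TSV file. The sketch below assumes the standard InterProScan 5 layout (4th column is the analysis name, 5th column is the signature accession). The function name, the default analysis names, and the example rows are assumptions for illustration; the analysis labels in particular can differ between InterProScan versions.

```python
import csv
import io
from collections import defaultdict

ANALYSIS_COLUMN = 3   # column 4: member database name, e.g. Pfam
SIGNATURE_COLUMN = 4  # column 5: signature accession, e.g. PF00001


def collect_signatures(handle, analyses=("Pfam", "TIGRFAM")):
    """Group signature accessions by member database for the chosen analyses."""
    signatures = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) > SIGNATURE_COLUMN and row[ANALYSIS_COLUMN] in analyses:
            signatures[row[ANALYSIS_COLUMN]].add(row[SIGNATURE_COLUMN])
    return dict(signatures)


# Made-up rows; the SMART hit is ignored because it is not in `analyses`.
example = (
    "prot1\tmd5\t100\tPfam\tPF00001\tdesc\t1\t50\t1e-5\tT\t11-12-2018\n"
    "prot1\tmd5\t100\tSMART\tSM00001\tdesc\t1\t50\t1e-5\tT\t11-12-2018\n"
    "prot2\tmd5\t200\tTIGRFAM\tTIGR00001\tdesc\t1\t90\t1e-9\tT\t11-12-2018\n"
)
signatures = collect_signatures(io.StringIO(example))
```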
On Tue, Dec 11, 2018, 02:46 Lee Bergstrand wrote:
> Looks like there are some anomalies. I am investigating.
@SilasK Completed in https://github.com/Micromeda/pygenprop/pull/33
Still on the develop branch. I'm going to be working on documentation.
Summary can be found here.
There are some differences; however, these are due to assign_genome_properties.pl not working correctly.
If I understand your code correctly, you can parse the long-format output of assign_genome_properties.pl from the genome properties, but there is no script to infer the genome properties directly from the output of InterProScan.