Closed rmflight closed 1 year ago
OK, just to make sure I did this right. Note, I did test this and made sure it works, but before committing and pushing, I figured I would double check the code fits what we discussed.
Here are my changes to main.py:
def main(args):
    if args['create_subgraphs']:
        gocats.create_subgraphs(args)
    elif args['categorize_dataset']:
        gocats.categorize_dataset(args)
    elif args['remap_goterms']:
        go_database = args['<go_database>']
        goa_gaf = args['<goa_gaf>']
        ancestor_filename = args['<ancestor_filename>']
        namespace_filename = args['<namespace_filename>']
        if args['--allowed_relationships']:
            allowed_relationships = args['--allowed_relationships'].split(",")
        else:
            allowed_relationships = ["is_a", "part_of", "has_part"]
        if args['--identifier_column']:
            identifier_column = int(args['--identifier_column'])
        else:
            identifier_column = 1
        gocats.remap_goterms(go_database, goa_gaf, ancestor_filename, namespace_filename, allowed_relationships, identifier_column)
And then in gocats.py:
def remap_goterms(go_database, goa_gaf, ancestor_filename, namespace_filename, allowed_relationships, identifier_column):
    """Reads in a Gene Ontology relationship file, and a Gene Annotation File (GAF), and
    follows the GOcats rules for allowed term-to-term relationships. Generates as output
    a new GAF, and a new term to ontology namespace mapping.

    :param go_database: the gene ontology dataset
    :param goa_gaf: the gene annotation file
    :param ancestor_filename: the output file containing new gene to ontology mappings
    :param namespace_filename: the output file containing the term to ontology mappings
    :param allowed_relationships: what term to term relationships will be considered (is_a,part_of,has_part)
    :param identifier_column: which column is being used for the gene identifiers (1)
    :return: None
    :rtype: :py:obj:`None`
    """
    if type(allowed_relationships) == str:
        allowed_relationships = allowed_relationships.split(",")
    if type(identifier_column) == str:
        identifier_column = int(identifier_column)
.... continuing, and using the variables instead of args
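The "if type(allowed_relationships) == str" test and the similar test on identifier_column are not needed in the function, since the API and CLI are cleanly separated: main.py already converts the command-line strings before calling remap_goterms.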
Besides this, it is perfect!
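To illustrate the separation described above, here is a minimal sketch, not the actual GOcats code. It assumes a docopt-style args dictionary like the one in main.py; parse_remap_args and remap_goterms_stub are hypothetical names introduced only for this example. The CLI layer does all string conversion, so the API-side function can take a list and an int without checking types.

```python
# Hypothetical sketch: names below are illustrative, not part of GOcats.
DEFAULT_RELATIONSHIPS = ["is_a", "part_of", "has_part"]


def parse_remap_args(args):
    """CLI layer: convert docopt-style string values into typed values."""
    raw_relationships = args.get("--allowed_relationships")
    raw_column = args.get("--identifier_column")
    if raw_relationships:
        allowed_relationships = raw_relationships.split(",")
    else:
        # Copy the default so callers cannot mutate the module-level list.
        allowed_relationships = list(DEFAULT_RELATIONSHIPS)
    identifier_column = int(raw_column) if raw_column else 1
    return allowed_relationships, identifier_column


def remap_goterms_stub(allowed_relationships, identifier_column):
    """API layer stand-in: receives a list and an int, so no str checks."""
    return {
        "allowed_relationships": allowed_relationships,
        "identifier_column": identifier_column,
    }


# Usage: the CLI entry point parses once, then calls the API function.
cli_args = {"--allowed_relationships": "is_a,part_of", "--identifier_column": "2"}
result = remap_goterms_stub(*parse_remap_args(cli_args))
```

With this layout, someone calling the API function directly from Python passes a list and an int, and only the CLI path ever sees strings.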
(Sent by email in reply to Robert M Flight's comment of Wed, Jun 14, 2023, 4:28 PM.)
I think it's done. I've tested it all, and everything seems to be working, and the docs are all up to date.
I've taken the build_ancestors script that was written for the statistical power manuscript (and which I use regularly), and added it as a sub-command of the gocats CLI. This way it's part of gocats itself, and only depends on installing the package.
I've compared the outputs of this to those of the build_ancestors script, and although things are output in different orders, the contents of the gene-term mapping are identical, as is the term-ontology mapping.
I have not updated the package documentation. I wanted to make sure the names of the CLI arguments I've chosen make sense. It was hard to figure out what the argument names of the other CLI sub-commands referred to, as I've not used the gocats CLI that much.
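For anyone who wants to repeat the comparison between the two outputs, an order-insensitive check can be sketched roughly as follows. This is only an illustration: it assumes both outputs have already been loaded into dictionaries mapping an identifier to a list of terms (the on-disk format is not described here), mappings_equal is a hypothetical helper, and the identifiers in the example are made up.

```python
def normalize_mapping(mapping):
    """Turn each list of terms into a frozenset so ordering is ignored."""
    return {key: frozenset(values) for key, values in mapping.items()}


def mappings_equal(first, second):
    """True if both mappings have the same keys and the same terms per key,
    regardless of key order or the order of terms within each list."""
    return normalize_mapping(first) == normalize_mapping(second)


# Made-up example data: same content, different ordering.
from_script = {"geneA": ["GO:1", "GO:2"], "geneB": ["GO:3"]}
from_cli = {"geneB": ["GO:3"], "geneA": ["GO:2", "GO:1"]}
same = mappings_equal(from_script, from_cli)
```

Note that converting to sets also ignores duplicate terms within a list; if duplicates matter, comparing sorted lists would be the stricter choice.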