This PR fixes some existing issues with processing SIF input networks. It also extends the ways in which custom SIF input networks can be specified with the --network_source argument as follows:
sif: In this case, the input SIF is assumed to contain only gene-gene
relations. GO annotations for genes, as well as relations between
GO terms, are added automatically by GeneWalk.
sif_annot: In this case, the input SIF is assumed to contain both
gene-gene relations and GO annotations for genes (i.e., rows where the
source is a gene and the target is a GO term). Relations between
GO terms are then added automatically by GeneWalk.
sif_full: In this case, the input SIF is assumed to contain all
relations, including gene-gene relations, GO annotations for genes,
and relations between GO terms. GeneWalk doesn't add any further
edges.
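To make the difference between the three modes concrete, here is a minimal sketch (not GeneWalk's actual code) that classifies the rows of a SIF file by edge kind and reports which kinds each mode expects the user to supply. It rests on several assumptions: rows are comma-separated `source,relation,target` triples, GO terms are recognizable by a `GO:` prefix, and the helper names (`edge_kind`, `count_edge_kinds`, `expected_kinds`) and relation labels are hypothetical, chosen only for illustration.

```python
import csv
import io
from collections import Counter

# Edge kinds that GeneWalk adds on its own for each --network_source value,
# as described in the list above. The dictionary itself is illustrative only.
ADDED_BY_GENEWALK = {
    "sif": {"gene-GO", "GO-GO"},
    "sif_annot": {"GO-GO"},
    "sif_full": set(),
}

ALL_KINDS = {"gene-gene", "gene-GO", "GO-GO"}


def is_go_term(node):
    """Assumption: GO terms are written with a 'GO:' prefix, e.g. GO:0005525."""
    return node.startswith("GO:")


def edge_kind(source, target):
    """Classify one SIF row by the types of its source and target nodes."""
    src_go, tgt_go = is_go_term(source), is_go_term(target)
    if src_go and tgt_go:
        return "GO-GO"
    if not src_go and tgt_go:
        return "gene-GO"
    if not src_go and not tgt_go:
        return "gene-gene"
    return "other"  # GO term as source with a gene as target


def count_edge_kinds(sif_text, delimiter=","):
    """Count edge kinds in SIF text; rows without exactly 3 fields are skipped."""
    counts = Counter()
    for row in csv.reader(io.StringIO(sif_text), delimiter=delimiter):
        if len(row) != 3:
            continue
        source, _relation, target = (field.strip() for field in row)
        counts[edge_kind(source, target)] += 1
    return counts


def expected_kinds(network_source):
    """Edge kinds the user's SIF is assumed to contain for a given mode."""
    return ALL_KINDS - ADDED_BY_GENEWALK[network_source]
```

For example, a file whose counts include only `gene-gene` edges fits `sif`, while a file that also carries `gene-GO` rows would fit `sif_annot`. A check like this could be run on a custom network before choosing the mode.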
As before, the SIF file is still expected to represent human genes using their HGNC symbols, so in itself, this change doesn't yet allow running GeneWalk on non-human networks, though it's an important step towards that.
The PR also adds a number of new tests covering workflows with user-supplied SIF networks.