BioJulia / GenomeGraphs.jl

A modern genomics framework for julia
https://biojulia.dev
Other
67 stars 8 forks source link

Typo in Links.jl #2

Closed ardakdemir closed 4 years ago

ardakdemir commented 5 years ago

I think "is_forward_link" function has a typo.

is_forward_link(l::SequenceGraphLink, n::NodeID) = source(l) == -n

'-n' should be replaced with 'n'.

I thought maybe there is a special reason behind this but could not figure it out.

Possible Solution / Implementation

I will push a new branch along with some other minor updates to the package.

ardakdemir commented 5 years ago

Seems to behave as expected after fixing the typo :

linktypo
ardakdemir commented 5 years ago

Pushed the changes to a new branch named 'gsoc' in a copied repository in:

"https://github.com/ardakdemir/my_bioseqgraph/tree/gsoc"

TransGirlCodes commented 5 years ago

Hi Arda, I'll double check my groups C++ implementation, but I don't think that's a bug.

In the generic sequence graph. Nodes have a sequence (always the canonical version of the sequence), a source end (+), and a sink end (-).

--------------------
| + | ATCGATCG | - |
--------------------

Links can connect nodes from either end. So a link could connect the sink of a node X to a source of node Y. A link could also connect a source of X to a source of Y, and so on.

Setting up nodes and links this way means it acts like a flow graph: As you move through a graph, (like in a breadth first search for example), you can go through a node forwards or backwards. So let's say you've been travelling through the graph, the last link took you from the (-) end of node Y, to the (+) end of node X. Now you are at node X, to continue going forward, you have to leave using the other end of X (the - end). Hence the -n in the function: Say X = 5, and I just arrived at X through the (+) end (+5,...or just 5 xD): then any links that take me forwards out of X must have a source of -5, as I need to leave using the (-) end of 5.

So the correct behaviour should be the following:

link = SequenceGraphLink(-1, 2, 0)

# I'm travelling through node 1 in the positive direction `(+)------->(-)`, (denoted as +1 or just 1), so does link let me continue forwards in the direction I was heading (i.e. is it a fw link)?
is_fw_link(link, 1)
# will be true

# I'm travelling through node 1 in the negative direction `(+)<------(-)`, (denoted as -1), so does `link` let me continue forwards in the direction I was heading (i.e. is it a fw link)?
is_fw_link(link, -1)
# will be false
ardakdemir commented 5 years ago

Dear Ben,

Thanks a lot for the detailed explanation. Maybe we can reconsider the type in the is_forward_link(l::SequenceGraphLink, n::NodeID) function as we will be calling it with integers that are not necessarily NodeID's if I am not mistaken.

TransGirlCodes commented 5 years ago

Currently, NodeID is just an alias for Int64. For our C++ code we came up with the convention that a NodeID is just an integer: and can be positive or negative, and both X and -X identify the same node, but the sign just gives some direction in cases where it is relevant such as with is_forward_link and when describing a path through the graph as a set of ids: [1, -20, 50, ...].

ardakdemir commented 5 years ago

I see, it is very helpful for me to get familiar with the conventions you have set at this stage of our GSoC project. Thanks a lot for your explanations.

ardakdemir commented 5 years ago

By the way, do you think it would be useful to add a NodeID field for the SequenceGraphNode Type defined in Nodes.jl ?

struct SequenceGraphNode{S <:Sequence}
    sequence::S
    active::Bool
end
TransGirlCodes commented 4 years ago

I'll close this issue as it's quite old now. Any fresh issues with the Graph API can be opened again. master branch and the gh-pages are getting detailed API docs of all the functions associated with the graph data type.