Closed adinapoli-mndc closed 5 years ago
:+1: A good list! Though some of this might be heavily influenced by decisions of the Graph API. Additionally you could list switching to a multi-edge graph as outlined in the data model document.
Fixed by https://github.com/oscoin/osrank-rs/pull/59 .
Once #33 will be completed, I think this will be a good time for us to take stock and trying to see if we can refactor
algorithm.rs
. There are a number of things which are currently a bit clunky in that module:We are not using (not even attempting to) the
GraphAlgorithm
trait we spec-ed out in the end2end document. I am not saying we necessarily should, but trying to actually implement a concrete instance for it (which usesosrank_naive
under the hood) would be a good testbed to understand whether or not theGraphAlgorithm
trait has a leg to stand on. Otherwise, that's just a design exercise in a vacuum without any practical application, and it would be even misleading for people trying to implement the storage-specific parts of it (like @kim);We are using that slightly weird
get_weight
andfrom_osrank
closures that I do wonder how long will be before they will bite us in the back;We are not trying to validate the ideas of the GraphAPI Working Group by trying to have a
GraphDataWriter
trait that modifies the graph metadata at the end of the whole process;Our
Graph
trait is currently quite heavy (in includes 11 methods and counting) which means we cannot define other implementations for it. If theGraph
trait was modelled similar to the one sketched in the working group by @cloudhead , we would be able to implement aimpl Graph for GraphView
whereGraphView
could just be a set of references to some nodes, we could later to traverse the graph efficiently (with no copy needed).My tentative proposal would be the following:
We give it a shot to bring our current
graph.rs
traits closer to what the working group have created so far; as there is a bit of uncertainty on the entire business of layers and/or theGraphObject
, we don't need to get it right immediately; probably starting by having aGraph
,GraphWriter
andGraphDataWriter
should be enough;Introduce the concept of a
GraphView
which is just a set of references to some node ids, plus an immutable reference to the underlying graph, so that we can easily implements things likesubgraph_by_nodes
without cloning;Try to bring into the mix the concept of
GraphAlgorithm
andMutableGraphAlgorithm
. Ideally the seed set phase would yield aGraphView
and can be implemented as aGraphAlgorithm
, whereas the actualosrank_naive
would need to be aMutableGraphAlgorithm
that also depends on aGraphDataWriter
of some sort;A note on RandomWalks
I have realised at some point this week, while I was working on the
RandomWalks
new abstraction, that for this data structure we have to copy the node ids. This is because we cannot rely on the lifetime of the inputGraph
G
, because in the real world(TM) we would be given some form of serialised-from-diskRandomWalks
type, and therefore this needs to own its content: not just from an ownership perspective, but also because by the time we receive this data structure as input, the network topology might have changed: new nodes might have been added, and stale nodes (still referenced by the RandomWalks) might be gone, therefore we cannot rely on the Rust lifetime system here.Not that I am writing this, I am also thinking that perhaps
RandomWalks
should be somehow abstracted over as a trait, because it's unlikely that the storage & protocol teams will depend on our own type: we will probably be given some form of iterator or some more specific data structure and we will have to either perform the conversion on-the-fly, or useRandomWalks
as it is now only for tests.@MeBrei thoughts?