getnelson / quiver

A reasonable library for modeling multi-graphs in Scala
http://verizon.github.io/quiver/
Apache License 2.0
5 stars 3 forks source link

Quiver and banana-rdf #7

Open bblfish opened 3 years ago

bblfish commented 3 years ago

I wrote up some initial thoughts on Quiver and banana-rdf. RDF is a graph based W3C standard, and banana-rdf a scala implementation that ties a number of different Java implementations together, as well as providing its own implementation.

It would be interesting to share ideas on the subject with folks with experience using Quiver.

keynmol commented 3 years ago

Hey @bblfish, not really related to your proposal, but I saw you had problems compiling quiver - can you share more details?

I did the latest round of upgrades, and was hoping to do Scala 3 at some point as well, but 2.12 and 2.13 should work.

bblfish commented 3 years ago

Hi @keynmol,

Good to hear people are still working on quiver :-) (To tell the truth Banana-rdf was also on standby for quite some time)

I am on master at commit f4f60d1968f119a3043e55161718f9c9b16d10fb .

This is what I get:

> ─cosy@bblfish.local ~/Programming/Scala3/quiver  ‹master›
╰─➤  sbt
[info] welcome to sbt 1.3.13 (Homebrew Java 16.0.2)
[info] loading global plugins from /Users/cosy/.sbt/1.0/plugins
[info] loading settings for project quiver-build from plugins.sbt ...
[info] loading project definition from /Volumes/Dev/Programming/Scala3/quiver/project
[info] loading settings for project quiver from version.sbt,build.sbt ...
[info] set current project to quiver (in build file:/Volumes/Dev/Programming/Scala3/quiver/)
[info] sbt server started at local:///Users/cosy/.sbt/1.0/server/e7f0ab5c8e497e6a5a57/sock
[quiver] λ compile
[info] Updating
[info] Resolved  dependencies
[warn]
[warn]  Note: Unresolved dependencies path:
[error] stack trace is suppressed; run last docs / update for the full output
[error] (docs / update) sbt.librarymanagement.ResolveException: Error downloading org.tpolecat:tut-core_2.13:0.6.13
[error]   Not found
[error]   Not found
[error]   not found: /Users/cosy/.ivy2/local/org.tpolecat/tut-core_2.13/0.6.13/ivys/ivy.xml
[error]   not found: https://repo1.maven.org/maven2/org/tpolecat/tut-core_2.13/0.6.13/tut-core_2.13-0.6.13.pom
[error]   download error: Caught java.io.IOException: Server returned HTTP response code: 403 for URL: https://dl.bintray.com/tpolecat/maven/org/tpolecat/tut-core_2.13/0.6.13/tut-core_2.13-0.6.13.pom (Server returned HTTP response code: 403 for URL: https://dl.bintray.com/tpolecat/maven/org/tpolecat/tut-core_2.13/0.6.13/tut-core_2.13-0.6.13.pom) while downloading https://dl.bintray.com/tpolecat/maven/org/tpolecat/tut-core_2.13/0.6.13/tut-core_2.13-0.6.13.pom
[error] Total time: 2 s, completed 24 Aug 2021, 16:53:11

It looks like tut-core is no longer available.

keynmol commented 3 years ago

huh, didn't know Rob deleted binaries as well!

Anyways, https://github.com/getnelson/quiver/pull/8 should have a much better state

bblfish commented 3 years ago

Thanks a lot @keynmol. Those changes have made it much easier for me to browse the code in IntelliJ.

The main structures as I understand are the following.

 /** An implementation of an inductive graph where nodes of type
 * `N `are labeled with `A`, and edges are labeled with `B`. */
case class Graph[N,A,B](rep: GraphRep[N,A,B]) { ... }
type GraphRep[N,A,B] = Map[N, GrContext[N,A,B]]
case class GrContext[N,A,B](inAdj: Map[N, Set[B]],
                            label: A,
                            outAdj: Map[N, Set[B]])

The GrContext indexes the outgoing arrows from a node: N with label: A, by the node that is the target in outAdj or the source in inAdj. So to follow outAdj: Map[N, Set[B]] one would want to know the target node, and knowing that one can easily find the set of arrows Set[B] between those two nodes.

Mhh. This seems very much like Quiver is thinking in terms of HomSets.

In RDF, if one is taking a node centric, data exploration approach (as banana-rdf does), one would mostly index by the label of the arrow B, as those being URIs, have some globally accessible meaning, which software can use. So that is what we find in Plantain.Graph

case class Graph[S, P, O](spo: Map[S, Map[P, Vector[O]]], size: Int) { ...}

Here S is the Subject type, P the predicate (arrow) type and O the object type. (One can set it up so that S and O are the same too).

Note that this way of following relations is close to OO thinking, which is especially visible if the nodes have the same type. This relation between OO and Coalgebras was made very clear by Bart Jacobs inn a series of articles that came out in the 1990s such as Objects and Classes Coalgebraically. A coalgebra on a polynomial endo-functor F and a type A is a morphism X -> F[X]. When F has an exponential X^A that allows one to change the state by applying an A to get a new state X (those are setters). Here the environment has to enter an A to switch to the next state, and that is best represented by an arrow. See also recent work on the category of polynomial functors (tweet thread here).

Actually, RDF Databases allow users to set any indexes they want SPO, SOP, POS, etc... It is just that for following links, searching forward is the most common, backward less, and searching for all (known) relations between two objects is very rare.

Ideally one would be able to set the indexes one wants, and even better, the software would work out the indexes from the queries used in a program.

But this difference in indexing does not seem to be a big deal. It could be changed, and I guess even parameterized.