Open pchampin opened 4 years ago
For Oxigraph I have built a SPARQL parser and a SPARQL algebra representation.
Algebra: https://github.com/Tpt/oxigraph/blob/master/lib/src/sparql/algebra.rs
Parser: https://github.com/Tpt/oxigraph/blob/master/lib/src/sparql/sparql_grammar.rustpeg
Parser invocation: https://github.com/Tpt/oxigraph/blob/master/lib/src/sparql/parser.rs
It might be interesting to build it as a separate crate and make Sophia and Oxigraph depend on it, just like Rio.
The parser is a bit slow at the moment; I am planning to rewrite it using a more efficient parsing library, probably nom. But I plan to make a working 0.1 release of Oxigraph first.
Thanks @Tpt for chiming in.
It might be interesting to build it as a separate crate and make Sophia and Oxigraph depend on it, just like Rio.
I have considered this. But as I mentioned above, in sophia it would be more natural to reuse the graph::Graph to represent basic graph patterns, so this might not be the smoothest way to go... I'm still open to ideas, though. The more work we can mutualize, the better.
The more work we can mutualize, the better.
Huge +1. We could maybe have the parser in a separate crate with a fairly cheap algebra representation. Then Sophia could expose an easy-to-use algebra tree on top of it and Oxigraph could build its query plans from it.
Another way to go would be to have an "rdf-api" crate similar to what RDF/JS is doing for the RDF models and its common extensions, and have Oxigraph, Sophia and hopefully the other RDF-related libraries in Rust use it. But it might be hard to build a nice and efficient API without GAT.
Another way to go would be to have an "rdf-api" crate similar to what RDF/JS
This should probably be discussed in a separate thread. I created #23 for this. And yes, GAT would be a huge help in this direction.
Isaac Newton invented calculus while quarantined, so I guess I can write a SPARQL parser? I've used the oxigraph library quite a bit, and I like it, but the slowest part of it is its parser (acknowledged by @Tpt). I guess I can just start writing one and then ask for feedback? I'd like to parse into a common AST. I guess using the oxigraph algebra is sufficient?
@dwhitney that would be awesome... :-)
Sophia has evolved quite a bit in the meantime, in order to be more usable as a common API for RDF in Rust. @Tpt and I have agreed (in a discussion offline) that a good way forward would be to extract oxigraph's SPARQL parser and AST into a separate crate, using sophia's Term type as a building block.
FYI, the Term type is currently being refactored (#47, #48, #49). Once this is finished (Literals still need to be done), I plan to extract it into a separate crate sophia_term, so that crates using it, such as this new SPARQL crate, would not end up importing the whole of sophia.
@dwhitney Great! Thank you! I have made some changes to the parser that have significantly improved its speed (migration to peg 0.6 and avoiding duplicate parsing).
Haven't had as much time as I'd like to look at this (still working from home). I found this parser. Have either of you taken a look at it? https://github.com/mattsse/nom-sparql
@dwhitney I was not aware of this parser, thanks! There seems to be no code related to testing the parser against the W3C testsuite, so I don't know how correct it is.
Yeah if your parser's performance has increased enough, perhaps there is no need to invest the time in a new one, but before you made your improvements, parsing was often the slowest part of the query by several orders of magnitude.
I was also thinking of implementing a SPARQL parser with nom before I found https://github.com/mattsse/nom-sparql which was mentioned by @dwhitney.
It says it's a WIP in the README and there has been no development for more than a year. I wonder whether @mattsse would be willing to adapt it to fit into sophia, allowing it, via abstraction, to reuse some types and code. If he only meant it as a WIP which he doesn't want to maintain, maybe he'd be willing to have it adapted and included in sophia.
A question for @pchampin is whether having nom as a dependency is acceptable.
Hi there 👋
It's been a while since I've worked on it. It was a small side project I only hacked on for a few weeks. Unfortunately I never got it to a state that I was pleased with and felt good about publishing, so I moved on... 🙈
Currently I've got some time on my hands and if that crate could be useful for sophia I'd be willing to adapt/donate it, so long as nom as a dependency is acceptable.
There seems to be no code related to testing the parser against the W3C testsuite
I wasn't aware that there is a test suite. If you can point me to where I can find it, I'd be happy to test against it @Tpt
fwiw the parser should already be feature complete(ish), so most of the work would probably be
- Display for every type, so that a roundtrip (string -> sparql -> string) is supported
- nom 0.6. (is still in alpha)
I'm not the owner of sophia so take this with a grain of salt.
The aim of sophia is to provide a common API for RDF in Rust (#23), therefore, it is not intended to include a parser in sophia (the current implementations are more or less artefacts from before the split into several crates). A more fitting approach would be to develop a SPARQL API for sophia, i.e. a bunch of traits, base types and core functionality, so that third party crates, like nom-sparql, can implement a parser against this API. In the end this should allow users to pick the parser that fits their needs best. In addition, this means that an implementation of a SPARQL engine is not required to include a parser.
@yever, @mattsse Nice to see new people working on sophia and its ecosystem :+1:
@mattsse You can find out about the SPARQL test suite here: https://www.w3.org/2009/sparql/docs/tests/README.html
@mattsse The recent versions of the test suite are here: https://github.com/w3c/rdf-tests/tree/gh-pages/sparql11 I use this repository as a git submodule in Oxigraph in order to get quick feedback (<1s for the full SPARQL test suite). Here is my testsuite evaluation code: https://github.com/oxigraph/oxigraph/blob/master/testsuite/src/sparql_evaluator.rs It also supports query and update evaluation tests.
Oxigraph already has Display implementations: https://github.com/oxigraph/oxigraph/blob/master/lib/src/sparql/algebra.rs via the Sparql* structs (the default display prints the algebra notation). During testing I check that the round trip (parse -> serialize -> parse) returns the same tree.
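The round-trip check described above can be illustrated on a much smaller scale. The following is only a sketch under stated assumptions: `Term`, `TriplePattern` and `parse_pattern` are hypothetical names invented for this example, and they cover a single triple pattern rather than Oxigraph's actual algebra or grammar.

```rust
use std::fmt;

/// Hypothetical, minimal term type: only variables and IRIs.
#[derive(Debug, Clone, PartialEq)]
pub enum Term {
    Var(String),
    Iri(String),
}

/// Hypothetical AST node for one triple pattern.
#[derive(Debug, Clone, PartialEq)]
pub struct TriplePattern {
    pub s: Term,
    pub p: Term,
    pub o: Term,
}

impl fmt::Display for Term {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Term::Var(name) => write!(f, "?{name}"),
            Term::Iri(iri) => write!(f, "<{iri}>"),
        }
    }
}

impl fmt::Display for TriplePattern {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} {} {} .", self.s, self.p, self.o)
    }
}

/// Parses one whitespace-delimited token into a term.
fn parse_term(tok: &str) -> Option<Term> {
    if let Some(name) = tok.strip_prefix('?') {
        if name.is_empty() {
            None
        } else {
            Some(Term::Var(name.to_string()))
        }
    } else {
        tok.strip_prefix('<')
            .and_then(|t| t.strip_suffix('>'))
            .map(|iri| Term::Iri(iri.to_string()))
    }
}

/// Parses exactly three terms followed by a separate "." token.
pub fn parse_pattern(input: &str) -> Option<TriplePattern> {
    let mut toks = input.split_whitespace();
    let s = parse_term(toks.next()?)?;
    let p = parse_term(toks.next()?)?;
    let o = parse_term(toks.next()?)?;
    match (toks.next(), toks.next()) {
        (Some("."), None) => Some(TriplePattern { s, p, o }),
        _ => None,
    }
}

/// Round-trip property: parsing the serialized tree gives the same tree back.
pub fn round_trips(input: &str) -> bool {
    match parse_pattern(input) {
        Some(tree) => parse_pattern(&tree.to_string()) == Some(tree),
        None => false,
    }
}
```

Comparing trees rather than strings means differences in whitespace between the input and the serialized form do not cause false failures.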
@yever asked
A question for @pchampin is whether having nom as a dependency is acceptable.
and @MattesWhite replied
The aim of sophia is to provide a common API for RDF in Rust (#23), therefore, it is not intended to include a parser in sophia.
To be more precise: the sophia_api crate aims to provide a common API. Other crates in the sophia repo are intended to provide some implementation of said API (e.g. sophia_term provides an implementation of the trait TTerm) but of course the goal is to keep the ecosystem open (e.g. Oxigraph is now implementing that API). Finally, the sophia crate is gradually becoming a "compilation" of other crates, including sophia_api and sophia_term. Eventually, the code it contains will move into more specialized crates (sophia_X), and the sophia crate itself will only be a bunch of pub use from those specialized crates.
Now regarding SPARQL support, the first step would be to add new traits in sophia_api, related to SPARQL management. Off the top of my head, I imagine
- a SparqlDataset trait (deriving from Dataset), providing a prepare_query method, returning a SparqlQuery;
- a SparqlQuery trait, providing an execute method, returning a SparqlResult;
- a SparqlResult trait, providing a number of methods for interacting with the different kinds of results SPARQL can produce (SELECT, CONSTRUCT/DESCRIBE, ASK).
Then one or several implementations of these traits could be provided. For Oxigraph, this would amount to simply adapting the existing types to the traits above. But a generic implementation of SparqlDataset, able to resolve queries against any type implementing Dataset, would be nice too... This one could benefit from the nom-based parser by @mattsse.
I hope this clarifies things.
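The three traits proposed above might look roughly like the following. This is a minimal sketch, not the API that ended up in sophia_api: the `Dataset` trait here is a one-method placeholder, `SparqlResult` is simplified to an enum instead of a trait, terms are plain strings, and `ToyDataset`, `ToyQuery` and `ONLY_SUPPORTED_QUERY` are hypothetical names. It also relies on a generic associated type, which was not yet stable in Rust when this thread was written.

```rust
use std::collections::HashMap;

/// Placeholder for sophia's `Dataset` trait (assumption: reduced to one method).
pub trait Dataset {
    fn triples(&self) -> &[[String; 3]];
}

/// One solution row: variable name -> bound value.
pub type Bindings = HashMap<String, String>;

/// The kinds of results SPARQL can produce (an enum here, for brevity).
#[derive(Debug, PartialEq)]
pub enum SparqlResult {
    /// SELECT
    Bindings(Vec<Bindings>),
    /// ASK
    Boolean(bool),
    /// CONSTRUCT / DESCRIBE
    Triples(Vec<[String; 3]>),
}

/// A prepared query, ready to be executed.
pub trait SparqlQuery {
    type Error;
    fn execute(&self) -> Result<SparqlResult, Self::Error>;
}

/// A dataset that can prepare SPARQL queries against itself.
pub trait SparqlDataset: Dataset {
    type Error;
    type Query<'a>: SparqlQuery<Error = Self::Error>
    where
        Self: 'a;

    fn prepare_query<'a>(&'a self, query: &str) -> Result<Self::Query<'a>, Self::Error>;
}

/// Hypothetical in-memory dataset used only to exercise the traits.
pub struct ToyDataset(pub Vec<[String; 3]>);

/// The toy implementation has no parser; it recognizes this one query string.
pub const ONLY_SUPPORTED_QUERY: &str = "SELECT * WHERE { ?s ?p ?o }";

impl Dataset for ToyDataset {
    fn triples(&self) -> &[[String; 3]] {
        &self.0
    }
}

pub struct ToyQuery<'a> {
    data: &'a ToyDataset,
}

impl SparqlQuery for ToyQuery<'_> {
    type Error = String;

    /// Returns every triple of the dataset as one row binding ?s, ?p and ?o.
    fn execute(&self) -> Result<SparqlResult, String> {
        let rows = self
            .data
            .triples()
            .iter()
            .map(|[s, p, o]| {
                HashMap::from([
                    ("s".to_string(), s.clone()),
                    ("p".to_string(), p.clone()),
                    ("o".to_string(), o.clone()),
                ])
            })
            .collect();
        Ok(SparqlResult::Bindings(rows))
    }
}

impl SparqlDataset for ToyDataset {
    type Error = String;
    type Query<'a> = ToyQuery<'a> where Self: 'a;

    fn prepare_query<'a>(&'a self, query: &str) -> Result<ToyQuery<'a>, String> {
        if query.trim() == ONLY_SUPPORTED_QUERY {
            Ok(ToyQuery { data: self })
        } else {
            Err(format!("unsupported query: {query}"))
        }
    }
}
```

In a real implementation, `prepare_query` is where a parser (peg-based, nom-based or otherwise) would plug in, which is why the traits themselves say nothing about how the query string is parsed.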
Thanks @pchampin. This fits my expectations. I noticed that sophia is gradually being modularized and I like this development.
I thought that the nom-based parser could maybe be included in the workspace (making it an optional dependency for users of sophia) because it was created as a side project and not yet published to crates.io.
I agree that creating the relevant traits for SPARQL would be a good first step that makes a lot of sense. In fact, Oxigraph and nom-sparql can be 2 integration test scenarios for these traits.
I'll try to see how far I can go with implementing these traits and raise a pull request if I have something presentable.
I have written a SPARQL parser with tree-sitter. Tree-sitter is very fast and has bindings for Rust. Maybe this is of use.
I have a very early implementation of a SPARQL engine for Sophia: https://github.com/pchampin/sophia_sparql
It should be integrated in v0.9 (but it might not be fully compliant by that time).
NB: since sophia uses a generalized RDF model (including variables), a Graph can also be used as a basic graph pattern. The query module contains a preliminary implementation of this idea.
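To illustrate the idea of using a graph that contains variables as a basic graph pattern, here is a small sketch. It is not sophia's query module: `Term`, `Triple`, `match_triple` and `solve` are hypothetical names, terms are reduced to IRIs and variables, and the join is a naive nested loop.

```rust
use std::collections::HashMap;

/// Generalized term: a pattern graph may contain variables.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Term {
    Iri(String),
    Var(String),
}

pub type Triple = [Term; 3];
pub type Bindings = HashMap<String, Term>;

pub fn iri(s: &str) -> Term {
    Term::Iri(s.to_string())
}

pub fn var(s: &str) -> Term {
    Term::Var(s.to_string())
}

/// Tries to extend `bindings` so that `pattern` matches `triple`.
fn match_triple(pattern: &Triple, triple: &Triple, bindings: &Bindings) -> Option<Bindings> {
    let mut out = bindings.clone();
    for (p, t) in pattern.iter().zip(triple.iter()) {
        if let Term::Var(name) = p {
            if let Some(bound) = out.get(name) {
                // Variable already bound: the data term must agree with it.
                if bound != t {
                    return None;
                }
            } else {
                out.insert(name.clone(), t.clone());
            }
        } else if p != t {
            // Non-variable pattern terms must match exactly.
            return None;
        }
    }
    Some(out)
}

/// Evaluates a pattern graph against a data graph, one triple pattern at a time.
pub fn solve(pattern: &[Triple], data: &[Triple]) -> Vec<Bindings> {
    let mut solutions = vec![Bindings::new()];
    for tp in pattern {
        let mut next = Vec::new();
        for b in &solutions {
            for t in data {
                if let Some(extended) = match_triple(tp, t, b) {
                    next.push(extended);
                }
            }
        }
        solutions = next;
    }
    solutions
}
```

The pattern and the data share the same `Triple` type, which is the point of the generalized model: no separate pattern representation is needed.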