ActiveTriples / linked-data-fragments

Basic linked data fragments endpoint.
Creative Commons Zero v1.0 Universal

Linked Data Fragments Query Result format discussion: #40

Open scande3 opened 7 years ago

scande3 commented 7 years ago
  1. It appears there is no way to have Rails not turn a "?" into a url parameter. Would the following make sense as a query pattern that follows normal SPARQL syntax and matches all labels that start with "Las"?

http://localhost:3000/?q=?o ?s "^Las*"

  2. For case-insensitive string matching, does this rely on having (?i) at the beginning of the string, is it a header field that can be sent, or does it just assume case-insensitive matches?

http://localhost:3000/?q=?o ?s "(?i)^Las*"

  3. For the Repository in-memory interface, how does one do a SPARQL query against that graph? Is there an example @no-reply can link me to on how to do case-insensitive string matching for the Repository interface? I may just be looking in the wrong place for that type of example.

  4. For the current subject resolution, should that remain the same? Or, to be more consistent with the new query interface, should it be:

http://localhost:3000/?s=<subject uri>
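
For the case-sensitivity question, here is a minimal Ruby sketch of how the `(?i)` prefix behaves if the quoted string is handed to Ruby's own `Regexp` engine. That is an assumption made purely for illustration: the thread does not settle which engine or backend would interpret the pattern, and a SPARQL backend would apply its own `regex()` semantics. The sample labels are made up. The sketch also shows that `^Las*`, read as a regular expression, means "La followed by zero or more s", which is broader than "starts with Las".

```ruby
# Assumption: the pattern string is evaluated by Ruby's Regexp engine.
# The labels are invented sample data.
labels = ["Las Vegas", "las cruces", "Lake Tahoe", "Atlas"]

case_sensitive = Regexp.new("^Las")
inline_flag    = Regexp.new("(?i)^Las")                 # flag embedded in the string
option_flag    = Regexp.new("^Las", Regexp::IGNORECASE) # flag passed separately
regex_star     = Regexp.new("^Las*")                    # "s*" is "zero or more s"

sensitive_hits = labels.select { |label| case_sensitive.match?(label) }
inline_hits    = labels.select { |label| inline_flag.match?(label) }
option_hits    = labels.select { |label| option_flag.match?(label) }
star_hits      = labels.select { |label| regex_star.match?(label) }

puts sensitive_hits.inspect
puts inline_hits.inspect
puts star_hits.inspect
```

If the intent is a prefix wildcard rather than a regular expression, the trailing `*` would need to be translated (for example to `.*`) before the pattern reaches a regex engine.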

no-reply commented 7 years ago

It appears there is no way to have Rails not turn a "?" into a url parameter.

Is there a reason we want to avoid this? Wouldn't that break with the URL/URI/IRI specs?

For the Repository in-memory interface, how does one do a SPARQL query against that graph?

Requiring sparql will extend RDF::Queryable to make all queryables accept a SPARQL Algebra Operator. See: http://www.rubydoc.info/github/ruby-rdf/sparql/RDF/Queryable#query-instance_method for details on that.

The cleanest interface is documented at: http://www.rubydoc.info/github/ruby-rdf/sparql/SPARQL#execute-class_method

scande3 commented 7 years ago

Is there a reason we want to avoid this? Wouldn't that break with the URL/URI/IRI specs?

The main reason is just that we haven't been using URL parameters in the application thus far. For example, to resolve a URI, one does: http://localhost:3000/http://dbpedia.org/resource/Berlin?format=jsonld

I wanted to keep the query interface consistent with the subject-resolving endpoint. Hence my last question: if we are going to use URL parameters for the query, should the subject resolver now be: http://localhost:3000/?s=http://dbpedia.org/resource/Berlin&format=jsonld

Thanks for the clarification on how to do SPARQL queries against a Repository endpoint!

gkellogg commented 7 years ago

IMHO, values of query parameters should be URI-escaped, for exactly this reason. The server, when extracting the parameters, would then URI-unescape them to get the actual query.
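
A minimal sketch of that escape/unescape round trip, using only Ruby's standard `uri` library. The parameter name `q` and the localhost URL come from the examples earlier in this thread. The manual decoding step is shown only for illustration, since a framework such as Rails normally does it when building its parameter hash.

```ruby
require "uri"

# The raw query pattern from the first comment, including "?" and quotes.
query = '?o ?s "^Las*"'

# Client side: escape the value before placing it in the URL.
escaped = URI.encode_www_form_component(query)
url = "http://localhost:3000/?q=#{escaped}"

# Server side: parse the query string and unescape the values.
params = URI.decode_www_form(URI.parse(url).query).to_h
recovered = params["q"]

puts url
puts recovered
```

Because the "?" characters inside the value are percent-encoded, only the first "?" in the URL is read as the start of the query string.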

no-reply commented 7 years ago

The main reason is just that we haven't been using URL parameters in the application thus far. For example, to resolve a URI, one does: http://localhost:3000/http://dbpedia.org/resource/Berlin?format=jsonld

This is not my understanding of the current implementation. Don't we do something more like: http://localhost:3000/?subject=http://dbpedia.org/resource/Berlin?format=jsonld? See: https://github.com/ActiveTriples/linked-data-fragments/blob/master/app/controllers/subject_controller.rb#L24

If the goal is just to support sparql syntax, I think we could probably mount a SPARQL endpoint at /sparql, either by mounting the existing SPARQL Sinatra service, or by adapting it slightly to run on Rails natively.

I'm still inclined to support something like a strict lucene subset, but am working on figuring out client side needs for my project now.

scande3 commented 7 years ago

This is not my understanding of the current implementation.

My sample URL there is from the README of this project. You are correct that your example would also work, but it wasn't the preferred URL format at this point. Hence the discussion on changing the documented way to resolve URIs so that it relies on URL parameters, to be more consistent if the query interface requires them.

If the goal is just to support sparql syntax, I think we could probably mount a SPARQL endpoint at /sparql.

I believe the goal is to support Linked Data Fragments, not SPARQL. The Linked Data Fragments application needs to translate the fragments to a query the underlying cache layer can understand. Unless there is a Linked Data Fragments compliant endpoint that supports fuzzy string matching for Marmotta and Blazegraph that I am unaware of, that means turning the queries into SPARQL that those caching layers understand.

In the future, we may have backends that know neither SPARQL nor Linked Data Fragments and that require a different translation. Furthermore, the point of Linked Data Fragments is to reduce complexity and allow for better caching of queries on the server side. That said, I'm not opposed to a SPARQL passthrough endpoint in addition to the Linked Data Fragments interface.

no-reply commented 7 years ago

I believe the goal is to support Linked Data Fragments, not SPARQL.

Can you clarify what you mean by this? Is there a Fragment spec you have in mind? I understood that this search discussion was targeted at defining such a spec.

If we're translating to SPARQL under the hood, I think we need to be really specific about what complexity is reduced by introducing a new syntax; SPARQL Construct is already defined as a fragment and implemented.

Are there use cases here that aren't covered by extending either SPARQL Construct or Triple Pattern Fragments to support lucene style fuzzy searches?

mjsuhonos commented 7 years ago

I believe every example of triple pattern fragment queries I have seen has used the following URI syntax (note the URI escaped parameter values):

http://data.linkeddatafragments.org/?subject=&predicate=&object=%22DBpedia+2014%22.

However, the TPF spec provides some alternative examples. I believe unspecified values (i.e. an "open match" pattern like ?s) are simply left empty.
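
As a sketch of the URI syntax quoted above, the following builds such a request URL with Ruby's standard `uri` library. The helper name `tpf_url` is hypothetical, and leaving unbound positions empty follows the belief stated in this comment rather than a reading of the spec.

```ruby
require "uri"

# Hypothetical helper: build a triple pattern fragment request URL.
# Unbound positions are sent as empty parameter values (an assumption
# taken from this comment, not verified against the TPF spec).
def tpf_url(base, subject: nil, predicate: nil, object: nil)
  query = URI.encode_www_form(
    subject: subject.to_s,
    predicate: predicate.to_s,
    object: object.to_s
  )
  "#{base}?#{query}"
end

url = tpf_url("http://data.linkeddatafragments.org/", object: '"DBpedia 2014"')
puts url
```

The parameter values are escaped by `URI.encode_www_form`, so the quotes and the space in the literal do not need to be encoded by hand.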

As for fuzzy string matching, you may want to look at the interfaces beyond TPF -- in particular, Substring Filtering for Low-Cost Linked Data Interfaces has a discussion of suggested free text search fragment syntax.

no-reply commented 7 years ago

In the current implementation, the base hydra:Dataset advertises a variable ?subject (see: Hydra: Templated Links). The alternative syntaxes pointed to by @mjsuhonos could also be specified in similar templates.

Unfortunately, the Substring Filtering paper's proposed syntax is just ?substring=, and neither it nor the Hydra Core vocabulary seems to offer us a nice way of doing substring retrieval in combination with pattern matching.

I'm looking at this closer, but for now, the syntax I would propose is:

http://example.com/dataset/{?subject}{?predicate}{?objectQuery}

as in:

http://example.com/dataset/?predicate=skos:prefLabel&objectQuery=*a%20lucene%20phrase*

or:

http://example.com/dataset/search/{subject}/{predicate}/{object}

where all the variables support some kind of Lucene-like phrase, as in:

http://example.com/dataset/search/*/skos:prefLabel/*a%20lucene%20phrase*

It's even less clear to me how to express this latter example in the Hydra Core Vocab.
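
To make the path-style proposal above concrete, here is a rough Ruby sketch of how a server might read such a path. Everything in it is an assumption for illustration: the helper names are hypothetical, a bare `*` is treated as an unbound position, `*` inside a phrase is treated as "any characters", and matching is case-insensitive. Real Lucene query syntax is richer than this, and a subject given as a full URI would need its slashes percent-encoded for the path split to work.

```ruby
require "uri"

# Hypothetical helper: turn a wildcard phrase such as "*a lucene phrase*"
# into an anchored, case-insensitive Regexp ("*" becomes ".*").
def wildcard_to_regexp(phrase)
  parts = phrase.split("*", -1).map { |part| Regexp.escape(part) }
  Regexp.new("\\A#{parts.join('.*')}\\z", Regexp::IGNORECASE)
end

# Hypothetical helper: split /dataset/search/{subject}/{predicate}/{object}
# into its three positions. A bare "*" (or a missing segment) means unbound.
def parse_search_path(path)
  segments = path.sub(%r{\A.*/search/}, "").split("/", 3)
  subject, predicate, object =
    segments.map { |segment| URI.decode_www_form_component(segment) }
  unbound = ->(value) { value.nil? || value == "*" }
  {
    subject:   unbound.call(subject) ? nil : subject,
    predicate: unbound.call(predicate) ? nil : predicate,
    object:    unbound.call(object) ? nil : wildcard_to_regexp(object)
  }
end

pattern = parse_search_path("/dataset/search/*/skos:prefLabel/*a%20lucene%20phrase*")
puts pattern.inspect
```

A backend that speaks SPARQL could then translate the object pattern into a `regex()` filter, while other backends could use their own text index.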

hackartisan commented 7 years ago

Note that @no-reply's Feb 10 comment documents the approach we all agreed on. @scande3 pushed his work-in-progress branch: https://github.com/ActiveTriples/linked-data-fragments/tree/feature/ldf_query_experimental. Anyone interested in continuing this work could start from there. He commented that this branch "has an updated README. I tested it again locally so let me know if there are problems if you try it. Note that it gave me an error in Ruby 2.4 for some reason so I switched back to my older Ruby 2.2.x version".