nodeSolidServer / node-solid-server

Solid server on top of the file-system in NodeJS
https://solidproject.org/for-developers/pod-server

Support SPARQL GET requests as documented on solid-spec #962

Closed NoelDeMartin closed 3 years ago

NoelDeMartin commented 5 years ago

In the solid-spec repository it is documented that it's possible to use SPARQL to retrieve content as an alternative to globbing. Since this is not supported in node-solid-server yet, I'm opening this issue to keep track of this feature.

Source: https://github.com/solid/solid-spec/blob/master/api-rest.md#alternative-using-sparql
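For illustration, a minimal sketch of how a client might build such a request URL. It assumes the query travels URL-encoded in a `query` parameter, as the SPARQL 1.1 Protocol does for queries sent via GET; the pod URL and the helper name `build_sparql_get_url` are hypothetical and not taken from this repository.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_sparql_get_url(resource_url: str, sparql: str) -> str:
    """Append a SPARQL query to a resource URL as a `query` parameter."""
    # Use "&" when the URL already carries a query string, "?" otherwise.
    separator = "&" if urlsplit(resource_url).query else "?"
    # quote_via=quote percent-encodes spaces as %20 instead of "+".
    return resource_url + separator + urlencode({"query": sparql}, quote_via=quote)


url = build_sparql_get_url(
    "https://pod.example/data/",  # hypothetical pod URL
    "SELECT * WHERE { ?s ?p ?o . }",
)
print(url)
```

A server that supported this would answer the single GET with the query solutions; whether it does depends on the server, which is exactly what this issue tracks.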

scenaristeur commented 5 years ago

Hi @NoelDeMartin, it seems possible with rdfjs lib / solid-auth-client https://forum.solidproject.org/t/fun-fact-using-sparql-to-query-the-type-registry/776

NoelDeMartin commented 5 years ago

I think it's possible as per the specification, but it isn't implemented on this project yet. This was confirmed by one of the maintainers in this comment: https://github.com/solid/node-solid-server/issues/956#issuecomment-440455759

Have you tried what's outlined on that forum thread using this repository as a server?

jeff-zucker commented 5 years ago

In the meantime, rdflib implements some SPARQL on the client side which can be used against data stored in Pods. See SPARQL fiddle for an API to it, or look at its guts to see examples of calling rdflib's SPARQL methods.

kidehen commented 5 years ago

You can use a SPARQL Query to crawl any Solid Pod. Remember, a Solid Pod comprises folders and files that are all described using RDF.

curl -ILH "Accept: text/turtle" https://kidehen3.solid.openlinksw.com:8444/public/CoolStuff/
HTTP/1.1 200 OK
X-Powered-By: OpenLink Solid Server
Vary: Accept, Authorization, Origin
Access-Control-Allow-Credentials: true
Access-Control-Expose-Headers: Authorization, User, Location, Link, Vary, Last-Modified, ETag, Accept-Patch, Accept-Post, Updates-Via, Allow, WAC-Allow, Content-Length, WWW-Authenticate, On-Behalf-Of, webid-tls
Allow: OPTIONS, HEAD, GET, PATCH, POST, PUT, DELETE
Link: <.acl>; rel="acl", <.meta>; rel="describedBy", <http://www.w3.org/ns/ldp#Container>; rel="type", <http://www.w3.org/ns/ldp#BasicContainer>; rel="type"
WAC-Allow: user="read",public="read"
MS-Author-Via: SPARQL
Updates-Via: wss://kidehen3.solid.openlinksw.com:8444
Content-Type: text/turtle; charset=utf-8
Content-Length: 2
ETag: W/"2-nOO9QiTIwXgNtWtBJezz8kv3SLc"
set-cookie: connect.sid=s%3Ac8xrSRXw44hr86A6C3u4gl3j5qouBxLo.YkW00xobWfsXXEIjfaFRBdDqTtG%2BhGXcelCoUP7t3ro; Domain=.solid.openlinksw.com; Path=/; Expires=Tue, 09 Apr 2019 18:07:16 GMT; HttpOnly; Secure
Date: Mon, 08 Apr 2019 18:07:16 GMT
Connection: keep-alive

Thus, you can use a SmallData-pattern to crawl any Solid Pod: simply pick your starting point. If you want to use SPARQL, then your chosen SPARQL Query Service needs to support dereferencing of URI-variables and URI-constants in the query body as part of the solution production pipeline.
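As a rough sketch of how a crawler could use the `Link` header shown above to decide whether a URL is a container worth descending into: the parser below is deliberately simplified (it assumes each entry has the form `<target>; rel="relation"` with a single quoted relation), and the function names are hypothetical.

```python
import re

LDP_CONTAINER_TYPES = {
    "http://www.w3.org/ns/ldp#Container",
    "http://www.w3.org/ns/ldp#BasicContainer",
}

# Matches one `<target>; rel="relation"` entry of a Link header.
LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def link_relations(link_header: str) -> dict[str, list[str]]:
    """Group the link targets of a Link header by their rel value."""
    relations: dict[str, list[str]] = {}
    for target, rel in LINK_ENTRY.findall(link_header):
        relations.setdefault(rel, []).append(target)
    return relations


def is_container(link_header: str) -> bool:
    """Return True if any rel="type" link names an LDP container type."""
    types = link_relations(link_header).get("type", [])
    return any(t in LDP_CONTAINER_TYPES for t in types)


# Link header copied from the response above.
header = (
    '<.acl>; rel="acl", <.meta>; rel="describedBy", '
    '<http://www.w3.org/ns/ldp#Container>; rel="type", '
    '<http://www.w3.org/ns/ldp#BasicContainer>; rel="type"'
)
print(is_container(header))
```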

SPARQL Query example using our URIBurner Service (live Virtuoso Instance) 👍

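## Remove the comment marker from the DEFINE line below to sponge afresh

# DEFINE input:grab-all "yes"

PREFIX ldp: <http://www.w3.org/ns/ldp#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?o
WHERE {
  <https://kidehen3.solid.openlinksw.com:8444/public/CoolStuff/> ldp:contains ?o .
  OPTIONAL { ?o owl:sameAs ?o2 }
}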

Examples:

  1. Live Query Results Page for one of my Folders
  2. Query Definition Page

NoelDeMartin commented 5 years ago

@kidehen I don't know what a SmallData-pattern is. Do you mean that there are some services (like URIBurner) that allow you to execute SPARQL queries on other servers? If so, I think the problem is still the same, because the service will still be querying the server and making a lot of requests anyway. I guess the goal of using SPARQL on a Solid pod is to get whatever information is needed in a single request.
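To make the request-count concern concrete, here is a minimal sketch of a breadth-first crawl over containers. The `list_contained` callback is hypothetical: it stands in for one HTTP GET that returns the `ldp:contains` targets of a container. The in-memory `POD` dictionary and the rule that only URLs ending in `/` are containers are assumptions made for the example.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(
    start: str, list_contained: Callable[[str], Iterable[str]]
) -> tuple[list[str], int]:
    """Visit every resource reachable from `start`; count container listings."""
    seen = {start}
    queue = deque([start])
    request_count = 0
    while queue:
        url = queue.popleft()
        if not url.endswith("/"):
            continue  # assumption: only URLs ending in "/" are containers
        request_count += 1  # one listing request per container
        for child in list_contained(url):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen), request_count


# Hypothetical pod layout standing in for real HTTP responses.
POD = {
    "https://pod.example/public/": [
        "https://pod.example/public/notes/",
        "https://pod.example/public/card.ttl",
    ],
    "https://pod.example/public/notes/": [
        "https://pod.example/public/notes/a.ttl",
        "https://pod.example/public/notes/b.ttl",
    ],
}

resources, request_count = crawl(
    "https://pod.example/public/", lambda url: POD.get(url, [])
)
print(len(resources), "resources found with", request_count, "listing requests")
```

This only counts the container listings; reading the content of each document would add one more request per document, whereas a server-side query could in principle answer in one round trip.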

kidehen commented 5 years ago

@NoelDeMartin,

URIBurner is just a live instance of Virtuoso with the Sponger Middleware Module enabled.

I am going to write a post about Small Data. Fundamentally, it is about leveraging the power of Linked Data for progressively assembling the data required for a query solution by dereferencing URI-variables and URI-constants in the body of a SPARQL Query.

This is what Linked Data has always been about i.e., rather than loading massive datasets into a DBMS and exposing access via a SPARQL Query Service endpoint.

It just so happens that this massive load and sparql endpoint combo approach has dominated the landscape for years, unfortunately!

The same thing applies to "Big Data", where the challenges posed by data volume, velocity, and variety are addressed by distributing database files (ORC, Avro, Parquet, CSV, etc.) over network file systems and then mapping their content to N-tuple relations (e.g., using Hive, Presto, etc.) en route to SQL access.

The key point regarding Solid is that it gives you a pod whose content can be crawled using SPARQL, as I've demonstrated for a while.

Related:

  1. Semantic Web Client which also understands the Small Data pattern

  2. Tabulator (now the default Data Browser for all Solid Pods) which has always understood the Small Data pattern

  3. What is the Virtuoso Sponger Middleware about, and why is it important?

michielbdejong commented 3 years ago

Can this be closed?

michielbdejong commented 3 years ago

@NoelDeMartin closing as discussed in today's Solid OS meeting

NoelDeMartin commented 3 years ago

For anyone wondering, we decided to close this because SPARQL GET has been removed from the spec. So it doesn't need to be implemented anymore.