Universal Service Description Ontology

jeswr commented 2 years ago

WARNING: I'm still in the process of fleshing this out. EDIT: added links suggested in the comments.

Pitch

Applications and APIs need a consistent way of advertising their capabilities and how to use them, so that clients can intelligently interoperate with these services and select the best way of using a service given the capabilities of the client.

For instance, in the time series research challenge - the server might advertise that it has a

SPARQL 1.1 endpoint, and,
A direct InfluxDB endpoint

The client can then request a document that contains information about which services the server provides. If the client directly supports the InfluxDB API, it can use that for faster access; otherwise it falls back to the SPARQL 1.1 endpoint and retrieves the timeseries data in a more verbose manner.
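A minimal sketch of that fallback, assuming the description document has already been fetched and parsed into a list of dictionaries. The identifiers `sparql-1.1` and `influxdb`, the `example.org` URLs and the `choose_service` helper are all hypothetical; the issue does not define a vocabulary yet.

```python
# Hypothetical service identifiers; no real vocabulary is defined in this issue.
SPARQL_11 = "sparql-1.1"
INFLUXDB = "influxdb"


def choose_service(advertised, client_preferences):
    """Return the first advertised service in the client's preference order, or None."""
    by_type = {service["type"]: service for service in advertised}
    for service_type in client_preferences:
        if service_type in by_type:
            return by_type[service_type]
    return None


# Stand-in for the parsed description document returned by the server.
advertised = [
    {"type": SPARQL_11, "endpoint": "https://example.org/sparql"},
    {"type": INFLUXDB, "endpoint": "https://example.org/influx"},
]

# A client that speaks InfluxDB prefers it; a SPARQL-only client falls back.
influx_client_choice = choose_service(advertised, [INFLUXDB, SPARQL_11])
sparql_client_choice = choose_service(advertised, [SPARQL_11])
```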

Further to this, there is a need to describe finer-grained capabilities, or to parameterise those capabilities (such as the time & network cost of each option - these values may also change over time depending on server load etc.). For instance, Comunica could incrementally specify each of the experimental SPARQL 1.2 protocols that it supports as those capabilities get added. It could also advertise that the SPARQL engine configuration supports things like reasoning, GeoSPARQL etc., and the client can then make use of the particular 'sub-services' it needs.
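To illustrate what parameterised capabilities might enable, here is a rough sketch in which each advertised option carries a time and a network cost and the client picks the cheapest option it supports. The `Offer` type, the cost units, the weights and the numbers are all made up for illustration; none of them come from an existing specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    service: str          # hypothetical service identifier
    time_cost: float      # assumed unit: estimated seconds
    network_cost: float   # assumed unit: estimated megabytes transferred


def cheapest_offer(offers, supported, time_weight=1.0, network_weight=1.0):
    """Pick the supported offer with the lowest weighted cost, or None."""
    usable = [offer for offer in offers if offer.service in supported]
    if not usable:
        return None
    return min(
        usable,
        key=lambda o: time_weight * o.time_cost + network_weight * o.network_cost,
    )


# Illustrative values only; a server could republish these as its load changes.
offers = [
    Offer("sparql-1.1", time_cost=4.0, network_cost=20.0),
    Offer("influxdb", time_cost=0.5, network_cost=2.0),
]
```

Because the costs may change with server load, a client would need to re-read the offers rather than cache its choice indefinitely.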

This should all be designed so that services can advertise a general service (such as SPARQL 1.1) or a more specific service (SELECT with BGP). When reasoning is applied, clients can then ask whether a server provides a more specific service (SELECT with BGP, SELECT with UNION), even if the server only advertised the general one.
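One way to picture the reasoning step is as a transitive "includes" relation between general and specific services. The sketch below is only an assumption about how that could work: the `INCLUDES` table, the intermediate `SELECT` level and both helper functions are hypothetical, and a real implementation would more likely rely on an RDFS/OWL reasoner over the ontology.

```python
# Hypothetical hierarchy: a general service includes more specific ones.
INCLUDES = {
    "SPARQL 1.1": {"SELECT"},
    "SELECT": {"SELECT with BGP", "SELECT with UNION"},
}


def provided_services(advertised):
    """Expand advertised services with everything they transitively include."""
    seen = set()
    stack = list(advertised)
    while stack:
        service = stack.pop()
        if service not in seen:
            seen.add(service)
            stack.extend(INCLUDES.get(service, ()))
    return seen


def provides(advertised, requested):
    """True if the requested service is advertised directly or by inclusion."""
    return requested in provided_services(advertised)
```

With this table, a server advertising only `SPARQL 1.1` answers yes to `SELECT with UNION`, while a server advertising only `SELECT with BGP` does not.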

As an extension, I think it would also be really nice for this to describe the semantics of different RDF syntaxes (basically extending the concept of machine-readable grammars like jison with the ability to describe how the grammar maps to triples) so that parsers can be completely auto-generated based on the spec.

Desired solution

An ontology, together with client/server (or, more generally, two communicating agents) implementations that use the ontology to negotiate their capabilities and find the best way of jointly performing an action. Such actions include querying, reasoning, querying with a particular reasoning profile applied to the data, etc.

Acceptance criteria

A published ontology for Universal Service Description
An agent implementation such that multiple agents can communicate using the terms in the ontology to find the best way of jointly performing an action
To tie this into an existing research goal, one outcome would be to allow a client and server to discover that they can perform Dialogical reasoning based on each of their individual capabilities https://github.com/SolidLabResearch/Challenges/issues/22.

Pointers

possibly related https://www.ripublication.com/ijaer17/ijaerv12n16_64.pdf
very related to this paper from a few years ago: https://linkeddatafragments.github.io/Article-Declarative-Hypermedia-Responses/
https://github.com/w3c/sparql-12/issues/152
https://linkeddatafragments.org/

Scenarios

I've got a few more sporadic notes on this topic over in this gist

rubensworks commented 2 years ago

Yes, we definitely need this!

Sounds very related to this paper from a few years ago: https://linkeddatafragments.github.io/Article-Declarative-Hypermedia-Responses/

And also https://linkeddatafragments.org/ of course.

RubenVerborgh commented 2 years ago

Can we apply this to a concrete case? Service description in general is an unfinishable problem 🙂

jeswr commented 2 years ago

My view is that the main product would be an ontology/vocabulary that defines terms with which any service can then be described. These terms should then be able to describe any use case we can imagine - in particular, it should be able to accurately describe the following concrete services:

A SPARQL endpoint with a subset of SPARQL 1.2 and GeoSPARQL features.
A QPF endpoint
An InfluxDB endpoint

There should also be an implementation of the ontology that enables agents to negotiate using these descriptions to perform a query (I don't think it is necessary to define the exact query now?) as fast as possible given the features that each agent supports.
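As a rough illustration of the matching half of such a negotiation, the sketch below describes the three services as plain dictionaries and filters them down to those that both agents can speak and that cover the features a query needs. The interface names, feature labels and the `compatible` helper are hypothetical placeholders, not terms from any published vocabulary.

```python
def compatible(descriptions, client_interfaces, required_features):
    """Return descriptions the client can speak that cover every required feature."""
    required = set(required_features)
    return [
        description
        for description in descriptions
        if description["interface"] in client_interfaces
        and required <= description["features"]
    ]


# Hypothetical descriptions of the three concrete services listed above.
descriptions = [
    {"interface": "sparql", "features": {"select", "bgp", "geosparql"}},
    {"interface": "qpf", "features": {"quad-pattern"}},
    {"interface": "influxdb", "features": {"time-range"}},
]

# A client that speaks SPARQL and QPF, with a query needing BGPs and GeoSPARQL.
candidates = compatible(descriptions, {"sparql", "qpf"}, {"bgp", "geosparql"})
```

Choosing the fastest of the remaining candidates would additionally need some cost estimate per service, which this issue leaves open.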

I'm also thinking about how this could be extended to describe how to perform some actions with an endpoint, rather than just what actions the endpoint performs. But I believe this is best left to a separate issue, which I will open later.

ThisIsMissEm commented 2 years ago

Now that the Solid Protocol specification has a Storage Description interface, is this still something orthogonal / separate, or would these all just be predicates in that document? https://solidproject.org/ED/protocol#server-storage-description

jeswr commented 2 years ago

> Now that the Solid Protocol specification has a Storage Description interface, is this still something orthogonal / separate, or would these all just be predicates in that document? https://solidproject.org/ED/protocol#server-storage-description

This is largely orthogonal to the terms defined in that document. The goal of the Universal Service Description would be to provide a fine-grained description of the capabilities that different Solid/Semantic Web 'APIs' offer, both on Solid servers and in other services such as aggregators.

On the other hand, the server-storage-description document is largely concerned with describing the location of storage on a Solid server.

ThisIsMissEm commented 2 years ago

@jeswr from my understanding, the server-storage-description will be augmented by other parts of the Solid specs. For instance, notifications would use that document to say where the notification server for that storage server is; the same could be true for Authorization (ACP/ACRs), other query endpoints, etc.

VladimirAlexiev commented 10 months ago

@jeswr

> A SPARQL endpoint with a subset of SPARQL 1.2 and Geosparql features.

Yes!

https://opengeospatial.github.io/ogc-geosparql/ "GeoSPARQL 1.1 Service Description" includes 3 files that use SD but the way these sd:features are described can be elaborated significantly.
https://github.com/opengeospatial/ogc-geosparql/issues/481 discusses how to elaborate the SD of GeoSPARQL
In particular GeoSPARQL features and conformance classes make a hierarchy, but SPARQL SD doesn't have hierarchies
https://github.com/w3c/sparql-dev/issues/130 asks for easier ways to extend SPARQL servers with datatypes. Describing datatypes in SD is very relevant for GeoSPARQL, which captures geometries in special datatyped literals: geo:wktLiteral, geo:gmlLiteral etc.

> the server might advertise that it has a SPARQL 1.1 endpoint, and A direct InfluxDB endpoint

We've had several projects where we use GraphDB and Influx to capture timeseries: the location and metadata in RDF, and the actual data in Influx. I call such strategies "hybrid storage", and similar strategies are used in GraphDB connectors, e.g. to Elastic and to MongoDB. We're now also thinking about a connector for binary (engineering/scientific) data. The key question is how to distribute data between RDF and the "other" storage, and how to describe the relations between storages. GraphQL Federation and Mesh can describe such integrations in a formal way (e.g. if entity E1 is stored in S1, it is possible that some of its props E1.p lead to an entity E2 stored in S2). Is this in scope of Universal SD?

jeswr commented 10 months ago

> Is this in scope of Universal SD?

I don't think anyone is currently working on the Universal SD (nor do I know of any plans for this to be worked on), so what is in scope is undetermined.

phochste commented 10 months ago

The Signposting people are working on something which fits universal service description, but it is targeted at programmers finding their way. In general, their view is that a description should be stratified: at the least, provide humans with a way to discover the affordances of a service and, where possible, add more and more machine-processable information.

SolidLabResearch / Challenges