This ticket concerns a new feature for the recommendation system: the service type of an endpoint is determined automatically, so that users no longer have to specify it manually in the configuration file.
Background
Currently, a configuration file is used to load the specified endpoints. It lets users refer to an endpoint in the CLI by a key name instead of the entire URL, and its type key states the endpoint's type so that the type does not have to be specified or inferred elsewhere. However, we would like to infer the endpoint's type automatically. For now, the type of an endpoint can be either sparql or elasticsearch.
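To make the intended change concrete, here is a minimal sketch of a configuration shape in which the type becomes optional. The field names (url, type), the EndpointConfig type and the resolveEndpoint helper are assumptions for illustration; the actual configuration file may be structured differently.

```typescript
// Hypothetical config shape; the real file's field names may differ.
type ServiceType = "sparql" | "elasticsearch";

interface EndpointEntry {
  url: string;
  type?: ServiceType; // optional once the type can be inferred automatically
}

type EndpointConfig = Record<string, EndpointEntry>;

// Resolve a CLI key name to its config entry, failing loudly on unknown keys.
function resolveEndpoint(config: EndpointConfig, key: string): EndpointEntry {
  const entry = config[key];
  if (entry === undefined) {
    throw new Error(`Unknown endpoint key: ${key}`);
  }
  return entry;
}

// Example entries (made-up URLs): one with an explicit type, one without.
const exampleConfig: EndpointConfig = {
  explicit: { url: "https://example.org/sparql", type: "sparql" },
  inferred: { url: "https://example.org/search" }, // type left out on purpose
};
```

An entry without a type would then be passed to the recognition function this ticket asks for.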
In the past, we already explored several methods that only work for TriplyDB:
Regular expression: In the archived branch old-multi-endpoint, we implemented a simple endpoint service recognition that only works for endpoint URLs in Triply's format. The function returnEndpointService() was implemented in endpointExtractor.ts and used in recommend.ts. It only returns the endpoint service that a regular expression extracts from the URL, so it does not work for URLs with a different structure or for URLs that do not include the service type.
TriplyDB.js: TriplyDB.js can list all available service types for an endpoint (see the archived endpointExtractor.ts).
GET request: A GET request to the Triply endpoint URL (https://api.INSTANCE/datasets/ACCOUNT/DATASET/services/SERVICE/) should return a JSON object with a capabilities property. This property contains an array of available services.
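The regular expression and GET request approaches above could look roughly like the sketch below. This is not the archived returnEndpointService(); the function names serviceTypeFromUrl and parseCapabilities are hypothetical, and the sketch assumes that the service type appears as its own path segment in the URL and that the entries of capabilities are strings.

```typescript
type ServiceType = "sparql" | "elasticsearch";

// Regex approach: look for a path segment that names the service type.
// Returns undefined when the URL does not mention a known type.
function serviceTypeFromUrl(url: string): ServiceType | undefined {
  const match = /\/(sparql|elasticsearch)(?:[/?#]|$)/i.exec(url);
  const segment = match?.[1];
  return segment ? (segment.toLowerCase() as ServiceType) : undefined;
}

// GET approach: given the parsed JSON body of the service description,
// read the `capabilities` array (assumed here to contain strings).
function parseCapabilities(body: unknown): string[] {
  if (typeof body !== "object" || body === null) return [];
  const capabilities = (body as { capabilities?: unknown }).capabilities;
  if (!Array.isArray(capabilities)) return [];
  return capabilities.filter((c): c is string => typeof c === "string");
}
```

Both helpers return an empty result instead of throwing, which makes the limitation visible: for a non-Triply URL there is simply nothing to extract.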
Generic methods:
ASK query: Sending the SPARQL query ask {} in a POST request with Content-Type: application/sparql-query should work for all SPARQL endpoints but not for others such as Elasticsearch. This could help discern the type of endpoint service.
We have not yet found a generic way to identify Elasticsearch endpoints; this seems to be the main challenge for this issue.
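A minimal sketch of the ASK probe, under some assumptions: the name looksLikeSparql and the PostFn type are hypothetical, the HTTP call is injected so the probe can be tried against a stub, and a response counts as SPARQL only if it is JSON with a boolean field, as in the SPARQL 1.1 JSON results format for ASK queries.

```typescript
// Minimal request/response shapes so the probe can be exercised with a stub.
interface ProbeResponse {
  ok: boolean;
  text(): Promise<string>;
}
type PostFn = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string }
) => Promise<ProbeResponse>;

// Returns true when the endpoint answers `ask {}` like a SPARQL endpoint.
async function looksLikeSparql(url: string, post: PostFn): Promise<boolean> {
  try {
    const response = await post(url, {
      method: "POST",
      headers: {
        "Content-Type": "application/sparql-query",
        Accept: "application/sparql-results+json",
      },
      body: "ask {}",
    });
    if (!response.ok) return false;
    const parsed: unknown = JSON.parse(await response.text());
    // An ASK result in the JSON results format carries a top-level boolean.
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as { boolean?: unknown }).boolean === "boolean"
    );
  } catch {
    return false; // network failure or non-JSON body: not recognisably SPARQL
  }
}
```

In a runtime with a global fetch, that function could presumably be passed as post. A false result only says "not recognisably SPARQL"; it does not show that the endpoint is Elasticsearch.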
Goal
The aim is to find a generic method for recognising the endpoint service type that works for any endpoint, including non-Triply endpoints.
A generic method to discern endpoint types for any endpoint.
The vocabulary recommender behaves the same when service types are removed from the endpoint configuration file.
Ability to handle cases for which no service type can be identified. If no type can be identified, throw an error with the message: "No service type can be recognised for the endpoint ".
A pull request with a function that is able to recognise the service type of any endpoint.
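The criteria above could be combined roughly as in the following sketch. The names detectServiceType and Probe are hypothetical, and appending the endpoint URL to the required error message is an assumption about what should follow the trailing space. Because no generic Elasticsearch check has been found yet, the probes are passed in instead of being hard-coded.

```typescript
type ServiceType = "sparql" | "elasticsearch";

// A probe resolves to true when the endpoint matches its service type.
type Probe = (url: string) => Promise<boolean>;

// Try each probe in order and return the first matching service type.
async function detectServiceType(
  url: string,
  probes: ReadonlyArray<[ServiceType, Probe]>
): Promise<ServiceType> {
  for (const [type, probe] of probes) {
    let matched = false;
    try {
      matched = await probe(url);
    } catch {
      matched = false; // a failing probe counts as "no match"; keep trying
    }
    if (matched) return type;
  }
  // Assumption: the endpoint URL is appended to the required message.
  throw new Error(`No service type can be recognised for the endpoint ${url}`);
}
```

If the configuration still contains an explicit type for an endpoint, the caller could skip detection entirely, which would keep the recommender's behaviour unchanged for existing configuration files.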