cambridge-cares / TheWorldAvatar

A knowledge-graph-based digital twin of the world.
https://theworldavatar.io/
MIT License

Pass credentials when running federated SPARQL queries #551

Open mdhillman opened 1 year ago

mdhillman commented 1 year ago

Currently, when the JPS base library connects to a triplestore, it creates an instance of the RemoteStoreClient class. This instance holds the credentials (username and password) needed to access the KG.

Within the FeatureInfoAgent, this client instance is then used to execute a federated query via the executeFederatedQuery() method. This method takes a number of endpoint URLs (with no restriction that they are in the same stack or share the same credentials) and runs the query on all of them; in this case the URLs happen to belong to the same Blazegraph instance. The issue is that this method does not use the credentials stored in the client instance, so there is currently no way to run a federated query on protected endpoints.

A short-term solution is to pass the instance's credentials to every URL it contacts. In the longer term, unless the input URLs are restricted to endpoints that share those credentials, would it not make more sense to move the ability to run federated queries up a level, and have it take a list of RemoteStoreClient instances (rather than URLs), each containing a URL and the credentials for it?
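
To make the longer-term idea concrete, here is a minimal sketch of per-endpoint credentials using only the JDK (Java 11+). `EndpointClient` and `FederatedRequests` are hypothetical names standing in for RemoteStoreClient and the relocated federated query entry point; they are not part of the JPS base library. The sketch also assumes the endpoints accept HTTP Basic authentication. It only builds one request per endpoint and does not send them or merge results, so it illustrates how credentials would travel with each endpoint rather than how federation itself works.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

/** Hypothetical stand-in for RemoteStoreClient: one endpoint URL plus its own credentials. */
final class EndpointClient {
    final String queryUrl;
    final String user;     // null for an unprotected endpoint
    final String password; // null for an unprotected endpoint

    EndpointClient(String queryUrl, String user, String password) {
        this.queryUrl = queryUrl;
        this.user = user;
        this.password = password;
    }

    /** Returns the HTTP Basic Authorization header value, or null if no credentials are set. */
    String basicAuthHeader() {
        if (user == null || password == null) {
            return null;
        }
        byte[] token = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(token);
    }
}

/** Hypothetical helper: takes client instances rather than bare URLs. */
final class FederatedRequests {
    /** Builds one SPARQL query POST per client, each carrying that client's own credentials. */
    static List<HttpRequest> build(String sparql, List<EndpointClient> clients) {
        String body = "query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        List<HttpRequest> requests = new ArrayList<>();
        for (EndpointClient client : clients) {
            HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(client.queryUrl))
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .header("Accept", "application/sparql-results+json")
                    .POST(HttpRequest.BodyPublishers.ofString(body));
            String auth = client.basicAuthHeader();
            if (auth != null) {
                // Only protected endpoints receive an Authorization header.
                builder.header("Authorization", auth);
            }
            requests.add(builder.build());
        }
        return requests;
    }
}
```

Because each client carries its own credentials, endpoints in different stacks (or with no protection at all) can be mixed in a single call without the caller having to know which credentials belong to which URL.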

Note: adding Seb to sanity check the suggested approach.

sm453 commented 1 year ago

Passing RemoteStoreClient instances rather than URLs sounds like a good solution to me (and preferable as a matter of principle).

gpeb2 commented 1 year ago

As mentioned in the duplicate issue #740, it looks like the solution would be to create a RepositoryManager and add the required endpoints to it along with any credentials. The RepositoryManager can then be used to create a DefaultRepositoryResolver which can be passed to a FedXFactory to create the federated repository.