ad-freiburg / qlever

Very fast SPARQL Engine, which can handle very large knowledge graphs like the complete Wikidata, offers context-sensitive autocompletion for SPARQL queries, and allows combination with text search. It's faster than engines like Blazegraph or Virtuoso, especially for queries involving large result sets.
Apache License 2.0
424 stars 52 forks

Forward authorization header in federated queries #1602

Open BramMeerten opened 2 weeks ago

BramMeerten commented 2 weeks ago

We have a reverse proxy in front of our qlever backend servers. It checks whether requests are authenticated and also performs authorization checks. We do this because we need fine-grained security for our datasets: a claim in the JWT token indicates which datasets the user has read access to.
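To make the claim-based check concrete, here is a minimal sketch of how a proxy might read such a claim. The claim name `datasets` and both helper functions are assumptions for illustration; the thread does not say how the claim is named or structured. Note that this only decodes the payload and does not verify the signature, which a real proxy must do before trusting any claim.

```python
import base64
import json


def read_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature.

    Assumption: the token has the usual header.payload.signature layout.
    A real proxy must verify the signature before trusting these claims.
    """
    payload = token.split(".")[1]
    # base64url payloads usually come without padding; restore it.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def can_read(claims: dict, dataset: str) -> bool:
    """Check read access against a hypothetical 'datasets' list claim."""
    return dataset in claims.get("datasets", [])
```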

However, when a user runs a federated query, the request is sent to one back-end, and that back-end queries the other back-ends. The Authorization header is not forwarded in these downstream requests, which will result in an UNAUTHORIZED response.

Is it possible to forward the Authorization header for downstream requests in federated queries? Or maybe all headers?

Note: We haven't actually tested this, because we don't have the reverse proxy with authorization checks yet; we are still analyzing what is possible. However, I did inspect the downstream requests with Wireshark and was able to verify that no headers were forwarded.

BramMeerten commented 2 weeks ago

Slightly off-topic, but some more context:

We also plan to host the qlever UI, and we want this to be authenticated as well. If the user is not authenticated, they will be redirected to a login page and then back to the UI. It would also be useful if the UI forwarded the authentication header in its requests to the servers.

But we have a workaround for this: The JWT token is wrapped in a cookie. The browser will also include this cookie in the requests to the server. The calls to the server are intercepted, and the presence of a JWT token (in authorization header or in cookie) is checked, and the token is validated.
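As a rough illustration of the "header or cookie" lookup described above, the interceptor could be sketched like this. The cookie name `jwt` and the function name are hypothetical, and the sketch only locates the token; validating it (signature, expiry) is a separate step that is not shown.

```python
from http.cookies import SimpleCookie
from typing import Mapping, Optional

# Hypothetical cookie name; the thread does not say what the cookie is called.
TOKEN_COOKIE = "jwt"


def extract_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the raw JWT from the Authorization header or, failing that, the cookie."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Prefer "Authorization: Bearer <token>" when it is present.
    scheme, _, credentials = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and credentials.strip():
        return credentials.strip()

    # Fall back to the cookie that wraps the token.
    cookie = SimpleCookie()
    cookie.load(lowered.get("cookie", ""))
    morsel = cookie.get(TOKEN_COOKIE)
    if morsel is not None and morsel.value:
        return morsel.value
    return None
```

Checking the header first means an explicit API client is never overridden by a stale browser cookie; whether that is the right precedence depends on the setup.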

Stiksels commented 2 weeks ago

I was doing some research on this and found that it is not an easy problem...

eg: https://github.com/w3c/sparql-dev/issues/117

Take for example the query below:

# send to http://server-0/sparql
SELECT * WHERE {
   [...]
   SERVICE <http://server-1/sparql> {
       [...]
   }
   SERVICE <http://server-2/sparql> {
       [...]
   }
}

scenario 1: servers 0-2 are all part of the same internal system

scenario 2: servers 0-1 are part of an internal system, server 2 is external

scenario 3: servers 0-1 are external, server 2 is part of the internal system
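One way to think about these scenarios is a per-endpoint allowlist: the header is only forwarded to SERVICE endpoints that are known to be internal. The sketch below is not how qlever behaves today; the `TRUSTED_HOSTS` set and the `downstream_headers` function are made up for illustration, using the host names from the example query.

```python
from typing import Dict, Mapping
from urllib.parse import urlsplit

# Hypothetical allowlist matching scenario 2: server-0 and server-1 are internal.
TRUSTED_HOSTS = frozenset({"server-0", "server-1"})


def downstream_headers(
    service_url: str,
    incoming: Mapping[str, str],
    trusted_hosts: frozenset = TRUSTED_HOSTS,
) -> Dict[str, str]:
    """Return the headers to send with the request for one SERVICE clause."""
    host = (urlsplit(service_url).hostname or "").lower()
    if host not in trusted_hosts:
        # External endpoint: never leak the user's token.
        return {}
    # Internal endpoint: forward only the Authorization header.
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() == "authorization"
    }
```

With this allowlist, scenario 1 forwards the token everywhere (if server-2 is added to the set) and scenario 2 withholds it from server-2. Scenario 3 is the hard one: the entry point is external, so it would need credentials for server-2 that the internal system is willing to hand to an outside party, and an allowlist alone does not solve that.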