Access control - Githubissues

akuckartz commented 4 years ago

Currently SPARQL seems to presuppose that all data is and should be accessible by everybody. But this is not true. Some kind of support for access control (such as RBAC) should be made available for SPARQL.

Maybe relevant:

Luca Costabello, Serena Villata, Nicolas Delaforge, Fabien Gandon. Ubiquitous Access Control for SPARQL Endpoints: Lessons Learned and Future Challenges. WWW 2012 - 21st World Wide Web Conference, Apr 2012, Lyon, France. ⟨hal-00691271⟩ https://hal.inria.fr/hal-00691271/file/p487.pdf also: http://www-sop.inria.fr/members/Serena.Villata/Resources/www12.pdf

Access Control, Triggers and Versioning over SPARQL Endpoint. September 2014. Communications in Computer and Information Science 468:67-75. DOI: 10.1007/978-3-319-11716-4_6 https://www.researchgate.net/publication/289642863_Access_Control_Triggers_and_Versioning_over_SPARQL_Endpoint

namedgraph commented 4 years ago

Why? Specifications should be orthogonal.

There's the W3C ACL ontology by the way.

akuckartz commented 4 years ago

@namedgraph

> Specifications should be orthogonal

Absolutely. Do you think that the W3C ACL ontology solves this issue?

namedgraph commented 4 years ago

At the Linked Data layer -- yes. Not at the triple layer.

An example of a query that checks ACL: https://github.com/AtomGraph/LinkedDataHub/blob/master/src/main/webapp/WEB-INF/web.xml#L17
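
For readers who do not want to follow the link, here is a minimal sketch of what a resource-level ACL check can look like. It is not the LinkedDataHub query; it only assumes the W3C ACL vocabulary (`acl:Authorization`, `acl:agent`, `acl:accessTo`, `acl:mode`). The function name `acl_ask_query` and the example IRIs are hypothetical.

```python
import re

ACL_NS = "http://www.w3.org/ns/auth/acl#"

# Characters that may not appear inside a SPARQL IRIREF (<...>).
# Rejecting them keeps caller-supplied IRIs from breaking out of the brackets.
_IRI_OK = re.compile(r'[^\s<>"{}|\\^`]+')

_MODES = {"Read", "Write", "Append", "Control"}


def acl_ask_query(agent_iri, resource_iri, mode="Read"):
    """Build an ASK query: is there an authorization granting `mode`
    on `resource_iri` to `agent_iri`? (Hypothetical helper.)"""
    for iri in (agent_iri, resource_iri):
        if not _IRI_OK.fullmatch(iri):
            raise ValueError(f"not a safe IRI: {iri!r}")
    if mode not in _MODES:
        raise ValueError(f"unknown access mode: {mode!r}")
    return "\n".join([
        f"PREFIX acl: <{ACL_NS}>",
        "ASK {",
        "  ?auth a acl:Authorization ;",
        f"        acl:agent <{agent_iri}> ;",
        f"        acl:accessTo <{resource_iri}> ;",
        f"        acl:mode acl:{mode} .",
        "}",
    ])


print(acl_ask_query("https://example.org/alice#me", "https://example.org/doc"))
```

The check works on whole resources (documents), which is why it fits the Linked Data layer and says nothing about individual triples.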

ericprud commented 4 years ago

I tend to agree that ACLs should be outside the semantics of SPARQL, but here's a devil's advocate argument:

SERVICE has some implied protocol interaction so you could argue that you'd want to add access tokens to it. We could pepper graphs and even triples with access tokens to choreograph the data access. One downside with this is that it pastes secrets into what should otherwise be a shareable query.

On the enforcement side, you could use SPARQL as a policy/ACLs language to specify who can see what. These can be "executed" to produce modified queries like @namedgraph's example above. I believe the cited papers are implementations of ACLs using SPARQL (and imply no changes to SPARQL itself).

I played around with using SPARQL rule flattening to inject locally-maintained ACL constraints to create self-permissioned queries (example product) but the tabular structure of SPARQL made it hard to do in more complex cases. (I later remedied that coercion of trees into tables with ShEx-on-ShEx, but that's an alternative to using SPARQL for policies.)

afs commented 4 years ago

See #117.

lisp commented 4 years ago

> I tend to agree that ACLs should be outside the semantics of SPARQL but here's a devil's advocate argument

as do we. dydra is a multi-tenant service, in which accounts, repositories and views are addressable resources. it includes a resource-based access control mechanism to govern access to them. we stop at the dataset level, even though the resource identifier space has capacity to comprehend both graphs and quads within a dataset. the constraints are recorded in a user-level repository and the logic is expressed as user-definable sparql queries, much in the manner of the costabello paper, with a vocabulary based on the w3c acl model, similar to @namedgraph's example.

while, at the repository level, the mechanism does not belong in the language, it could have a place in the sparql protocol recommendation. were it to apply to how the "rdf dataset" is determined - and thereby to graphs, it would have to find a place in the language document.

VladimirAlexiev commented 4 years ago

In case we want to protect individual triples, inference complicates things (or, put the other way round, inference is complicated by triple-level ACL requirements).
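
A toy illustration of the interaction, as a sketch only: triples are plain tuples, the only rule is RDFS subclass propagation, and the data and the `protected` set are made up. The point is that "infer, then hide protected triples" and "hide protected triples, then infer" give different answers.

```python
def rdfs_subclass_closure(triples):
    """Apply one rule to a fixpoint:
    (x rdf:type C), (C rdfs:subClassOf D)  =>  (x rdf:type D)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(inferred):
            if p != "rdf:type":
                continue
            for (c, p2, d) in list(inferred):
                derived = (s, "rdf:type", d)
                if p2 == "rdfs:subClassOf" and c == o and derived not in inferred:
                    inferred.add(derived)
                    changed = True
    return inferred


# Hypothetical data: the first triple is the one we want to protect.
data = {
    ("ex:alice", "rdf:type", "ex:OncologyPatient"),
    ("ex:OncologyPatient", "rdfs:subClassOf", "ex:Patient"),
}
protected = {("ex:alice", "rdf:type", "ex:OncologyPatient")}

# Order A: materialize inferences, then remove the protected asserted triple.
infer_then_filter = rdfs_subclass_closure(data) - protected

# Order B: remove the protected triple first, then infer over what is left.
filter_then_infer = rdfs_subclass_closure(data - protected)

leak = ("ex:alice", "rdf:type", "ex:Patient")
print(leak in infer_then_filter)   # the derived triple survives the filter
print(leak in filter_then_infer)   # the derived triple was never produced
```

In order A a triple derived from protected data stays visible; in order B the reasoner has to run per user (or per permission set), which rules out a single shared materialization.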

lisp commented 3 years ago

> In case we want to protect individual triples, inference complicates things

there have been papers which indicate plausible approaches to this where the inference mechanism involves query rewriting, but we would need a really good reason before we try to find out how well they work in practice.

VladimirAlexiev commented 3 years ago

I also believe that the best way to implement Access Control is by query rewriting, i.e. injecting ACL clauses into the user query. Otherwise you get big difficulties.
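
A minimal sketch of what "injecting ACL clauses" could mean at graph granularity. Assumptions, all mine: authorizations live in a dedicated graph (`urn:example:acl` is a made-up name), they use the W3C ACL vocabulary, and the user query has already been split into a projection and a plain triple pattern. A real rewriter would work on the parsed query algebra, not on strings.

```python
import re

ACL_NS = "http://www.w3.org/ns/auth/acl#"

# Characters that may not appear inside a SPARQL IRIREF (<...>).
_IRI_OK = re.compile(r'[^\s<>"{}|\\^`]+')


def inject_graph_acl(projection, pattern, agent_iri, acl_graph="urn:example:acl"):
    """Wrap `pattern` so it only matches in graphs the agent may read.

    `projection` and `pattern` are assumed to come from an already parsed
    user query containing only basic triple patterns (hypothetical helper).
    """
    for iri in (agent_iri, acl_graph):
        if not _IRI_OK.fullmatch(iri):
            raise ValueError(f"not a safe IRI: {iri!r}")
    return "\n".join([
        f"PREFIX acl: <{ACL_NS}>",
        f"SELECT {projection} WHERE {{",
        # The user's pattern is evaluated per named graph ...
        f"  GRAPH ?__aclGraph {{ {pattern} }}",
        # ... and joined with an authorization for that same graph.
        f"  GRAPH <{acl_graph}> {{",
        f"    ?__aclAuth acl:agent <{agent_iri}> ;",
        "               acl:accessTo ?__aclGraph ;",
        "               acl:mode acl:Read .",
        "  }",
        "}",
    ])


print(inject_graph_acl(
    "?s ?name",
    "?s <http://xmlns.com/foaf/0.1/name> ?name .",
    "https://example.org/alice#me",
))
```

The sketch ignores everything that makes this hard in practice: user patterns that contain their own `GRAPH` clauses, subqueries, property paths, updates, and the user's prologue.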

TallTed commented 3 years ago

It's worth noting that access control, per se, is not built into the query arena of SQL, but there are robust solutions down to the record/row level, and even to the cell level. It's not clear to me that this needs to be built into SPARQL, either.

SPARQL can easily borrow authentication from the HTTP layer, and the back-end can apply it as it likes. Virtuoso does this, with most deployments applying it at graph level, but some deployments have gone to triple level, or with extrapolation to "all triples that include entity ex:5432". (The latter two require the features of the latest Enterprise Edition v8.3.x; they are not, and are not currently planned to be, made available in the Open Source Edition.)
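
To make the graph-level case concrete, here is a sketch of a front end that takes the user name from the HTTP layer and pins the query's dataset to the graphs that user may read. This is not how Virtuoso implements it; the permission table, the graph IRIs and the function names are hypothetical. The only standard piece is the `default-graph-uri` parameter of the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

# Hypothetical permission table: graph IRI -> users allowed to read it.
# "*" stands for everyone, including anonymous requests.
GRAPH_READERS = {
    "https://example.org/graphs/public": {"*"},
    "https://example.org/graphs/hr": {"alice"},
}


def readable_graphs(user):
    """Graphs readable by `user` (None for an unauthenticated request)."""
    return sorted(
        graph
        for graph, readers in GRAPH_READERS.items()
        if "*" in readers or user in readers
    )


def protocol_params(query, user):
    """Encode a SPARQL Protocol request whose dataset is restricted to
    the graphs `user` may read."""
    graphs = readable_graphs(user)
    if not graphs:
        raise PermissionError(f"no readable graphs for {user!r}")
    pairs = [("query", query)]
    pairs += [("default-graph-uri", graph) for graph in graphs]
    return urlencode(pairs)


print(protocol_params("SELECT * WHERE { ?s ?p ?o }", "bob"))
```

As I read the SPARQL 1.1 Protocol, a dataset given in the protocol request takes precedence over `FROM` clauses in the query string, so the query text itself is left untouched.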

Virtuoso also allows for certificate-based authentication, which does not impact SPARQL at all, per se.

SPARQL-FED, on the other hand (so far as I recall), does not have any way to handle authentication other than plain-text injection of an authn component into the URI of a SERVICE clause, which is far from secure in itself. Consideration of how to relay authn from the initial SPARQL endpoint to lower-level SERVICE endpoints would be worthwhile. Some of the work in the Solid sphere is likely reusable here.
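
One way to keep secrets out of the `SERVICE` URI, sketched under my own assumptions: the federating engine holds a credential store keyed by endpoint URL and attaches an `Authorization` header when it dispatches the subquery. The store, the endpoint and the function name are hypothetical, and HTTP Basic is used only because it is the simplest scheme to show. The request is built but not sent.

```python
import base64
from urllib.parse import urlsplit
from urllib.request import Request

# Hypothetical store, configured on the federating engine, never in the query.
SERVICE_CREDENTIALS = {
    "https://data.example.org/sparql": ("svc-user", "s3cret"),
}


def service_request(endpoint, subquery):
    """Build the HTTP request for a SERVICE subquery, adding credentials
    from the engine-side store instead of from the endpoint URL."""
    parts = urlsplit(endpoint)
    if parts.username or parts.password:
        raise ValueError("credentials must not be embedded in the SERVICE URL")
    headers = {
        "Content-Type": "application/sparql-query",
        "Accept": "application/sparql-results+json",
    }
    creds = SERVICE_CREDENTIALS.get(endpoint)
    if creds is not None:
        if parts.scheme != "https":
            raise ValueError("refusing to send credentials over plain HTTP")
        user, password = creds
        token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
        headers["Authorization"] = "Basic " + token.decode("ascii")
    return Request(endpoint, data=subquery.encode("utf-8"),
                   headers=headers, method="POST")


req = service_request("https://data.example.org/sparql",
                      "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
print(req.full_url)
```

This keeps the query shareable, but it authenticates the engine rather than the original user; relaying the user's own identity to the remote endpoint is the open question raised above.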

w3c / sparql-dev

Access control #121