sweetrdf / rdfInterface

MIT License

Relax QuadTemplate API so it accepts any Term as predicate and graph #8

Closed zozlak closed 1 year ago

zozlak commented 3 years ago

Various TermTemplates can be implemented in the same way we have an explicit QuadTemplate in this interface. Take, for example, a LiteralTemplate that matches a literal with a given language without specifying its value. While I think it's not worth trying to include such classes in rdfInterface (in general there might be countless of them, and we already have a common umbrella for them, the Term class), it would be useful to allow QuadTemplate to use them for predicate and graph as well.

@k00ni what do you think?

k00ni commented 3 years ago

I have to think about that a little more. For now I am concerned about making the interfaces too "liberal", in the sense that we would allow RDF structures which no longer conform to RDF. Serializing data built with this relaxed approach might produce invalid RDF (e.g. a predicate that is a literal). There are new developments like RDF*, sure, but we should adapt the library after such entities are released, not upfront. In my opinion RDF's base structures, like triples and datasets, should be kept as they are in the specification. That keeps things compatible with each other. I see the most value in applications built on top of that.

zozlak commented 3 years ago

Please note it's about QuadTemplate and not Quad. According to rdfInterface you can't serialize a QuadTemplate. QuadTemplate is a data structure used to match the quads you are interested in.

Let's say you want to find all triples having a given subject, predicate and literal value in a given language (pretty common use case I would assume). In the current API you must do it as follows:

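foreach ($dataset->copy(new QuadTemplate(new NamedNode('mySubject'), new NamedNode('myPredicate'), null)) as $quad) {
    // the language check has to be made on the quad's object, not on the quad itself
    $object = $quad->getObject();
    if ($object instanceof Literal && $object->getLang() == 'myLang') {
        // ...do something...
        break;
    }
}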

But by allowing QuadTemplate to take any Term as an object you can do:

class LiteralLangTmpl implements Term {
    private string $lang;
    public function __construct(string $lang) {
        $this->lang = $lang;
    }
    public function equals(Term $term): bool {
        return $term->getType() == TYPE_LITERAL && $term->getLang() == $this->lang;
    }
}

$dataset->copy(new QuadTemplate(new NamedNode('mySubject'), new NamedNode('myPredicate'), new LiteralLangTmpl('myLang')));

Moreover, the LiteralLangTmpl class can easily be moved to a separate library and will work with any rdfInterface implementation, which I find very elegant and flexible. I'm planning to write a library with a set of common "Term templates", e.g. one for literals, one for "any named node among given ones", and maybe even something for dealing with SPARQL 1.1 path-like expressions.
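To make the idea runnable without any rdfInterface implementation installed, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: Term, NamedNode and Literal are trimmed-down stand-ins and not the real rdfInterface classes, matchTriples() is a hypothetical helper playing the role of $dataset->copy(), and the constructor syntax assumes PHP 8.0 or newer.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the rdfInterface types, reduced to what the sketch needs.
interface Term {
    public function equals(Term $term): bool;
}

final class NamedNode implements Term {
    public function __construct(private string $iri) {}
    public function getValue(): string { return $this->iri; }
    public function equals(Term $term): bool {
        return $term instanceof NamedNode && $term->getValue() === $this->iri;
    }
}

final class Literal implements Term {
    public function __construct(private string $value, private ?string $lang = null) {}
    public function getValue(): string { return $this->value; }
    public function getLang(): ?string { return $this->lang; }
    public function equals(Term $term): bool {
        return $term instanceof Literal
            && $term->getValue() === $this->value
            && $term->getLang() === $this->lang;
    }
}

// The term template from the discussion: equal to any literal in the given language.
final class LiteralLangTmpl implements Term {
    public function __construct(private string $lang) {}
    public function equals(Term $term): bool {
        return $term instanceof Literal && $term->getLang() === $this->lang;
    }
}

// Hypothetical matcher: null is a wildcard, any other template term decides via equals().
// Each triple is a [subject, predicate, object] array.
function matchTriples(array $triples, ?Term $s, ?Term $p, ?Term $o): array {
    $template = [$s, $p, $o];
    return array_values(array_filter($triples, function (array $triple) use ($template): bool {
        foreach ($template as $i => $tmpl) {
            if ($tmpl !== null && !$tmpl->equals($triple[$i])) {
                return false;
            }
        }
        return true;
    }));
}

$triples = [
    [new NamedNode('mySubject'), new NamedNode('myPredicate'), new Literal('Hallo', 'de')],
    [new NamedNode('mySubject'), new NamedNode('myPredicate'), new Literal('Hello', 'en')],
    [new NamedNode('mySubject'), new NamedNode('myPredicate'), new NamedNode('other')],
];
$matches = matchTriples(
    $triples,
    new NamedNode('mySubject'),
    new NamedNode('myPredicate'),
    new LiteralLangTmpl('en')
);
```

Note that the matcher always calls equals() on the template and passes the data term as the argument. That ordering is what lets a template class carry the matching logic without the data classes knowing anything about it.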

And when it comes to serializers, they must check whether they can deal with what they get anyway. By the way, this affects not only serializers; it can also affect the most ordinary Term types, see e.g. https://github.com/sweetrdf/quickRdf/issues/1 .

k00ni commented 3 years ago

I haven't used things like that often yet, so my feedback is more theoretical.

In this case your argument makes sense, because people will "configure" a template by themselves and therefore have to know what they do. I have no counter argument here.

This approach with the foreach loop is interesting, because it reminds me of another approach to fetching triples besides SPARQL. You use some kind of an API to define S, P or O. It could be an interesting approach in combination with a store. Anyway, too off topic I guess.

zozlak commented 3 years ago

> This approach with the foreach loop is interesting, because it reminds me of another approach to fetching triples besides SPARQL. You use some kind of an API to define S, P or O. It could be an interesting approach in combination with a store. Anyway, too off topic I guess.

Yes, a basic SPARQL set of triple statements like

s1 p1 o1 .
s2 p2 o2 .
...
sN pN oN .

can be easily translated into the rdfInterface API as:

$d->copy(new QuadTemplate($s1, $p1, $o1))
    ->intersect($d->copy(new QuadTemplate($s2, $p2, $o2)))
    ...
    ->intersect($d->copy(new QuadTemplate($sN, $pN, $oN)));

(where $xN is null when it's a variable in SPARQL and a corresponding object if it's a literal/URI in the SPARQL)

In general, implementing SPARQL 1.0 on top of an rdfInterface implementation shouldn't be too hard (once a SPARQL query is already parsed into an expression tree :-) ). Executing SPARQL 1.0 comes down to filtering with QuadTemplate-like predicates and performing set operations (intersection, union, difference) on the filtering results, and rdfInterface has an API for all of that.
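As a rough illustration of "filter, then combine with set operations", here is a sketch on plain PHP arrays. It does not use the rdfInterface API at all: filterTriples() and the sample data are hypothetical, triples are simple [s, p, o] string arrays, and null in a pattern plays the role of a SPARQL variable.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the triples matching the pattern, keyed by their
// JSON encoding so that the set operations below can compare them by key.
function filterTriples(array $triples, ?string $s, ?string $p, ?string $o): array {
    $out = [];
    foreach ($triples as $t) {
        if (($s === null || $t[0] === $s)
            && ($p === null || $t[1] === $p)
            && ($o === null || $t[2] === $o)) {
            $out[json_encode($t)] = $t;
        }
    }
    return $out;
}

$data = [
    ['alice', 'knows', 'bob'],
    ['alice', 'knows', 'carol'],
    ['bob', 'knows', 'carol'],
    ['alice', 'likes', 'carol'],
];

$aliceAny = filterTriples($data, 'alice', null, null); // alice ?p ?o
$anyKnows = filterTriples($data, null, 'knows', null); // ?s knows ?o

$both   = array_intersect_key($aliceAny, $anyKnows); // intersection
$either = $aliceAny + $anyKnows;                     // union
$only   = array_diff_key($aliceAny, $anyKnows);      // difference
```

With a real rdfInterface implementation the dataset object would take over both the filtering and the set operations, so none of the array bookkeeping above would be needed.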

By the way:

zozlak commented 3 years ago

The 0.8.0 release dropped the QuadTemplate in favor of TermCompare and QuadCompare extends TermCompare (see the 0.8.0 release notes).

All in all the current approach seems to work well. The changes introduced in 0.8.0 were driven by the development of termTemplates, and they allowed pretty nice (IMHO) term templates to be developed.

I also didn't hit any issues connected with the current approach while coding examples for the introduction for EasyRdf users.