SmartDataAnalytics / RdfProcessingToolkit

Command line interface based RDF processing toolkit to run sequences of SPARQL statements ad-hoc on RDF datasets, streams of bindings and streams of named graphs with support for processing JSON, CSV and XML using function extensions
https://smartdataanalytics.github.io/RdfProcessingToolkit/

SPARQL IRI function does not work #28

Closed TBoonX closed 1 year ago

TBoonX commented 2 years ago

When running rpt built from the jena-4.6.0 branch, the IRI function does not return anything.

IRI("http://example/") does not create an IRI.

Example:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

construct {
  <http://example.com/12>
    rdfs:seeAlso ?a .
}
where {
  bind(IRI("http://example/111") as ?a)
}

There is no error.

Aklakan commented 2 years ago

The problem was introduced recently with https://github.com/apache/jena/issues/1272 and I still need to forward it to Jena in order to find a solution that would allow for relative base IRIs.

RPT used to use an empty base URL (in conjunction with a reflection hack to inject a custom IRIxResolver) to retain relative IRIs. However, the fix for the issue above now rejects ANY relative base IRI in the iri function. Arguably, the IRI function should just use what's given rather than validating it.

Workarounds

SimonBin commented 1 year ago

This is fixed now, right?

Aklakan commented 1 year ago

Yes, it's fixed. RPT no longer uses the empty base URL hack; by default it now uses an absolute file URL pointing to the current working directory as the base.

/tmp$ rpt integrate 'IRI("http://example/") {}' --out-format csv
# file:///tmp/example/

The rationale is to make simple relative references to files work out of the box, such as <myfolder/data.csv> csv:parse ?row.
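To illustrate what resolving against a working-directory base means, here is a minimal sketch using Python's standard `urllib.parse.urljoin`, which follows RFC 3986 reference resolution. This is not RPT's or Jena's code; the helper names `resolve_iri` and `cwd_base` are hypothetical and exist only for this illustration, and the base `file:///tmp/` is assumed to match a session started in `/tmp`.

```python
from pathlib import Path
from urllib.parse import urljoin


def resolve_iri(reference: str, base: str) -> str:
    """Resolve an IRI reference against a base IRI (RFC 3986 style)."""
    return urljoin(base, reference)


def cwd_base() -> str:
    """Build an absolute file URL for the current working directory.

    A trailing slash is added so that relative references resolve
    inside the directory rather than next to it.
    """
    uri = Path.cwd().as_uri()
    return uri if uri.endswith("/") else uri + "/"


if __name__ == "__main__":
    # Assumed base: what cwd_base() would yield when run from /tmp.
    base = "file:///tmp/"
    print(resolve_iri("example/", base))            # file:///tmp/example/
    print(resolve_iri("myfolder/data.csv", base))   # file:///tmp/myfolder/data.csv
    # Under RFC 3986, an absolute reference is returned unchanged.
    print(resolve_iri("http://example/111", base))  # http://example/111
```

With an absolute base like this, a relative reference such as `myfolder/data.csv` always ends up as a well-formed absolute IRI, which is why no relative base IRI is needed anymore.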

The recommendation is to use iri:asGiven explicitly when needed. Also, with ExprTransformIriToIriAsGiven we have a transformer that can replace all IRI expressions with iri:asGiven when desired. This way we can transparently change the semantics of IRI to (a) allow for relative IRIs without having to alter Jena ARQ's standard SPARQL machinery and (b) avoid the expensive IRI validation.