Closed GoogleCodeExporter closed 9 years ago
Examples of the use of void:uriRegexPattern from Keith:
http://tinyurl.com/yd82v5b
http://tinyurl.com/ybpwuhj
Original comment by richard....@gmail.com
on 29 Oct 2010 at 10:14
Original comment by Michael.Hausenblas
on 29 Oct 2010 at 10:15
I'm not convinced it is a "sweeter" spot - our void:uriRegexPattern is easy to
use in a SPARQL query for selecting datasets containing a URI. I am concerned
that adding an alternative, either in tandem or as a replacement, would
complicate this.
Original comment by K.J.W.Al...@gmail.com
on 29 Oct 2010 at 10:40
@Keith: Pretending that a regular URI is a regex will actually frequently have
the desired result.
Original comment by richard....@gmail.com
on 29 Oct 2010 at 10:53
Yeah - but we have already discussed and worked on defining uriRegexPattern
better for the next release, because of the edge cases where it wouldn't
have the desired result (e.g. http://a.c.com also matches http://abc.com ?)
So I see how just giving a URI prefix is a little simpler to write, but I
don't see the scenario in which it is simpler to use.
It would be useful if someone could write out the rationale for introducing the
new property here.
Original comment by K.J.W.Al...@gmail.com
on 29 Oct 2010 at 11:11
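The edge case mentioned above can be sketched in a few lines. This is only an illustration using Python's `re` module as a stand-in for a SPARQL regex engine; the helper names `regex_matches` and `prefix_matches` are hypothetical and not part of voiD or SPARQL.

```python
import re

# A plain URI used unescaped as a regex: each "." matches any character.
NAIVE_PATTERN = "http://a.c.com"
# The same URI with regex metacharacters escaped.
ESCAPED_PATTERN = re.escape("http://a.c.com")


def regex_matches(pattern: str, uri: str) -> bool:
    """Unanchored regex test: True if the pattern matches anywhere in the URI."""
    return re.search(pattern, uri) is not None


def prefix_matches(prefix: str, uri: str) -> bool:
    """Plain prefix test: no characters have a special meaning."""
    return uri.startswith(prefix)


if __name__ == "__main__":
    print(regex_matches(NAIVE_PATTERN, "http://abc.com"))     # unescaped dot matches "b"
    print(regex_matches(ESCAPED_PATTERN, "http://abc.com"))   # escaped dot does not
    print(prefix_matches("http://a.c.com", "http://abc.com"))  # prefix test has no such issue
```

The unescaped pattern accepts `http://abc.com`, while both the escaped pattern and the plain prefix test reject it.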
SPARQL has regexes but no substring/contains/startsWith. That's a bizarre
accident of history. If you are in any other environment, a substring match is
easier and less error prone than a regex match. In SPARQL, a substring match is
*also* easier (just use the prefix URI as a regex), but *more* error-prone
because of the issues we discussed earlier.
Serious SPARQL implementations increasingly tend to come with string functions
as well:
http://spreadsheets.google.com/pub?key=tl2FDWghDKDc3G70xKkNoNg&output=html
And unlike the REGEX function, which invariably performs poorly, a startsWith
function can actually be optimized by the triple store using an ordered index.
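To show why an ordered index helps, here is a minimal sketch that uses a sorted Python list as a stand-in for a store's index. It is an assumption about how such an optimization could work, not a description of any particular triple store; `uris_with_prefix` is a hypothetical helper.

```python
from bisect import bisect_left
from typing import List


def uris_with_prefix(sorted_uris: List[str], prefix: str) -> List[str]:
    """Return all URIs that start with `prefix` from a lexicographically sorted list.

    In sorted order, all strings sharing a prefix are contiguous, so a binary
    search finds the first candidate and a short scan collects the rest.
    """
    start = bisect_left(sorted_uris, prefix)
    end = start
    while end < len(sorted_uris) and sorted_uris[end].startswith(prefix):
        end += 1
    return sorted_uris[start:end]


if __name__ == "__main__":
    index = sorted([
        "http://example.org/a/1",
        "http://example.com/b/1",
        "http://example.com/a/2",
        "http://example.com/a/1",
    ])
    print(uris_with_prefix(index, "http://example.com/a/"))
```

A general regex offers no such shortcut: without further analysis of the pattern, every entry has to be tested.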
With prefix strings, it is possible to analyze a collection of void:Datasets
for overlap or containment. This isn't easily possible with regexes.
Original comment by richard....@gmail.com
on 29 Oct 2010 at 5:50
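The overlap and containment analysis mentioned above reduces to simple string comparisons when URI spaces are given as prefixes. The following is a sketch under that assumption; the function names and the example dataset prefixes are made up for illustration.

```python
def contains(outer: str, inner: str) -> bool:
    """True if the URI space `outer` contains the URI space `inner`.

    Every URI starting with `inner` also starts with `outer` exactly when
    `inner` itself starts with `outer`.
    """
    return inner.startswith(outer)


def overlaps(a: str, b: str) -> bool:
    """True if two prefix-defined URI spaces share at least one possible URI.

    This happens exactly when one prefix is a prefix of the other.
    """
    return a.startswith(b) or b.startswith(a)


if __name__ == "__main__":
    # Hypothetical dataset prefixes, not taken from any real voiD file.
    whole = "http://example.com/"
    people = "http://example.com/people/"
    places = "http://example.com/places/"
    print(contains(whole, people))   # people is a subset of whole
    print(overlaps(people, places))  # sibling spaces do not overlap
```

Deciding the same questions for two arbitrary regexes requires reasoning about the languages they accept, which is considerably harder.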
I emailed some voiD users who have used non-trivial regexes in their voiD data.
From Toby Inkster:
> Regex patterns seem like they would remain useful, especially for
> dealing with subsets of a dataset. e.g. saying that the subset matching
>
> http://example\.com/(.+)\.ttl
>
> is available in Turtle format.
From Leigh Dodds:
> I'm tending towards using simple prefixes (and void sub-sets)
> to define a URI space.
>
> The regex patterns have been useful in writing display code as it's
> easy to find whether a particular URI matches a space. This is
> obviously still possible with a prefix approach.
>
> I think everything I've currently done with regex's could be handled with a
> prefix (or set of prefixes).
>
> Regex's could be useful if you wanted to define, in more detail, what the
> exact structure of a specific URI space might be, e.g. is the prefix followed
> by only letters, or numbers, or whatever.
>
> An additional feature to consider would be use of URI templates to allow
> URI construction. But there you need more than prefix/regex.
In summary, they are not opposed to a void:uriSpace property, but see the
usefulness of void:uriRegexPattern, or perhaps even of more complex approaches
that use URI templates.
Original comment by richard....@gmail.com
on 8 Dec 2010 at 10:30
I've gone ahead and added void:uriSpace to Section 4.2, in r169.
Original comment by richard....@gmail.com
on 10 Dec 2010 at 2:01
Resolved to close it in today's teleconference. See Issue 91 for follow-up.
Original comment by richard....@gmail.com
on 14 Dec 2010 at 11:54
Original issue reported on code.google.com by
richard....@gmail.com
on 21 Oct 2010 at 4:51