Open afs opened 5 years ago
why?
@lisp that is a very common scenario in the real world, and right now I have to look up every time how to do it with regex. I teach SPARQL on a regular basis as well; that would definitely facilitate simple string matches for users.
If this were added, it should be a different name. When explaining SPARQL, one constantly has to talk about matching—and it usually means matching graph patterns against triples. Having a function called “match” that uses the word in a different sense does not help.
Some possible other names:
FILTER wildcard(?title, "*sparql*", "i")
FILTER like(?title, "*sparql*", "i")
This could also be combined with #34:
?doc :title ~"*sparql*"i.
It is called glob in several other languages.
if the goal is succinctness, it makes sense to go all the way to something like
?doc :title ~"*sparql*"i.
but
ShEx has a similar construct called Stem. It works for strings, IRIs (also prefixed) and lang tags.
ShEx Stem is fn:starts-with / STRSTARTS.
I agree "match" is already used for graph patterns. It is also valuable as a keyword.
LIKE is good depending on the SQL implications (SQL uses % and _ for what is commonly * and ? in shells and filename matching); in some SQL dialects LIKE also has character classes and negated character classes.
Filename matching uses glob matching, where * matches any sequence of characters except the component separator, and some systems (e.g. git) add ** to mean "filename, any depth".
Possibilities:
STRMATCH
LIKE
GLOB
WILDCARD
The other choice is what matching language to use.
SQL LIKE can be rewritten to a regular expression and there are code examples for that online.
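To make the LIKE-to-regex rewrite concrete, here is a minimal sketch in Python. The function names `like_to_regex` and `like`, the backslash escape character, and the "i" flag are assumptions for illustration only; nothing here is proposed SPARQL syntax or any vendor's implementation.

```python
import re

def like_to_regex(pattern: str, escape: str = "\\") -> str:
    """Translate a SQL LIKE pattern (% and _ only) into a regex string.

    Hypothetical helper: the escape character is an assumption.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # Escaped character: take the next character literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")   # any sequence of characters, possibly empty
        elif ch == "_":
            out.append(".")    # exactly one character
        else:
            out.append(re.escape(ch))  # everything else is a literal
        i += 1
    return "".join(out)

def like(text: str, pattern: str, flags: str = "") -> bool:
    """Match the whole string against a LIKE pattern; "i" ignores case."""
    re_flags = re.DOTALL | (re.IGNORECASE if "i" in flags else 0)
    # fullmatch provides the anchoring at both ends of the string.
    return re.fullmatch(like_to_regex(pattern), text, re_flags) is not None
```

With this sketch, `like("Learning SPARQL", "%sparql%", "i")` holds, while the same call without the "i" flag does not.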
In Java, there are a few open source direct implementations with * and ?, but not [ ] character classes. (The JDK supports glob on filenames, not directly for strings.)
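For comparison, a direct matcher that handles only * and ? (no [ ] character classes) and does not go through a regex engine can be quite small. The following is an illustrative sketch in Python, not a port of any of the Java libraries mentioned; the name `glob_match` is hypothetical, and * here matches any character, with no special treatment of separators.

```python
def glob_match(text: str, pattern: str) -> bool:
    """Match the whole of `text` against a pattern using only * and ?."""
    t = 0          # current position in text
    p = 0          # current position in pattern
    star_p = -1    # position of the most recent '*' in the pattern
    star_t = 0     # text position that '*' is currently assumed to cover up to
    while t < len(text):
        if p < len(pattern) and pattern[p] == "*":
            # Remember the star and first try letting it match nothing.
            star_p, star_t = p, t
            p += 1
        elif p < len(pattern) and (pattern[p] == "?" or pattern[p] == text[t]):
            # Single-character match: advance both positions.
            p += 1
            t += 1
        elif star_p != -1:
            # Mismatch: backtrack and let the last '*' absorb one more character.
            star_t += 1
            t = star_t
            p = star_p + 1
        else:
            return False
    # Text is exhausted; only trailing stars may remain in the pattern.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)
```

Because it only ever backtracks to the most recent *, this approach avoids building a regular expression at all.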
I think being close to SQL is not a bad idea, so I like the LIKE idea. If I did not know about that I would go for STRMATCH as it resembles other functions in SPARQL, but then again it might add to the confusion. LIKE is unique in that sense.
And while I like the GLOB idea, I have to agree that I mainly know it for file matching.
Regular expressions can be complex. Strings with wildcards are simpler.
Proposed solution
Provide string matching using wildcards as an additional alternative to regular expressions by adding a new function. The pattern is anchored: it must match the whole string, not just a substring.
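The effect of anchoring can be sketched in Python with the standard library's `fnmatch.translate`, which turns a shell-style pattern into a regex that must match the whole string. The function name `wildcard` and the "i" flag mirror the names floated earlier in this thread and are assumptions, not agreed syntax; note also that `fnmatch` accepts [ ] character classes, which the proposal may or may not include.

```python
import fnmatch
import re

def wildcard(text: str, pattern: str, flags: str = "") -> bool:
    """Anchored wildcard match; "i" in flags means case-insensitive.

    Hypothetical stand-in for the proposed function, for illustration only.
    """
    regex = fnmatch.translate(pattern)  # translated regex is anchored at the end
    re_flags = re.IGNORECASE if "i" in flags else 0
    # re.match anchors at the start, so together the whole string must match.
    return re.compile(regex, re_flags).match(text) is not None

# A bare word only matches a string equal to it; wildcards are needed
# on both sides to get "contains" behaviour.
print(wildcard("Learning SPARQL by example", "*sparql*", "i"))
print(wildcard("Learning SPARQL by example", "sparql", "i"))
```

This differs from REGEX in SPARQL, where an unanchored pattern matches anywhere in the string.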
Examples:
Previous work
Glob patterns
SQL LIKE
Lucene wildcard searches
Considerations for backward compatibility
None.