apache / jena

Apache Jena
https://jena.apache.org/
Apache License 2.0

Add LIKE, which uses glob matching instead of regex. #1666

Open afs opened 1 year ago

afs commented 1 year ago

Version

4.6.1

Feature

LIKE provides glob matching of strings.

It is simpler; regular expressions, with their escaping and metacharacters, can be off-putting.

LIKE("ABCDE...XYZ", "AB*Z") is true.

There are two cases of glob string matching.

Most SQL systems have LIKE and use _ and %, and these provide character ranges [abcde] or [a-z] and negated ranges [^...]. There is also filename matching, and programming libraries, which use * and ? and sometimes support character ranges.

Glob matching is anchored.
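
To illustrate what "anchored" means here, a minimal Java sketch using `java.util.regex` follows. The class name `AnchoringDemo` and the hand-translated pattern `AB.*Z` are made up for illustration; this is not Jena code.

```java
import java.util.regex.Pattern;

public class AnchoringDemo {
    // Hand-written regex equivalent of the glob "AB*Z" (illustrative only).
    static final Pattern P = Pattern.compile("AB.*Z");

    // Anchored: the whole string must match, which is how a glob behaves.
    static boolean anchored(String s) {
        return P.matcher(s).matches();
    }

    // Unanchored: a match anywhere in the string is enough,
    // which is how REGEX behaves unless ^ and $ are written explicitly.
    static boolean unanchored(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(anchored("ABCDEZ"));       // true
        System.out.println(anchored("xxABCDEZyy"));   // false
        System.out.println(unanchored("xxABCDEZyy")); // true
    }
}
```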

A flags argument, cf. REGEX, would be allowed; there is only one valid flag, "i" (case insensitive).
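
One possible way to handle the flags argument, as a rough sketch only: the class `LikeFlags` and its `parse` method are hypothetical names, and rejecting unknown flags with an exception is an assumption about the desired behaviour rather than anything decided here.

```java
import java.util.regex.Pattern;

public class LikeFlags {
    // Translate a LIKE flags string into java.util.regex flag bits.
    // Assumption: only "" (no flags) and "i" are valid; anything else is an error.
    static int parse(String flags) {
        if (flags == null || flags.isEmpty()) {
            return 0;
        }
        if (flags.equals("i")) {
            // UNICODE_CASE makes case-insensitivity apply beyond ASCII.
            return Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        }
        throw new IllegalArgumentException("Unsupported LIKE flag: " + flags);
    }
}
```

The returned value can be passed as the second argument of `Pattern.compile`.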

The old Apache ORO codebase has a "glob to regex" converter.
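
As a rough idea of what a "glob to regex" conversion could look like for just * and ?, here is a minimal Java sketch. It is not the ORO code; the class `GlobToRegex` and the methods `toRegex` and `like` are hypothetical, and character ranges are deliberately left out.

```java
import java.util.regex.Pattern;

public class GlobToRegex {
    // Convert a glob containing only '*' and '?' wildcards into a regex string.
    // Every other character is quoted so regex metacharacters stay literal.
    static String toRegex(String glob) {
        StringBuilder out = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*' || c == '?') {
                flush(literal, out);
                out.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flush(literal, out);
        return out.toString();
    }

    // Emit the pending literal run as a quoted regex fragment.
    private static void flush(StringBuilder literal, StringBuilder out) {
        if (literal.length() > 0) {
            out.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    // Anchored match: matches() requires the whole string to match.
    // DOTALL lets the wildcards match line terminators as well.
    static boolean like(String s, String glob) {
        return Pattern.compile(toRegex(glob), Pattern.DOTALL).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("ABCDE...XYZ", "AB*Z")); // true
        System.out.println(like("abc", "a.c"));          // false, '.' is literal in a glob
    }
}
```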

There are Java code libraries for just * and ? (e.g. Apache Commons-IO FilenameUtils.wildcardMatch).
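
For comparison, matching just * and ? can also be done directly, without going through a regex. The following is a minimal standalone sketch of that idea, not the Commons-IO implementation; `SimpleGlob.match` is a hypothetical name and the matching is case sensitive.

```java
public class SimpleGlob {
    // Anchored glob match supporting '*' (any run of characters, possibly empty)
    // and '?' (exactly one character). Backtracks to the most recent '*' on mismatch.
    static boolean match(String s, String glob) {
        int si = 0, gi = 0;
        int starGi = -1; // position of the last '*' seen in the glob
        int starSi = -1; // position in s where that '*' started matching
        while (si < s.length()) {
            if (gi < glob.length() && glob.charAt(gi) == '*') {
                // Record the star and first try matching it against nothing.
                starGi = gi++;
                starSi = si;
            } else if (gi < glob.length()
                    && (glob.charAt(gi) == '?' || glob.charAt(gi) == s.charAt(si))) {
                si++;
                gi++;
            } else if (starGi >= 0) {
                // Mismatch: let the last '*' absorb one more character and retry.
                gi = starGi + 1;
                si = ++starSi;
            } else {
                return false;
            }
        }
        // Any trailing '*' in the glob can match the empty remainder.
        while (gi < glob.length() && glob.charAt(gi) == '*') {
            gi++;
        }
        return gi == glob.length();
    }
}
```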

Are you interested in contributing a solution yourself?

Yes

namedgraph commented 1 year ago

Support for the q flag in regex would be useful https://www.w3.org/TR/xpath-functions-31/#flags

afs commented 1 year ago

Reason being?

Do you know of any glob matchers that support "q"?

namedgraph commented 1 year ago

I guess it should be a separate issue, but q addresses the use case where the string is passed as-is and the client does not have to do any escaping. Update: created #1667.