jmespath-community / jmespath.spec

Consider supporting regular expressions #15

Closed springcomp closed 2 years ago

springcomp commented 2 years ago

There seems to be some demand for supporting regular expressions.

This presents some challenges as there are many different dialects of regexp, although there is an effort around specifying an interoperable regex format.

Also, regular expression use cases fall into three main categories: matching text against a pattern, replacing substrings, and extracting captures.

So a JEP would need to come up with intuitive syntax for those use cases that would be relevant.

The interoperable regex format proposal referred to above only supports the first scenario. So maybe that’s the only scenario that requires extending the JMESPath syntax.

The last two scenarios may be supported using dedicated functions.

springcomp commented 2 years ago

To use regular expressions in a filter-expression we would need:

boolean regex_match(string $text, regexp $regex)

The regexp argument could be a plain string, or it could be a new data type that is only valid as a function argument.

regexp          = …
function-arg =/ regexp
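
As a rough illustration of what a regexp literal could carry, here is a minimal Python sketch that splits a `/pattern/flags` literal into a compiled pattern and a "global" marker. The grammar is left open above, so everything here is an assumption: the helper name `parse_regex_literal` is hypothetical, and only the `i` and `g` flags that appear in the examples in this thread are handled.

```python
import re

def parse_regex_literal(literal):
    """Split a '/pattern/flags' literal into (compiled_pattern, is_global).

    Hypothetical helper: the literal form is inferred from the examples
    in this thread, not from any agreed grammar.
    """
    if not literal.startswith("/"):
        raise ValueError("regex literal must start with '/'")
    end = literal.rfind("/")  # flags are letters, so the last '/' closes the pattern
    if end == 0:
        raise ValueError("regex literal is missing its closing '/'")
    body, flag_text = literal[1:end], literal[end + 1:]

    flags = 0
    is_global = False
    for letter in flag_text:
        if letter == "i":
            flags |= re.IGNORECASE
        elif letter == "g":
            is_global = True  # affects replacement behaviour, not compilation
        else:
            raise ValueError(f"unknown flag: {letter!r}")
    return re.compile(body, flags), is_global

pattern, is_global = parse_regex_literal("/-name$/i")
print(pattern.pattern, is_global)
```

Keeping `g` out of the compiled pattern mirrors how most regex engines treat it: it is an instruction to the caller (replace all matches or only the first), not a property of the pattern itself.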

Example

Consider the following JSON document

{
    "assets": [
        { "name": "first-name" },
        { "name": "last-name" },
        { "name": "other" }
    ]
}

To select all assets whose name matches a given pattern, the following hypothetical JMESPath expression could be used:

assets[? regex_match(name, /-name$/i) ]


This would evaluate to:

[ { "name": "first-name"}, {"name": "last-name"} ]
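
For comparison, the intended behaviour can be emulated in plain Python with the standard `re` module. This is only a sketch of the proposed semantics, not an implementation in any JMESPath library: it assumes `regex_match` succeeds when the pattern matches anywhere in the text, and that a non-string input simply yields false.

```python
import re

def regex_match(text, pattern, flags=0):
    # Assumed semantics: true when the pattern matches anywhere in the text.
    # Non-string input (for example a missing field) is treated as no match.
    if not isinstance(text, str):
        return False
    return re.search(pattern, text, flags) is not None

document = {
    "assets": [
        {"name": "first-name"},
        {"name": "last-name"},
        {"name": "other"},
    ]
}

# Rough equivalent of: assets[? regex_match(name, /-name$/i) ]
selected = [
    asset for asset in document["assets"]
    if regex_match(asset.get("name"), r"-name$", re.IGNORECASE)
]
print(selected)  # [{'name': 'first-name'}, {'name': 'last-name'}]
```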

springcomp commented 2 years ago

To support replacing substrings with regular expressions, we would need a new function:

string regex_replace(string $text, regexp $regex, string $replacement)

For instance, using the same example as the previous comment, the following expression:

assets[*].{raw: regex_replace(name, /\-name$/g, '')}

would evaluate to:

[
  { "raw": "first" },
  { "raw": "last" },
  { "raw": "other" }
]
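
The same result can be reproduced in Python with `re.sub`, which may help when reasoning about the `g` flag. This is a sketch under assumptions: the `replace_all` parameter is a hypothetical stand-in for `/g`, and replacing only the first match when the flag is absent is a guess at the intended behaviour.

```python
import re

def regex_replace(text, pattern, replacement, replace_all=True):
    # In re.sub, count=0 replaces every match; count=1 replaces only the first.
    # replace_all is a hypothetical stand-in for the /g flag.
    return re.sub(pattern, replacement, text, count=0 if replace_all else 1)

assets = [
    {"name": "first-name"},
    {"name": "last-name"},
    {"name": "other"},
]

# Rough equivalent of projecting {raw: regex_replace(name, /\-name$/g, '')} over assets
result = [{"raw": regex_replace(asset["name"], r"-name$", "")} for asset in assets]
print(result)  # [{'raw': 'first'}, {'raw': 'last'}, {'raw': 'other'}]
```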

Maybe something like:

springcomp commented 2 years ago

Finally, a function could return the named or numbered captures from a regex.

object regex_capture(string $text, regexp $regex)

Example:

The simplest usage could look like:

regex_capture('hello, world!', /[a-z]+/)

Which evaluates to a JSON object if the match is successful:

{ "0": "hello" }

Adding capturing groups would be supported like so:

regex_capture('hello, world!', /([a-z]+).*/)

Would return:

{
  "0": "hello, world!",
  "1": "hello"
}

Likewise, this expression:

regex_capture('hello, world!', /(?<first>[a-z]+).*?(?<second>[a-z]+).*$/)

Would evaluate to:

{
   "0": "hello, world!",
   "first": "hello",
   "second": "world"
}
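
To check that these three results are consistent with one another, here is a minimal Python sketch of the capture behaviour described above. It makes a few assumptions: the whole match is stored under "0", a named group is reported under its name instead of its number, and a failed match returns `None` (the proposal does not say what happens in that case). Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`, so the pattern is adapted accordingly.

```python
import re

def regex_capture(text, pattern):
    compiled = re.compile(pattern)
    match = compiled.search(text)
    if match is None:
        return None  # assumption: the failure case is not specified above

    # Map group numbers back to names so named groups replace their index.
    names = {index: name for name, index in compiled.groupindex.items()}

    captures = {"0": match.group(0)}
    for index in range(1, compiled.groups + 1):
        captures[names.get(index, str(index))] = match.group(index)
    return captures

print(regex_capture("hello, world!", r"[a-z]+"))
# {'0': 'hello'}
print(regex_capture("hello, world!", r"([a-z]+).*"))
# {'0': 'hello, world!', '1': 'hello'}
print(regex_capture("hello, world!", r"(?P<first>[a-z]+).*?(?P<second>[a-z]+).*$"))
# {'0': 'hello, world!', 'first': 'hello', 'second': 'world'}
```

Returning a single object keyed by strings keeps numbered and named captures in one structure, which is what makes the result easy to bind to a name and reuse in a later expression.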

On its own this is not very useful yet, but combined with the let function that will probably be added, we could write expressions like:

let( { re: regex_capture(foo, /^([^\-]+)?[a-z]+$/) }, &<expression-type> )

springcomp commented 2 years ago

jmespath/jmespath.py#23