marshallford commented 4 years ago

Expected Behavior

While writing a rule to prevent overlapping hostnames, I reached for regex.globs_match, hoping it could compare strings that may contain a wildcard in the leftmost component. Examples: a.com, *.a.b.c.com, and *. Is it possible for OPA to support custom parsing for this sort of use case? I'm writing a rule for Istio Gateway resources in k8s, but I would guess there are plenty of situations where determining whether two wildcard strings intersect would be handy.

Thanks.

should_return_true := regex.globs_match("*.foo.com", "bar.foo.com")

Actual Behavior

library/uniquegatewayhost/src.rego:15: eval_builtin_error: regex.globs_match: input:*.foo.com, pos:1, flag '*' must be preceded by a non-flag: the input provided is invalid

Steps to Reproduce the Problem

source:

package k8suniquegatewayhost

identical(obj, review) {
  obj.metadata.namespace == review.object.metadata.namespace
  obj.metadata.name == review.object.metadata.name
}

violation[{"msg": msg}] {
  input.review.kind.kind == "Gateway"
  input.review.kind.group == "networking.istio.io"
  host := input.review.object.spec.servers[_].hosts[_]
  port := input.review.object.spec.servers[_].port.number
  other := data.inventory.namespace[ns][othergroupversion]["Gateway"][name]
  re_match("^networking.istio.io/.+$", othergroupversion)
  # other.spec.servers[_].hosts[_] == host
  regex.globs_match(other.spec.servers[_].hosts[_], host) # <--- I wish this had flexible parsing
  other.spec.servers[_].port.number == port
  not identical(other, input.review)
  msg := sprintf("gateway host conflicts with an existing gateway <%v>", [host])
}

OPA version

❯ opa version
Version: 0.20.5
Build Commit: 64dd76e1
Build Timestamp: 2020-06-01T18:35:14Z
Build Hostname: 8f7822bb4c39

Additional Info

Error discovered while writing tests.

opa test -v **/*.rego

tsandall commented 4 years ago

@marshallford as you discovered, regex.globs_match tests whether two REs intersect, which is close but not quite what you want. The naming is a bit suboptimal.

We already have a glob.match function that implements actual glob matching (which is much simpler than full-blown RE matching). I could imagine adding a glob.intersects (name TBD) that implements what you described. To be consistent with the glob.match function, the glob.intersects function should take two sets of delimiters, something like this:

glob.intersects(str1, delims1, str2, delims2)
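For comparison, the existing glob.match builtin takes a pattern, a list of delimiters, and the string to test. The snippet below is a minimal sketch of how it behaves when only one side is a wildcard pattern; the package and rule names are made up for illustration, and the rules use the same pre-1.0 syntax (no `if` keyword) as the policy in this issue.

```rego
package example

# glob.match(pattern, delimiters, match): with "." as the delimiter,
# "*" matches exactly one dot-separated component.
single_label_matches {
  glob.match("*.foo.com", ["."], "bar.foo.com")
}

# "*" does not cross a delimiter, so a deeper subdomain does not match.
deeper_label_does_not_match {
  not glob.match("*.foo.com", ["."], "a.bar.foo.com")
}
```

This covers the pattern-versus-concrete-host case only. It does not answer whether two patterns overlap, which is what the proposed glob.intersects would be for.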

Some care would need to be taken to define how empty versus non-empty cases are handled, and to keep performance reasonable. I don't think the glob patterns are likely to be large or contain many wildcards, so I would make the implementation as simple as possible. Along these lines, character globbing and super globs (e.g., "foo.**.qux" matches "foo.bar.baz.qux") could be added later.
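Until such a builtin exists, the simplest version of the idea can be approximated in Rego itself. The following is only a sketch under stated assumptions, not the proposed builtin: globs_intersect and component_overlaps are hypothetical helper names, a single delimiter is used for both patterns, and "*" is only recognized as a whole component. Character globs, super globs, and a bare "*" meaning "any host" (as Istio treats it) are not handled.

```rego
package example

# Hypothetical helper: true if two delimiter-separated patterns could
# both match some common string. Patterns must have the same number of
# components, and each pair of components must overlap.
globs_intersect(a, b, delim) {
  pa := split(a, delim)
  pb := split(b, delim)
  count(pa) == count(pb)

  # Collect the positions whose components cannot overlap.
  mismatches := [i | x := pa[i]; not component_overlaps(x, pb[i])]
  count(mismatches) == 0
}

# Two components overlap if they are equal or either one is a wildcard.
component_overlaps(x, y) { x == y }
component_overlaps(x, y) { x == "*" }
component_overlaps(x, y) { y == "*" }
```

Under these assumptions, globs_intersect("*.foo.com", "bar.*.com", ".") holds, while globs_intersect("*.foo.com", "a.bar.foo.com", ".") is undefined because the component counts differ.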

tsandall commented 3 years ago

Whoops, I think that commit referenced the wrong issue number...

stale[bot] commented 2 years ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.