puppetlabs / puppetdb

Centralized Puppet Storage
http://docs.puppetlabs.com/puppetdb
Apache License 2.0
299 stars 225 forks source link

Negated regex on dotted path not the negation of a regex match #3924

Closed austb closed 6 months ago

austb commented 9 months ago

Consider 3 nodes with a single resource, their resource params for that resource might look like

# node 1
{"simple":"param"
 "nested" {"foo": "bar"}}

# node 2
{"simple":"foo"
 "nested" {"foo": "bar"}}

# node 3
{"nested" {"foo": "bar"}}

A query using regex on the parameter value would look like

# PQL
resources[] { parameters.simple ~ "par.m"}

# AST
["from", "resources", ["~" ["parameter", "simple"] "par.m"]]

and it would return the resource for node 1 which is the only matching resource parameter.

Confusingly, the negation of this query does not return node 2 & node 3. It will only return node 2

# PQL
resources[] { parameters.simple !~ "par.m"}

# AST
["from", "resources", ["not", ["~" ["parameter", "simple"] "par.m"]]]

This is because a regex match on a jsonb value expands into 2 clauses in a HoneySQL raw element (a string of SQL), a check that the regex matches and a check for key existence. When wrapped in a HoneySQL :not it does not wrap the internal clause in parentheses, and the check is that the regex does not match, but the key does exist.

NOT (parameters ->> 'simple') ~ ('"par.m"'::jsonb#>>'{}') AND parameters ? 'simple'

If the internal clauses were honeysql rather than raw SQL it would quote them, and the behavior does not line up to the behavior of = and != on jsonb fields, which are proper boolean negations of each other.

Internal issue: https://perforce.atlassian.net/browse/PDB-5713