Open dhduvall opened 1 year ago
I don't quite understand your question and what exactly do you mean by ambiguity, but dyff applies regexps to a go-patch path string. At least i think so
https://github.com/homeport/dyff/blob/86654f8e002a69fdabc87ac954f7d0fa11e6b84b/pkg/dyff/reports.go#L94
(Path.String() returns the string with go-patch style separators)
So in your example the correct regexp would be ^/metadata/.+/generation$
The exact behavior should probably be documented somewhere. Unless I'm missing something you literally can't use this feature correctly without perusing the source code
So in your example the correct regexp would be ^/metadata/.+/generation$
Yep, that'll work; thanks for helping get me to that point. It should be good enough for my use, since I'm not expecting to run into any ambiguous paths.
The exact behavior should probably be documented somewhere. Unless I'm missing something you literally can't use this feature correctly without perusing the source code
Agreed.
The ambiguity I was talking about is when you have a path which is collapsed into a string with some separator (dots, slashes, it doesn't matter; I spend more time with jq
and yq
, so I'm used to dots as separators, but also dyff
output uses dots by default, so I'd expect that as input by default, too) and a single key that is the same as that path.
Here's a concrete example:
metadata:
annotations:
prometheus.io/scrape: true
prometheus.io:
scrape: false
(Ignoring the fact that this isn't a valid Kubernetes manifest.) Does metadata/annotations/prometheus.io/scrape
, reference the true
value or the false
value? It looks like it references both (pardon the zsh
-ism):
dyff between --exclude-regexp 'annotations/prometheus.io/scrape' \
=(jq -n '.annotations["prometheus.io/scrape"] |= true | .annotations["prometheus.io"].scrape |= true') \
=(jq -n '.annotations["prometheus.io/scrape"] |= false | .annotations["prometheus.io"].scrape |= false')
returns no differences. I think I would expect the expression to refer to the false
value, but then there needs to be an escaping mechanism for the slash in order to get to the true
value.
I had trouble figuring out what "go-patch" means, though. Google got me to RFC6902 (JSON Patch), which relies on RFC6901 (JSON Path), which uses ~1
to represent /
, but I tried annotations/prometheus.io~1scrape
, and dyff
reported both differences.
It looks like dyff
uses ytbx
to do this processing, and that has no reference to ~
in its source, and doesn't even pretend to implement RFC6901 or RFC6902. This is a touch surprising, since the README explicitly references https://github.com/cppforlife/go-patch (aha, there's "go-patch"), but then never actually uses it. go-patch
does implement the RFC6901 escaping mechanism, though it's careful to tell us that it's only "roughly based on" RFC6902. The README does say "output", and not "input", but since there's no mention of input format, maybe someone who read the README would assume it was the same as output?
This bug has morphed a bit. I had originally tried to do
--exclude-regexp metadata\.annotations\.checksum/.*
, mistakenly thinking that the dots were part of a string that needed to be escaped in order not to match too many strings, and not the separators between path segments.But that raises the question of how those two syntaxes interact. For instance, if I want to exclude any item whose path starts with
metadata
and ends withgeneration
, how do I express that? It seems thatmetadata.*generation
works, but possibly by accident? At least, I wouldn't know how to filterdistinctly from
or
There are a similar set of ambiguous patterns using
/
as the separator, too.In tracking this down, I noticed that the tests are also missing examples which contain regular expression metacharacters (not that they would have caught this particular problem).
Ambiguity is probably fine 99% of the time, but even having that much documented would be really handy. Unfortunately, that seems like it's getting into manpage territory ....
I'll note with a crazy look in my eye that
helm
uses (AFAIK completely undocumented) vertical tab (\v
) as a separator internally, so if you have, say, adatadog
repo configured, which hasdatadog
,datadog-crds
, anddatadog-operator
charts, you can search for\vdatadog/datadog\v
to find just the first. Perhaps it might make sense to treat\v
as a segment separator here, too, assuming it's not possible to use that character in a YAML key.