databricks / sjsonnet

Apache License 2.0

[ES-449270] Fix duplicate key error in object comprehension #156

Closed: ckolb-db closed this 1 year ago

ckolb-db commented 2 years ago

Fixes #121

Please advise on checking the size twice versus calling `contains` once per key; I haven't benchmarked this.

This might also break some existing files that rely on the old behavior without realizing it's out of spec, either because duplicate list entries were silently collapsed, or in more complex setups where a later entry deliberately replaces an earlier one and only the last entry matters. I'm not sure whether we're OK with that or what other options we have, but breaking those files probably beats allowing duplicate entries going forward.
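To illustrate the two behaviors, here's a minimal Scala sketch (not sjsonnet's actual evaluator code) of what happens to keys produced by a comprehension such as `{ [k]: true for k in ["a", "b", "a"] }`, which per the spec should be an error:

```scala
import scala.collection.mutable

// Keys as a comprehension might generate them, with a duplicate.
val keys = Seq("a", "b", "a")

// Old behavior: the duplicate silently wins, only the last mention is kept.
val lastWins = mutable.LinkedHashMap[String, Boolean]()
for (k <- keys) lastWins(k) = true
// lastWins == LinkedHashMap("a" -> true, "b" -> true)

// Spec-conforming behavior: reject the duplicate key outright.
val checked = mutable.LinkedHashMap[String, Boolean]()
for (k <- keys) {
  if (checked.contains(k))
    throw new Exception(s"Duplicate key $k in object comprehension")
  checked(k) = true
}
```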

ckolb-db commented 2 years ago

It's not an object as input, but rather a list of arbitrary values/objects, from which each object entry is in turn constructed arbitrarily. The first uniqueness check in the system is the one that was missing here.

So far the test would fail in the sense that no error was thrown; instead, evaluation returned an object containing only the last mention of the key.

This surfaced again due to an issue in our teams.jsonnet file, where I think a team name was duplicated in two unrelated entries.
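For illustration, here's a hedged Scala sketch (invented names, roughly the teams.jsonnet situation) of how two unrelated list entries can collide on a computed key, which is why the check can only happen at insertion time:

```scala
// The comprehension input is a list of arbitrary values; each field name is
// computed from an element, so uniqueness can't be assumed up front.
case class Team(name: String, lead: String)

val teams = Seq(
  Team("platform", "alice"),
  Team("tooling", "bob"),
  Team("platform", "carol") // unrelated entry that happens to reuse a name
)

// Roughly what a comprehension like { [t.name]: t.lead for t in teams } produces:
val entries = teams.map(t => t.name -> t.lead)
// The duplicate "platform" key only becomes visible here, when the entries
// are inserted into the result object.
```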

I'm checking the map size rather than calling `map.contains` before inserting as an optimization, since `contains` is only amortized O(1) and includes the string comparison.
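A minimal sketch of that insertion pattern, assuming a `mutable.LinkedHashMap` accumulator (hypothetical helper name, not the PR's actual code):

```scala
import scala.collection.mutable

// Insert unconditionally, then compare sizes: if the map did not grow,
// the key was already present and the comprehension has a duplicate.
def putChecked[V](acc: mutable.LinkedHashMap[String, V], k: String, v: V): Unit = {
  val sizeBefore = acc.size
  acc(k) = v
  if (acc.size == sizeBefore)
    throw new Exception(s"Duplicate key $k in evaluated object comprehension")
}
```

The trade-off is two cheap `size` reads per insertion versus a `contains` lookup that has to hash and compare the key string, duplicating work the insertion itself already does.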