Open samer1977 opened 1 month ago
I asked this question above but got no answer, and I'm not sure why. If you come from a Jolt background you have probably used this function; it doesn't work perfectly, but it helps sometimes and it's good to have as an option. I started learning JSLT a couple of days ago and it caught my interest. I can see cases where JSLT can be a better option than Jolt and might simplify things. I'm not sure about performance, though: I made a comparison in NiFi, running both specs on the same input to produce the same output, and Jolt always had a slight edge. Regarding the question above, here is what I was able to come up with, and I hope it is correct:
def squashNullsRecursive(obj)
  let simple = { for($obj) .key: .value if (.value!=null and not(is-object(.value))) }
  let complex = { for($obj) .key: squashNullsRecursive(.value) if (is-object(.value)) }
  let array = { for($obj) .key: [for(.value) . if (not(is-object(.)) and .!=null)] +
                                [for(.value) squashNullsRecursive(.) if (is-object(.) or is-array(.))]
                if (is-array(.value))
              }
  $array + $complex + $simple
Input:
{
  "x": "x1",
  "y": "y2",
  "z": {
    "z1": "z11",
    "z2": null,
    "z3": [
      1,
      {
        "zzz": "skid",
        "zzz1": null
      },
      2
    ]
  }
}
squashNullsRecursive(.)
I didn't answer because I don't have time to write this function from scratch.
You're on the right track, but at the top level of your function I'd use if and test the input with is-array and is-object to separate the cases: object, array, something else. You can write it much more simply and cleanly that way.
Can you please give an example of the simplification? I'm not sure what you mean by if and test. Thanks
You know what an if statement is, right? What's inside the () is the test.
Sorry, I still don't get it. I thought I was using an if statement with the for loop, and I thought this was the clean way per the documentation. I know what an if statement is. I might be slow and not as smart as you are, but I know I can write a better flatten-object than yours ;)
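A minimal sketch of the structure suggested above: one top-level if that tests the input with is-object and is-array and handles the three cases (object, array, anything else) separately. The names dropNulls and node are placeholders, and dropping null elements from arrays is an assumed policy, not something settled in this thread.

```jslt
// Sketch only: recursive null removal with a three-way case split.
def dropNulls(node)
  if (is-object($node))
    // object: skip null-valued keys, recurse into the remaining values
    {for ($node) .key : dropNulls(.value) if (.value != null)}
  else if (is-array($node))
    // array: skip null elements (assumed policy), recurse into the rest;
    // element order is preserved
    [for ($node) dropNulls(.) if (. != null)]
  else
    // string, number or boolean: returned unchanged
    $node

dropNulls(.)
```

On the sample input above this should remove "z2" and "zzz1" and leave everything else, including the order of the elements in "z3", as it was.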
This one works
def squashNulls (obj)
from-json (replace (replace (replace (replace (string ($obj), "\\\"[^\"]+\\\":null", ""), ",,", ","), ",}", "}"), ",]", "]"))
squashNulls (.)
It could be reduced to a simpler replace, if that function supported positional patterns (references to capture groups in the replacement string).
The last two replacement patterns can be collapsed into "," followed by either } or ], i.e. to
replace (s, ",([}\\]])", "$1") or
replace (s, ",([}\\]])", "\1") or
replace (s, ",([}\\]])", "&1")
or whichever mechanism there is.
What is used underneath replace, is it plain Java?
It works on RegexPlanet, see https://www.regexplanet.com/share/index.html?share=yyyyf6v7w2d Click on 'Java'.
I am aware this is not what @samer1977 asked for.
I checked, see https://github.com/schibsted/jslt/blob/master/core/src/main/java/com/schibsted/spt/data/jslt/impl/BuiltinFunctions.java#L931
Java regex Patterns are used internally, but the replace function does not support positional patterns in the replacement string. I'll open an issue for that.
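Until group references are available, one possible workaround is a lookahead, which java.util.regex supports: match only the comma and require, without consuming it, that a } or ] follows. The replacement can then be the empty string. This is a sketch under the assumption that replace passes the pattern to java.util.regex unchanged; the function name squashNullsLookahead is made up for illustration, and the function inherits the limitations of the four-call version above.

```jslt
// Sketch only: the ",}" and ",]" replacements merged into one lookahead pattern.
def squashNullsLookahead(obj)
  from-json(
    replace(
      replace(
        // remove every "key":null pair from the serialized JSON
        replace(string($obj), "\\\"[^\"]+\\\":null", ""),
        // collapse the doubled commas left behind
        ",,", ","),
      // regex after string unescaping: ,(?=[}\]])
      // drops a comma that is directly followed by } or ]
      ",(?=[}\\]])", ""))

squashNullsLookahead(.)
```

This brings the count from four replace calls down to three without needing "$1" or a similar mechanism.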
String replace? That looks scary from a performance perspective, but I guess I need to do some testing to find out.
My original algorithm did not support an initial property of an object being null. It only worked if the null property was in the middle or the end.
Regarding the performance concern about string replace: I also got rid of two nested replace calls, going from 4 calls to 2.
Better performance, right?
This one does now support initial nulls in objects:
def squashNulls (obj)
  from-json (
    replace (
      replace (
        string ($obj),
        ",?\\\"[^\"]+\\\":null",
        ""
      ),
      "\\{,",
      "{"
    )
  )
squashNulls (.)
Tested on this input:
{
  "w": null,
  "x": "x1",
  "y": "y2",
  "z": {
    "z1": "z11",
    "z2": null,
    "z3": [
      1,
      {
        "zzz": "skid",
        "zzz1": null
      },
      2
    ]
  }
}
@samer1977 By the way, what is the policy on null values in arrays?
[ null, 2, 7, { "a": 1, "b": 2 }]
Should that null be dropped? The size of the array would change.
My algorithm only drops attributes that are null.
Dropping them does not change the object, i.e. I consider the objects { "a": null, "b": 7 } and { "b": 7 } structurally equivalent.
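In a recursive formulation, the array case is the only place where the two policies would differ. A sketch of the attribute-only policy described above, written without string replacement: null-valued attributes are dropped, while arrays keep every element, null or not, so their size does not change. The name dropNullAttributes is hypothetical, and the sketch assumes that a JSLT array comprehension keeps null elements in its result.

```jslt
// Sketch only: drop null attributes, leave array lengths untouched.
def dropNullAttributes(node)
  if (is-object($node))
    // object: skip null-valued keys, recurse into the remaining values
    {for ($node) .key : dropNullAttributes(.value) if (.value != null)}
  else if (is-array($node))
    // array: no filter here, so null elements are assumed to stay in place
    [for ($node) dropNullAttributes(.)]
  else
    // anything else, including null itself, is returned unchanged
    $node

dropNullAttributes(.)
```

Under that assumption, [ null, 2, 7, { "a": 1, "b": 2 }] should come back with all four elements. Adding if (. != null) to the array comprehension would switch to the policy that drops null elements and changes the array size.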
Hi,
I'm coming from a Jolt background and now find myself learning JSLT because Apache NiFi introduced a new JSON transformation using JSLT, and I'm interested in learning it to see if I can get the best of both worlds. It's a totally different mindset, but I can see how close it is to XQuery for XML. I'm surprised that no one has asked this, because getting rid of all null values is a common problem in JSON transformation. Jolt has a function called recursivelySquashNulls that removes all nulls in nested JSON recursively, but I could not find anything similar in JSLT. Can someone please write me the spec for it in JSLT? I spent the whole day trying to figure it out, but it's not that easy, especially when your nested object is either a complex object, an array of complex objects, or even an array of simple types. I would like to see if JSLT can address all scenarios in a spec that is not too convoluted.
Thanks