schibsted / jslt

JSON query and transformation language
Apache License 2.0
638 stars 120 forks source link

For loop expression index / context? #127

Closed vvidovic closed 4 years ago

vvidovic commented 4 years ago

Hi, I was wondering if it is possible to get the index of the current item when doing some transformation inside a "for" expression?

For example for the input:

{ "rows" : [
{"name":"test"},
{"name":"test 2"}
]}

I would like to add row number entries like:

{ "rows" : [
{"name" : "test", "rowNo" : 1},
{"name" : "test 2", "rowNo" : 2}
]}

I was able to achieve this using functions like this (functions are borrowed from issue #122):

let rows = .rows

def index-of(array, value)
  find-index-of($array, $value, 0)

def find-index-of(array, value, index)
  if ($index >= size($array))
    -1
  else if ($array[$index] == $value)
    $index
  else
    find-index-of($array, $value, $index + 1)

{ "rows" : [ for (.rows) {
let rn = index-of($rows, .) + 1
"name":.name,
"rowNo": $rn
}]}

However, I wonder if there is a better way or if some variable (automatically set to proper value) could be added to for expression automatically to achieve same goal with simpler JSTL code, for example:

{ "rows" : [ for (.rows) {
"name":.name,
"rowNo": $forIdx + 1
}]}

Thanks :)

larsga commented 4 years ago

I kind of expected this feature request to show up at some point. I guess there are three main ways this could be done.

Range function as in Python

{ "rows" : [ for (range(size(.rows))) {
    "name" : .rows[.].name,
    "rowNo" : . + 1
}]}

Magic variable

That's basically what you already showed.

Magic function

{ "rows" : [ for (.rows) {
    "name" : .name,
    "rowNo" : current-for-index() + 1
}]}

Opinions welcome.

vvidovic commented 4 years ago

All options seem fine to me.

For range function you would probably have to add a variable set to .rows because the context would be changed inside a for expression. "." would be set to the current index but I guess ".rows" wouldn't then work inside a for expression. Something like:

let r = .rows
{ "rows" : [ for (range(size(.rows))) {
    "name" : $r[.].name,
    "rowNo" : . + 1
}]}

This solution maybe could be the most simple one on implementation side (no magic variable nor magic function) and range() function could be usefull for other cases so I would probably go that way.

larsga commented 4 years ago

Good point about the context being changed to the current array index with range() so that the array becomes inaccessible unless you declare a variable. So that solution then becomes

{ "rows" : [ 
    let rows = .rows
    for (range(size(.rows))) {
    "name" : $rows[.].name,
    "rowNo" : . + 1
}]}

That makes range() not be a very nice solution, IMHO.

And I don't really like the magic variable.

So ... I'm currently in favour of current-for-index().

An issue with all of these is that if you have nested loops the current index in the outer loops become invisible inside the innermost loops. However, I think forcing the developers to declare a variable for the outer indexes is nicer than making the function more complicated so that it can "see" indexes in containing for loops.

alexanderkjeldaas commented 4 years ago

In haskell you'd use a zipWithIndex function, so

{ "rows" : [ 
    let rows = .rows
    for (range(size(.rows))) {
    "name" : $rows[.].name,
    "rowNo" : . + 1
}]}

would be

{ "rows": zipWithIndex(.rows) }

except that it might return a 2-element array or more generic field names.

larsga commented 4 years ago

Yeah, I thought aboout this, too. It seemed a little costly to construct these JSON objects for the traversal, but maybe that's the wrong way to think about it. Let's say zipWithIndex outputs an array of:

[[0, <first array element>],
 [1, <second array element>],
 ...
]

Then the transform would be

{ "rows" : [ for (zip-with-index(.rows)) {
    "name" : .[1].name,
    "rowNo" : .[0] + 1
}]}

It works but doesn't look great. Perhaps better to have an object:

[{"index" : 0, "value" : <first array element>},
 {"index" : 1, "value" : <second array element>},
 ...
]

Then it becomes

{ "rows" : [ for (zip-with-index(.rows)) {
    "name" : .value.name,
    "rowNo" : .index + 1
}]}

Much more readable, for sure. More intuitive, too.

vvidovic commented 4 years ago

The zip-with-index solution looks nice to me too.

larsga commented 4 years ago

Interestingly, SAP's example integration uses zip and range to do exactly this: https://github.com/SAP-samples/cloud-workflow-samples/blob/master/cf-outlook-integration-sample/src/main/resources/com/sap/bpm/wfs/forms/adaptivecards/transform.jslt

Sounds like another vote in favour of either zip+range or zip-with-index

larsga commented 4 years ago

Still thinking about this one. I thought about adding range() instead, but it takes us back to the cryptic .[0] and .[1].

It is possible to do this:

{ "rows" : [ for ( { for (zip(range(size(.rows)), .rows)) {"value" : .[1], "index" : .[0] }) {
    "name" : .value.name,
    "rowNo" : .index + 1
}]}

but nobody's going to like that, I fear.

If we instead supported destructuring assignment we could do:

{ "rows" : [for (zip(range(size(.rows)), .rows)) {
    let [index, value] = .

    "name" : $value.name,
    "rowNo" : $index + 1
}]}

That's not bad ...

And zip-with-index is a very specific function. Although, admittedly, it's also a very common usecase.

🤔

larsga commented 4 years ago

zip-with-index was implemented in commit acb6927, fixing this issue.