jimhigson / oboe.js

A streaming approach to JSON. Oboe.js speeds up web applications by providing parsed objects before the response completes.
http://jimhigson.github.io/oboe.js-website/index.html

Order of node execution unclear #202

Open n4zukker opened 4 years ago

n4zukker commented 4 years ago
  1. The oboe documentation should specify the order in which matching node callbacks are executed. I think that the callbacks are ordered by path, following JavaScript object property order (https://stackoverflow.com/questions/5525795/does-javascript-guarantee-object-property-order), and then executed in reverse. This is counter-intuitive to JavaScript programmers who are familiar with Events (https://nodejs.org/api/events.html#events_emitter_emit_eventname_args), where listeners are called in the order in which they were registered. So it should be made explicit.

  2. Order is critical when there are multiple matching paths registered because:

    If the callback returns any value, the returned value will be used to replace the parsed JSON node.

But only the value returned from the last callback is used as the replacement. Values returned from previous callbacks are discarded. I wish that the previous callback's replacement value were passed as an argument to the next callback, and that we had better control of the ordering.
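One way to approximate that behaviour today is to register a single callback that runs several replacers in a known order and feeds each one the previous result. The sketch below is only an illustration: `chain` is a hypothetical helper, not part of oboe, and it assumes node callbacks are called with `(node, path, ancestors)` and that returning `undefined` means "leave the node as it is".

```javascript
// Hypothetical helper, not part of oboe: combine several replacer
// callbacks into one so that each sees the previous one's result.
function chain(...callbacks) {
  return function (node, path, ancestors) {
    return callbacks.reduce((current, callback) => {
      const result = callback(current, path, ancestors)
      // Assumption: a callback that returns nothing leaves the value unchanged.
      return result === undefined ? current : result
    }, node)
  }
}

// Example replacers (illustrative only).
const stripSpaces = (node) => (typeof node === 'string' ? node.trim() : node)
const upperCase = (node) => (typeof node === 'string' ? node.toUpperCase() : node)
const logOnly = () => {} // returns undefined, so it replaces nothing

const combined = chain(stripSpaces, logOnly, upperCase)

// Intended registration: oboe(source).node('*', combined)
```

Because only one callback is registered per pattern, the order of the replacers is whatever order they are passed to `chain`, independent of how oboe orders its own listeners.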

I would like to use oboe to redact passwords from JSON and also to trim any strings that are longer than, say, 100 characters. I tried:

   oboe.node( { 'password': () => 'XXX' } )
   oboe.node( { '*': trimString } )

where trimString is a function that returns the first few characters when given a string, or the object passed in otherwise.

But '*' is always executed last, so any password redactions are discarded.

  3. This issue could be worked around if we could match on node type (string, object, array). The original JSONPath (https://goessner.net/articles/JsonPath/) allows for ?(typeof @ === "string"), but that is not supported in oboe's JSONPath.

  4. Or if we could short-circuit the callback execution, that would also be a work-around, so that after a match on password, oboe would not proceed to match the star.

  5. It would also be nice to pipeline two instances of oboe together. I think this can only be done at the moment by serializing and re-parsing in the middle. But then I might use one instance to redact passwords and then pipe the result to the string-trimming instance.
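Until something like the options above exists, the type check and the short-circuit can both be emulated inside a single '*' callback. This is a minimal sketch, assuming the callback receives the path as an array of keys in its second argument; `redactOrTrim` and `MAX_LENGTH` are hypothetical names chosen for illustration.

```javascript
// Sketch only: redactOrTrim and MAX_LENGTH are hypothetical names.
const MAX_LENGTH = 100

function redactOrTrim(node, path) {
  // The last path element is the key under which this node was found.
  const key = path[path.length - 1]

  // "Short-circuit": once the key is a password, skip every other rule.
  if (key === 'password') {
    return 'XXX'
  }

  // "Match on type": only long strings are trimmed.
  if (typeof node === 'string' && node.length > MAX_LENGTH) {
    return node.slice(0, MAX_LENGTH)
  }

  // Everything else is returned unchanged.
  return node
}

// Intended registration, assuming '*' matches every node:
//   oboe(source).node('*', redactOrTrim)
```

Since there is only one registered callback, the question of which replacement wins no longer depends on oboe's execution order.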


Here's the test case I'm using to explore the path execution order:

function fnCallback(n) {
  return function (node) {
    if (Array.isArray(node)) {
      node.push(n)
    }
    return node
  }
}

oboe.node({'*': fnCallback('1. match star')})
oboe.node({'x': fnCallback('2. match x')})
oboe.node({'x': fnCallback('3. match x')})
oboe.node({'x': fnCallback('4. match x')})
oboe.node({'*': fnCallback('5. match star')})
oboe.node({'!*': fnCallback('6. match bang star')})
oboe.node({'!x': fnCallback('7. match bang x')})
oboe.node({'!.x': fnCallback('8. match bang dot x')})

I run that against {x:[]} and the output is:

{ x:
   [ '8. match bang dot x',
     '7. match bang x',
     '6. match bang star',
     '4. match x',
     '3. match x',
     '2. match x',
     '5. match star',
     '1. match star' ] }
n4zukker commented 4 years ago

For item 5,

    It would also be nice to pipeline two instances of oboe together.

this is probably not doable, since a node callback could completely transform and void everything that happened before. For instance, in the example below, if console.log were a return instead, the array of noun and verb pairs would be transformed into an array of strings.


[
{"verb":"VISIT", "noun":"SHOPS"},
{"verb":"FIND", "noun":"WINE"},
{"verb":"MAKE", "noun":"PIZZA"}
]

oboe('words.json')
   .node('verb', toLower)
   .node('noun', toLower)
   .node('!.*', function(pair){
      console.log('Please', pair.verb, 'me some', pair.noun);
   });


You wouldn't know that until the `!.*` callback had finished. So you really have to wait until all the root node events have completed before you know what the output begins with.

Plus, oboe transforms text to JavaScript values, not text to JSON. A callback could return an object or function that has a wacky `toJSON()`, so what looks like an object or array in JavaScript really turns out to be a string once it is serialized.
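The `toJSON()` point can be seen with plain `JSON.stringify`, without oboe at all. In this small sketch the object and its field values are made up for illustration; it stands in for something a node callback might return.

```javascript
// Hypothetical value that a node callback might return.
const looksLikeAnObject = {
  verb: 'visit',
  noun: 'shops',
  // JSON.stringify calls toJSON() and serializes whatever it returns.
  toJSON() {
    return `Please ${this.verb} me some ${this.noun}`
  }
}

const serialized = JSON.stringify([looksLikeAnObject])
// serialized is the text ["Please visit me some shops"]:
// an array containing one string, not an array containing one object.
```

So a second parser reading the serialized text would see a different structure from the one the first instance was holding in memory.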