sanack / node-jq

Node.js wrapper for jq
MIT License
276 stars 51 forks source link

Empty result is decoded as an empty string even when `output` is set to `json` #689

Open grenik opened 3 months ago

grenik commented 3 months ago

If running a jq filter produces an empty result, the result is an empty string ('') even if the output option is set to json.

It may be a bit confusing because an empty string is also a different valid result (see examples below).

// this filter yields no results
jq.run('select(.foo == "bar")', {foo: ""}, {input: "json", output: "json"}).then(result => {
  // result is an empty string ''
});

// this filter yields an empty string
jq.run('.foo', {foo: ""}, {input: "json", output: "json"}).then(result => {
  // result is an empty string ''
});

I believe that undefined would be a better option because it is used neither in JSON nor in JQ.

davesnx commented 3 months ago

Exactly, the option output:"json" is not guaranteed to be always JSON, it just tries to parse as JSON and in case of failing it keeps it as the stdout (which is a string). There's a few cases of queries where empty string is actually the result of the run.

In case of changing this behaviour to return undefined (only when the parsing fails) it can be miss-leading to assume that undefined could be returned as a operation that expects a JSON as output. Which arguably, is the same situation as the current approach, where we do the minimum to ensure the output is JSON.

I would like to change it, and either reject the promise or radically return an empty object. Any extra thoughts @mackermans?

Thanks for opening the issue @grenik, I believe isn't the first confusing case.

mackermans commented 3 months ago

Thanks for raising this, @grenik.

@davesnx I opened the PR thinking that undefined made the most sense. But now I understand your point 😄 We've made it pretty difficult for ourselves by returning parsed JSON, effectively making it a JS data type, and to make matters worse, falling back to raw JSON in case of failure.

I don't think the alternatives are better, though:

Could we consider a new output mode (JS)? That means we would need to accept that the current case described by @grenik will continue to exist.

davesnx commented 3 months ago

Indeed, the empty string is an expected result from jq saying that there's no filters/matches with your query. Either by outputting json or strings.

grenik commented 3 months ago

Many thanks for quick answers and even a solution!

Totally agree with the following points:

  • Rejecting the promise indicates a parsing failure
  • Returning an empty object breaks with JQ's behavior

 

Option 1: Solution with undefined

As for me, returning undefined is totally fine as I personally treat null and undefined as "special values" for any variable and always expect them for my own safety unless the opposite is specified explicitly (e.g. in JSDoc).

Looking into JS docs I liked the following clause:

A function returns undefined if a value was not returned.

Which also suits to the case when JQ doesn't return any result. In other words, if you return a Prompt fulfilled with undefined, it means there was neither error nor result ;-) But I am a JS newbie so I may be fundamentally wrong in my understanding.

 

Option 2: stream output?

Could we consider a new output mode (JS)?

Another fundamentally correct option could be stream

Pros:

Con:

As a simple example, you will be able to support the following case:

% echo '[0,true,"foo",null,{}]' | jq '.[]'
0
true
"foo"
null
{}

Below is a code snippet:

```js const { parser } = require('stream-json'); const { streamValues } = require('stream-json/streamers/StreamValues'); const { pipeline } = require('node:stream/promises'); const { Readable, Transform } = require("stream") const result = []; const through = (transformFunction) => new Transform({ objectMode: true, transform(chunk, _encoding, done) { done(null, transformFunction(chunk)); } }); await pipeline(Readable.from([` 0 true { "foo": "bar" } null `]), // three lines below do everything parser({jsonStreaming: true}), streamValues(), through(json => ({value: json.value})), // wrapper for null values, you can already return this stream as the result // this is just to collect results in an array for illustrative purposes through(wrapper => result.push(wrapper)) ) console.log(JSON.stringify(result, null, " ")); ```

 

Option 3: array output?

Wrap JQ output stream in an array to avoid issues with null and JS streams:

empty result => []
[] => [[]]
"foo",0,true => ["foo",0,true]
["foo",0,true] => [["foo",0,true]]

You can use the same code snippet above, but replace the last two transform streams with:

-    through(json => ({value: json.value})), // wrapper for null values, you can already return this stream as the result
-
-    // this is just to collect results in an array for illustrative purposes
-    through(wrapper => result.push(wrapper))
+    through(wrapper => result.push(wrapper.value))

     

Below are my super minor comments, please feel free to ignore:

Not so sure about this:

it just tries to parse as JSON and in case of failing it keeps it as the stdout (which is a string)

because I was able to get other types from you as well ;-)

> typeof await jq.run('.[0]',  [false],  {input: "json",   output: "json"})
> typeof await jq.run('.[0]', "[false]", {input: "string", output: "json"})
'boolean'

> typeof await jq.run('.[0]',    [1],    {input: "json",   output: "json"})
> typeof await jq.run('.[0]',   "[1]",   {input: "string", output: "json"})
'number'

 

Indeed, the empty string is an expected result from jq saying that there's no filters

But I totally forgot that a real empty string result will be surrounded by ", while an empty result won't, so I removed my comment from the issue. This also makes the solution quite simple


% echo '""' | jq '.'
""

% echo '""' | jq 'select(. != "")'
% echo '' | jq '.'
# literally nothing