Open meganwalker-ibm opened 5 years ago
How are you serializing to JSON? The following JavaScript code:

```javascript
const arr = [1, 2, 3];
arr.sequence = true;
JSON.stringify(arr);
```

produces

```
"[1,2,3]"
```
I believe the concern can be represented with the following. If you have:

```javascript
const jsonata = require('jsonata');

let input = {
  "units": [
    { "name": "bob", "price": 20 },
    { "name": "jessica", "price": 500 },
    { "name": "may", "price": 40 }
  ]
};

console.log(jsonata('units[price > 20]').evaluate(input));
```
then the output to the console (also seen in a JavaScript debugger) is

```
[
  { name: 'jessica', price: 500 },
  { name: 'may', price: 40 },
  sequence: true
]
```
where `sequence: true` wasn't present in 1.6.3 and below. This does seem to go away with `JSON.stringify()`, but I guess the discussion is around whether `evaluate()` should be returning this property in the object.

I guess the wider question is: should `evaluate()` be returning valid JSON? I would have thought so, which suggests this should be cleaned up.
The `sequence` property was always there, but it used to be created using `Object.defineProperty` with `enumerable` set to false. However, this had a significant performance hit, and rewriting it this way gave a 3x performance improvement on the expression `$sum(Account.Order.Product.(Price*Quantity))`. If you can come up with an alternative solution that doesn't degrade performance, I'd welcome it.
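For readers unfamiliar with the earlier behaviour, the pre-1.6.4 approach described above can be sketched as follows. The helper name here is hypothetical, chosen only for illustration: the flag exists on the array but is non-enumerable, so it never shows up in `Object.keys()`, deep-equality checks, or `JSON.stringify()` output.

```javascript
// Sketch of the non-enumerable approach (hypothetical helper name)
function createSequence(...items) {
  const seq = [...items];
  Object.defineProperty(seq, 'sequence', {
    value: true,
    enumerable: false, // hidden from key enumeration and deep-equality checks
    writable: true,
    configurable: true
  });
  return seq;
}

const seq = createSequence(1, 2, 3);
console.log(seq.sequence);     // true
console.log(Object.keys(seq)); // [ '0', '1', '2' ]
```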
I was aware that the `sequence` property was there internally before, while processing is taking place, and hence didn't feel concerned by the change in the PR. I didn't realise that it would have an impact on what was returned from an `evaluate()` call, which is an external change. Having `evaluate()` return a `sequence` property is new behaviour in 1.6.4.
FYI, in my (TypeScript) implementation, I always "wrapped" the values in a wrapper where the sequence and a few other properties were properties of the wrapper, not of the actual value. This allowed me to include any such properties pretty easily. I would unwrap such values before calling functions, for example.
Now, that approach may have had a significant impact on performance, I have no idea. I didn't do any performance benchmarking.
Possibly related: I'm seeing `keepSingleton: true` instead of `sequence: true` in arrays of strings.

EDIT: I'm noticing this because it's throwing off ava's `t.deepEqual()` in my unit tests.
Hi team,
We are facing the same issue; we are getting an extra `sequence` key.
Could you please provide an updated status on this bug?
Thanks.
The 'status' is as discussed above. Any changes in this area could have significant performance impact. Before touching this, we need a framework for measuring performance so that we can be certain of no regressions.
These properties are only added to the Array object. When you `JSON.stringify()` the structure, they don't appear in the output.
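A quick check of this claim, using illustrative data only: non-index properties on an array are ignored by `JSON.stringify`, even though they are visible to `Object.keys`.

```javascript
// Simulate what jsonata 1.6.4+ attaches to a result array
const result = [{ name: 'jessica', price: 500 }, { name: 'may', price: 40 }];
result.sequence = true;

console.log(Object.keys(result));    // [ '0', '1', 'sequence' ]
console.log(JSON.stringify(result)); // '[{"name":"jessica","price":500},{"name":"may","price":40}]'
```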
The other option is to have a final "sanitization" step that makes sure that a clean array is returned:

```javascript
function cleanOutput(raw) {
  if (Array.isArray(raw)) {
    return [...raw]; // spread copies only the index properties
  } else {
    return raw;
  }
}
```

This could be done as a final step in `evaluate`, but it can also be done externally.
```javascript
const jsonata = require("jsonata");

let input = {
  units: [
    { name: "bob", price: 20 },
    { name: "jessica", price: 500 },
    { name: "may", price: 40 }
  ]
};

const output = jsonata("units[price > 20]").evaluate(input);
output.keepSingleton = "bar"; // For illustration only
console.log(output, Object.keys(output));

const clean = cleanOutput(output);
console.log(clean, Object.keys(clean));
```
The above code outputs

```
["0", "1", "sequence", "keepSingleton"]
["0", "1"]
```
This could provide an immediate workaround for:

- `keepSingleton`
- `sequence`

I'll leave the performance fix and potential for integration into core up to @andrew-coleman. If performance is a big concern, maybe we can get a jsonata->webassembly compiler written ;)
@loganvolkers: that doesn't work for nested arrays, does it? How about:

```javascript
cleanOutput = JSON.parse(JSON.stringify(raw))
```
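If the JSON round-trip is undesirable (it drops `undefined` values and is relatively slow), a recursive variant of the earlier `cleanOutput` would also handle nested arrays. This is a sketch with a hypothetical helper name, not part of the library:

```javascript
// Recursively rebuild the structure, stripping non-index properties
// from arrays at every level (hypothetical helper, for illustration)
function deepCleanOutput(raw) {
  if (Array.isArray(raw)) {
    return raw.map(item => deepCleanOutput(item)); // .map copies only index properties
  }
  if (raw !== null && typeof raw === 'object') {
    const out = {};
    for (const key of Object.keys(raw)) {
      out[key] = deepCleanOutput(raw[key]);
    }
    return out;
  }
  return raw;
}
```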
I have been looking at this issue in relation to #473.
I've got a simple performance test that runs the `$sum(Account.Order.Product.(Price*Quantity))` expression 100,000 times, across 10 separate runs, and then averages the time taken and the number of evals/sec.

I have used that to compare some possible approaches to resolving this issue:

- `Object.defineProperty` to make the properties non-enumerable. As @andrew-coleman said, this is known to be slower, but I tried it out to get a sense of how much slower given the current code.
- WeakMap - this uses a WeakMap in `util.js` to track the 'known' sequence arrays, rather than having to flag the array itself.
- Subclass Array - create a custom class that extends Array, so we can use `instanceof` to check its type, but it still looks just like an Array:

```javascript
class Sequence extends Array {}
```
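For context, the WeakMap option described above could look something like this. The helper names are hypothetical; the idea is that sequence-ness is tracked out-of-band, so result arrays stay clean, at the cost of a map lookup on every check:

```javascript
// Sketch of tracking sequences without flagging the array itself
const knownSequences = new WeakMap();

function markAsSequence(arr) {
  knownSequences.set(arr, true);
  return arr;
}

function isSequence(value) {
  return Array.isArray(value) && knownSequences.has(value);
}

const seq = markAsSequence([1, 2, 3]);
console.log(isSequence(seq));  // true
console.log(Object.keys(seq)); // [ '0', '1', '2' ] - no extra property
```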
| Code | evals/sec |
|---|---|
| existing | 40891 |
| `Object.defineProperty` | 29178 |
| `WeakMap` | 23581 |
| subclass `Array` | 30491 |
So yeah... no quick win with those options.
Also, having got this far, I realise there's more to it than just being able to identify it as a sequence - there is also the `keepSingleton` property used in some places that gets added to the object. So I'm going to post this comment as a record of my failure to find a performant fix so far.
Thanks Nick! :)
Any update on this issue? I am getting a `sequence` property in my arrays.
I have a problem when I create a field called "sequence" in JSONata. When I test it on the JSONata playground it works fine, but when I do it in my app (JS) the field is not created. If I rename the field to something else like "seq", all is fine. Can it be somehow related?
+1 on this issue!!!
+1
We are also facing this issue, can this be fixed?
Please consider adding a sanitisation step before returning the data from the JSONata library, because JS test libraries fail without a clear reason due to this internal field. Currently we need to pass all the responses from JSONata through `JSON.parse(JSON.stringify(...))` to work around this issue.
> The `sequence` property was always there, but it used to be created using `Object.defineProperty` with `enumerable` set to false. However, this had a significant performance hit, and rewriting it this way gave a 3x performance improvement on the expression `$sum(Account.Order.Product.(Price*Quantity))`. If you can come up with an alternative solution that doesn't degrade performance, I'd welcome it.
@andrew-coleman please reconsider prioritising performance ahead of functionality/ease of use. I sincerely doubt that the majority of your users will have issues with the library being 4x slower, whereas these implementation details seem to be causing a lot of trouble, judging by the number of issues and comments about them.
+1, this issue has been causing our test assertions (specifically, Vitest/Jest's `.toBeCalledWith({ key: expect.any(String) })`) to fail with baffling error messages like `Compared values have no visual difference`, wasting hours of developer time.
We can use lodash.cloneDeep everywhere as a workaround, but we shouldn't have to - jsonata should be fixed to clean up these extraneous values, even at the cost of some microseconds for each call.
PR https://github.com/jsonata-js/jsonata/pull/292, included in JSONata 1.6.4, started polluting data returned to consumers with a `sequence` property attached to an array.
I don't think this internal property should be externalised to consumers, especially in a way that means it can't be serialised losslessly into a string.
The `sequence` property here:

- is added by JSONata, not from the data we are working with;
- cannot be serialised correctly into JSON (nor can the equivalent JSON string be parsed into JavaScript).

Represents the returned data structure, except it's not valid JSON.