Closed anikethsaha closed 3 years ago
Thanks for working on this!
Can we clone the config at a higher level and then pass that to the validator function?
By higher level, you meant where we are reading/getting the config? then I think it needs to pass two configs to the validator one cloned and one original. cause only the cloned one will be validated. the original one will be passed to the rules
Is there a reason we can't clone the config as soon as we have the full resolved config and use that for validation and then pass that to the rules?
i will try that ๐
@kaicataldo I updated the cloning to a higher level. I think I might have missed some other entry points for configs.
Also, do you think it would be helpful to add a clone here
instead
Like this
if (configData) {
const clonedConfigData = lodash.cloneDeep(configData);
return this._normalizeConfigData(
clonedConfigData,
plugin.filePath,
`${importerName} ยป plugin:${plugin.id}/${configName}`
);
}
@aladdin-add can you check if the current one is fine ?
let me know if this has to be changed. after that, I will add some tests
I'd suggest it in something like linter.js
, than in getRuleOptions
-- it could be cloned multi-times.
I'd suggest it in something like
linter.js
, than ingetRuleOptions
I think the current one in this PR is something like this.
earlier I planned to have it clone and validate in config-validator.js
alone. In this way, it will clone every config irrespective of where they are coming from.
But I think the current one is the entry point for the config (not for the directives,)
@kaicataldo can you please take a look if this is the fine place to clone?
@kaicataldo can you take a look at this again?
Apologies for the delay! I've been sick the last few weeks and am just getting caught up.
@kaicataldo can you check if your concerns have been addressed?
Do we have documented restrictions on what can be an option's value?
In the current version, a custom rule might expect something like a function, Date instance, RegExp instance, etc. to be defined in .eslintrc.js
as a rule's option. JSON schema doesn't support those types, but the schema
property and thus the validation is optional for rules.
After this change, values will be limited to JSON data types only.
This is a good point and is probably worth having a separate discussion about before we merge this in (or alternately find a different deep merge algorithm). Up until now, weโve wanted config to be JSON-serializable (or at least JSON-compatible) because we support JSON configs. That being said, we donโt know what custom plugins are currently doing, and itโs looking likely we will be implementing a new config file format that will not have this limitation (see RFC).
In the current version, a custom rule might expect something like a function, Date instance, RegExp instance, etc. to be defined in
.eslintrc.js
as a rule's option.
agreed, this change might break the option of this type.
I think we can discuss in a separate issue and hold this PR. or wait till the RFC is finalized/implemented?
Is there a reason we can't use use lodash.clonedeep
? I know the original implementation of this PR used it, but I'm not sure I understand why that wouldn't work in this case.
there was an issue with it ref this
I am not sure why is that behavior.
Huh, strange. Might be worth filing a bug report with lodash
, as that seems like an easy pitfall.
orth filing a bug report with
lodash
, as that seems like an easy pitfall.
not sure whether it's intentional or not. though I couldn't find anything on the docs about this.
By lodash/lodash#2725 it seems to be intentional, as probably the only way to support circular references.
As suggested in the issue, perhaps we could explore cloneDeepWith
? Maybe to start a new (independent) deep cloning whenever it encounters an object.
As suggested in the issue, perhaps we could explore
cloneDeepWith
?
I will check this
Maybe to start a new (independent) deep cloning whenever it encounters an object.
I didn't get this one. did you mean custom cloning implementation?
Maybe to start a new (independent) deep cloning whenever it encounters an object.
I didn't get this one. did you mean custom cloning implementation?
Partially custom: maybe we could call cloneDeepWith
again from its customizer
function when it gets an object, or something like that if it makes sense.
@mdjermanovic @kaicataldo I have changed the cloning logic with this. I seem working out. let me know if any more specific case I am missing or any suggestion would be great ๐
I was thinking about something like this, instead of maintaining our own deep cloning:
const lodash = require("lodash");
function getCustomizer() {
let root = true;
return function customizer(value) {
if (!root && typeof value === "object" && value !== null) {
// start new lodash.cloneDeepWith
return lodash.cloneDeepWith(value, getCustomizer());
}
root = false;
// return `undefined` for the default lodash.cloneDeepWith behavior
return void 0;
}
}
// test
const nested = {};
const obj = { x: nested, y: nested };
const clonedObj = lodash.cloneDeepWith(obj, getCustomizer());
console.log(clonedObj.x === clonedObj.y); // false
This is just an idea, I'm not sure about the performance. Also, this would cause "Maximum call stack size exceeded" if there are circular references.
@mdjermanovic The solution you mentioned is working as well ๐
I think both logic is nearly the same in terms of speed.
The one you mention might be ~0.10s faster (avg) - from tests
perf
script with current clone logiccloneDeepWith
customizer (https://github.com/eslint/eslint/pull/13034#issuecomment-629796697)I did run the perf
three times, and the result seems relative to this one.
Also, this would cause "Maximum call stack size exceeded" if there are circular references.
any way to reproduce this?
Let me know which one to keep ๐ .
IMO, as performance-wise, it seems lodash.cloneDeepWith
has a bit of better result.
I agree with @mdjermanovic that it would be better to use lodash instead of maintaining our own full deep cloning algorithm. If it's also more performant, it seems like the clear winner!
I was thinking if there was some way to test/fix this Also, this would cause "Maximum call stack size exceeded" if there are circular references.
issue.
Are there any practical cases where someone would want to include a circular reference? I'm having trouble coming up with any...
@anikethsaha To test it, you could add a circular reference inside settings, since this can contain arbitrary data.
I was thinking if there was some way to test/fix this Also, this would cause "Maximum call stack size exceeded" if there are circular references. issue.
It isn't fixable. It might be nice to report a meaningful error, but the amount of code and complexity to do that is most likely too high for something that normally shouldn't happen in the configuration.
Are there any practical cases where someone would want to include a circular reference? I'm having trouble coming up with any...
Probably not, just had to mention a possible downside (which also exists with the custom cloning algorithm in the actual iteration of this PR).
ok cool, I will change the current logic with cloneDeepWith
๐
I changed clone logic to the one @mdjermanovic suggested.
Working as expected ๐
I did run a test for circular ref with both logic and both are failing. Should add test for the circular ref ? I didn't add them because I thought it doesn't belong to the scope of this change!
let me know if that needs to be added
I did run a test for circular ref with both logic and both are failing. Should add test for the circular ref ? I didn't add them because I thought it doesn't belong to the scope of this change!
Can you clarify what you mean by this? Do you mean that you wrote a test that tried to clone a config that had circular references and it led to an infinite loop?
My preference would be to add some kind of safeguard for this. I don't feel comfortable knowingly adding a memory leak that could wreak havoc on automated CI boxes that might accidentally be implementing a config with a circular reference. One other thought I had - could we add a safeguard by somehow implementing detection of circular references that could allow us to break out of the cloning logic and use the original config instead (maybe with a warning message)? This would allow us to not break anyone's builds who might be accidentally relying on this behavior.
@nzakas @btmills @mysticatea Thoughts on this?
In summary, we currently run ajv
on the imported configuration, which can have the unintended consequence of mutating the configuration by reference when it sets the default values.
I suggested we try to clone the entire config object when it is imported so that there was no chance of ESLint's internals mutating the config object. We have a working fix for this in this PR, but this solution has the caveat of trying to clone circular refs leading to an infinite call stack.
The alternative to this would be to just deep clone the rules configuration, but I suppose this problem could technically occur there too.
Can you clarify what you mean by this? Do you mean that you wrote a test that tried to clone a config that had circular references and it led to an infinite loop?
I added the following eslint config and it was failing.
let errorConfig = ["error", {}];
class CircularRef {
constructor() {
this.self = this;
}
}
module.exports = {
settings: {
sharedData: new CircularRef()
},
rules: { camelcase: errorConfig, "default-case": errorConfig }
};
I couldn't see the error message as the stack was big and my ide (I am not using my local setup, though I can try ) didn't able to display the whole.
I don't feel comfortable knowingly adding a memory leak that could wreak havoc on automated CI boxes that might accidentally be implementing a config with a circular reference.
agreed ๐ฏ
could we add a safeguard by somehow implementing detection of circular references that could allow us to break out of the cloning logic and use the original config instead (maybe with a warning message)?
try-catch ?
Also, it looks like https://github.com/eslint/eslint/issues/13324 can be affected with this as well. ?
My preference would be to add some kind of safeguard for this. I don't feel comfortable knowingly adding a memory leak that could wreak havoc on automated CI boxes that might accidentally be implementing a config with a circular reference. One other thought I had - could we add a safeguard by somehow implementing detection of circular references that could allow us to break out of the cloning logic and use the original config instead (maybe with a warning message)? This would allow us to not break anyone's builds who might be accidentally relying on this behavior.
Maybe something like this:
const lodash = require("lodash");
function cloneDeep(val) {
const seenObjects = new Set();
let forceDefault = false;
function customizer(value) {
if (typeof value === "object" && value !== null) {
if (forceDefault) {
forceDefault = false;
} else {
if (seenObjects.has(value)) {
throw new Error("Cannot clone circular structure");
}
seenObjects.add(value);
forceDefault = true;
const clonedValue = lodash.cloneDeepWith(value, customizer);
seenObjects.delete(value);
return clonedValue;
}
}
// return `undefined` for the default lodash.cloneDeepWith behavior
return void 0;
};
return lodash.cloneDeepWith(val, customizer);
}
We call this with try-catch
and use the original config if it throws, maybe with a warning message.
@mdjermanovic @kaicataldo I used debug
for logging the warning though it won't be visible by default.
an alternative can be console.log
(not ideal) or process.stdout.write
.
let me know if the current one is fine
Maybe I'm missing something, but won't this not actually fix the specific specific problem we're trying to solve here, since it would silently reuse the original config object if there are any objects that are referenced more than once in the config (even if they're not necessarily nested within each other)? Seems like we need to actually do a full traversal of the finalized config object to check specifically for nested circular references before running the cloning algorithm. I'm not sure of the performance hit here, though, particularly for complex configurations.
Alternatives I can think of:
ajv
. While this issue could potentially still exist if a rule configuration had a nested circular reference, it's a lot less likely since we're dealing with a smaller subset of the full config. Apologies if we do end up doing this, because I know this was the original fix implemented and I suggested we go down the path of trying to clone the entire config.@anikethsaha Sorry for all the back and forth. Feel free to wait until we finalize the design before continuing to make changes.
2. Stick with this current iteration but actually throw an error and disallow rereferencing objects in configs entirely. This would be a breaking change, and feels like a bit of a kludge to me. I would like to avoid this.
agreed.
Seems like we need to actually do a full traversal of the finalized config object to check specifically for nested circular references before running the cloning algorithm. I'm not sure of the performance hit here,
yea, it would be a bit of heavy as what I understood it would mean traversing it twice.
maybe the solution of #13324 can be helpful here.
@anikethsaha Sorry for all the back and forth. Feel free to wait until we finalize the design before continuing to make changes.
Cool ๐
Maybe there's something missing in the algorithm, but this cloneDeep
was supposed to throw only if there are circular references. Otherwise, it should create multiple copies of the same object if it was referenced more than once.
I added the try-catch to not throw an error and just log (debug log) the error and continue with an un-cloned config.
Otherwise, it should create multiple copies of the same object if it was referenced more than once.
Should I change it to use the object returning from cloneDeep
instead of the original one ?
I would replace the throw when there is circular ref with a debug log !
const lodash = require("lodash");
function cloneDeep(val) {
const seenObjects = new Set();
let forceDefault = false;
function customizer(value) {
if (typeof value === "object" && value !== null) {
if (forceDefault) {
forceDefault = false;
} else {
if (seenObjects.has(value)) {
- throw new Error("Cannot clone circular structure");
+ debug("Cannot clone circular structure")
}
seenObjects.add(value);
forceDefault = true;
const clonedValue = lodash.cloneDeepWith(value, customizer);
seenObjects.delete(value);
return clonedValue;
}
}
// return `undefined` for the default lodash.cloneDeepWith behavior
return void 0;
};
return lodash.cloneDeepWith(val, customizer);
}
thoughts ?
Just to clarify what's the problem with the current iteration - would it fix the original issue?
Just to clarify what's the problem with the current iteration - would it fix the original issue?
yes , but if there is circular ref then it will use the original config instead of the cloned one so in that case it wont be fixed if there is a circular ref
Just to clarify what's the problem with the current iteration - would it fix the original issue?
yes , but if there is circular ref then it will use the original config instead of the cloned one so in that case it wont be fixed if there is a circular ref
I thought that was the goal - fallback to previous behavior only if there are circular refs, in order to avoid breaking builds with such configurations.
Additionally, we could improve the algorithm to return the already cloned object when it detects a circular reference (instead of throwing), while still creating multiple copies of objects that were referenced multiple times but are not part of any cycles.
I thought that was the goal - fallback to previous behavior only if there are circular refs, in order to avoid breaking builds with such configurations.
yeah, that was the idea.
@kaicataldo from https://github.com/eslint/eslint/pull/13034#issuecomment-632283007, it seems like the current is working as it was suggested right?
Even if the config is not cloned and the original conifg is passed on, user can notice the behavior and with this change using --debug
they can find out the reason for not working as intended as their config has a circular refs (which is ideally not recommendable unless there is some kind of tool like bundlers that handles that. ).
TBH, I have never seen some devs writing such a config, so I'm wondering if it was worthy to fix it, by adding a lot of complexity(and probably loss perf)!
what if we just:
ajv
, to see if it can be fixed in the upstream.I think @aladdin-add has a point here: it seems like Ajv modifying the object itโs validating is the root cause, so asking them to fix it might be the best path forward. We are a sponsor after all. :)
Pinged the Ajv creator on the issue: https://github.com/eslint/eslint/issues/12592#issuecomment-633759796
Yeah that would be an easy win ๐
@kaicataldo just to clarify, in the case that we eventually decide to clone the entire config:
Maybe I'm missing something, but won't this not actually fix the specific specific problem we're trying to solve here, since it would silently reuse the original config object if there are any objects that are referenced more than once in the config (even if they're not necessarily nested within each other)?
It should fix the specific problem (the original issue) since it does create multiple clones of objects that were referenced more than once. It will silently reuse the original config only if there are circular references.
Anyway, this code certainly adds some complexity and probably needs more tests.
There is also clone package as an alternative to lodash's clone, but I'm not sure if we want to add another dependency.
Can we simplify this a bit? Since rule configs are JSON-serializable, can we do JSON.parse(JSON.stringify(ruleConfig)). I believe doing so will automatically throw for circular references.
Prerequisites checklist
What is the purpose of this pull request? (put an "X" next to an item)
[ ] Documentation update [X] Bug fix (template) [ ] New rule (template) [ ] Changes an existing rule (template) [ ] Add autofixing to a rule [ ] Add a CLI option [ ] Add something to the core [ ] Other, please explain:
What changes did you make? (Give an overview)
Cloning the rule config and other configs before validating them as they get mutated with default values when used
useDefault : true
inagv
. I couldn't find any other entry point for this issue other than this, do let me know if any!Is there anything you'd like reviewers to focus on?
fixes https://github.com/eslint/eslint/issues/12592