Closed milahu closed 2 years ago
@milahu thank you for reviewing and finding this bug. I will be working again on "nwsapi" starting 1st week of June, I will hopefully be fixing all the pending issues including this one.
Sorry for the delay to everybody but too many things went wrong in 2021.
@milahu I went back through all the issues and I am now fixing them one by one. However, for this issue, I can't reproduce an infinite recursion loop, maybe the changes fixed it.
By using the suggested selector I get returned an error message like the following in the console:
document.querySelectorAll('_:-ms-fullscreen')
Uncaught DOMException: Document.querySelectorAll: '_:-ms-fullscreen' is not a valid selector
then the process terminates.
can you try ... ?
document.querySelectorAll('.some-class .another-class _:-ms-fullscreen')
@milahu I tried the suggested line you sent with both 2.2.0 (latest published) and with current repository but was unable to reproduce. Also what you said make sense I am sure that after the default/break of the inner loop I pop next selector from the list of fragments.
Could it be you tested that infinite loo with a previous version of "nwsapi" ? Maybe I am testing/doing something different than you. Could you post a small example showing the problem ? Also specify which "nwsapi" version you are using.
Thank you for your time and patience.
still a bug in nwsapi 2.2.0
the error is emit
ed over and over, but the function does not return
'.lmt--web .lmtside_container--target .lmt__ad_charLimitcontainer :-ms-fullscreen,:root .lmt--web .lmtside_container--target .lmt__ad_charLimit_container' is not a valid selector
easiest fix is to add return '';
in the switch (symbol)
default case
Hello!
I have the same issue using testing-library-react
. Currently using its methods I run into an issue where this error occurs. A string grows to millions of symbols in this block of code.
case '~': match = selector.match(Patterns.relative); source = 'n=e;while((e=e.previousElementSibling)){' + source + '}e=n;'; break;
easiest fix is to add
return '';
in theswitch (symbol)
default case
@Bondarenko199 does this help?
I agree with the fix proposed by @milahu, it certainly fixed my issue with no side effects.
fixed in ab9cde1eb05ec9badfc3abaf15687b1a6f9e9ad3
yay : )
@milahu sorry for the long delay. If you can, ensure my minimal variant really work as expected with no regressions.
all good ^^
i did not test your fix, but LGTM
the problem was that while { switch { break } }
did not break the while loop
using goto (break to label) is probably better than return '';
if selector
is truthy
edit: worst case, we still get an infinite loop ...
yepp, this fixes my e2e test in https://github.com/dperini/nwsapi/issues/46#issuecomment-867711752
short version:
const jsdom = require('jsdom');
test('nwsapi handles invalid selector', async () => {
const virtualConsole = new jsdom.VirtualConsole();
virtualConsole.sendTo(console);
const options = {
runScripts: 'dangerously',
virtualConsole,
};
const html = `
<!doctype html>
<html>
<head>
<style>
#e { color: green; }
body _:-ms-fullscreen { color: red; } /* this will trigger the bug in nwsapi */
</style>
</head>
<body>
<p id="e">Hello world</p>
<script>
var s = getComputedStyle(document.getElementById('e')); // force parsing of stylesheet
console.log('color = ' + s.color);
</script>
</body>
</html>
`;
const dom = new jsdom.JSDOM(html, options);
// https://gist.github.com/chad3814/5059671
function waitForDocument() {
return new Promise(resolve => {
dom.window.addEventListener("load", () => resolve());
});
}
var t1 = new Date().getTime();
await waitForDocument();
console.log(`time: ${(new Date().getTime() - t1) / 1000} sec to load document`)
});
to abort early, i added this to nwsapi.js
while (selector) {
console.log(`source.length = ${source.length}`);
if (source.length > 10000) {
// otherwise source grows to 1GB
throw `infinite loop? source.length = ${source.length}`;
}
cannot reproduce with unit test
Hello world
'); const NW = nwsapi(dom.window); var selector = '_:-ms-fullscreen'; // SyntaxError: unknown pseudo-class selector ':-ms-fullscreen' //var selector = '.lmt--web .lmt__side_container--target .lmt__ad_charLimit_container _:-ms-fullscreen'; // SyntaxError: unknown pseudo-class selector ':-ms-fullscreen' var mode = null; var callback = (...args) => console.dir(args); var testElement = NW.compile(selector, mode, callback); testElement(dom.window.document.documentElement); }); ``` 0195e47 and later throw ``` SyntaxError: unknown pseudo-class selector ':-ms-fullscreen' ``` b462258 and earlier throw ``` SyntaxError: 'undefined' is not a valid selector ```👏👏👏👏👏
@milahu I agree with that strategy I have never seen a resolver with a size > 1000 bytes. So a generic limit in size seems a good approach for various reasons, DOS included. I will run the entire testing to gather how long is the longest lambda we might have.
a size limit should be still large enough to handle edge cases
maybe 10MB?
some numbers on a 2.4GHz cpu
1 MiB limit → compileSelector throws after 0.05 seconds = 0.1% of max 10 MiB limit → compileSelector throws after 0.4 seconds = 1% of max 100 MiB limit → compileSelector throws after 4 seconds 512 MiB limit → compileSelector throws after 20 seconds 1024 MiB limit → compileSelector throws after 40 seconds
@milahu the following is the longest source string produced by the full w3c tests and it is 288 bytes long
if((((/^form$/i.test(e.localName)&&!e.noValidate)||(e.willValidate&&!e.formNoValidate))&&!e.checkValidity())||(/^fieldset$/i.test(e.localName)&&s.first(":invalid",e))){n=e;while((e=e.parentElement)){if((/^patternConstraints$/.test(e.getAttribute("id")))){r[++j]=c[k];continue main;}}e=n;}
the above source string when converted to a Resolver() becomes 377 bytes long
function Resolver(c,f,x,r){var e,n,o,j=r.length-1,k=-1;main:while((e=c[++k])){if((((/^form$/i.test(e.localName)&&!e.noValidate)||(e.willValidate&&!e.formNoValidate))&&!e.checkValidity())||(/^fieldset$/i.test(e.localName)&&s.first(":invalid",e))){n=e;while((e=e.parentElement)){if((/^patternConstraints$/.test(e.getAttribute("id")))){r[++j]=c[k];continue main;}}e=n;}}return r;}
So I believe that doubling it might be enough or maybe make it 4 times bigger. I wouldn't suggest going to more than that (max. 2048) to make it useful.
A selector yielding a 10Kb source string is a non-sense, imagine 1 MiB or 10 MiB ....
So a generic limit in size seems a good approach for various reasons, DOS included.
this also works in the opposite direction: an attacker can construct a "too long" css selector to crash nwsapi for example to prevent webscraping with jsdom web browsers should be error-tolerant
@milahu you surely know better than me those scraping bots and hackers behaviors too. However, just so I make sure I get your point, can you show me a selector that make sense and produces more than 2048 byte source string ?
var selector = 'body';
//for (var i = 0; i < 1000; i++) { // 12KB selector // RangeError: Maximum call stack size exceeded
for (var i = 0; i < 820; i++) { // 10KB selector -> source.length = 77017
selector += ` > .class-${i}`;
}
console.log(`selector.length = ${selector.length}`);
so its even easier to crash by stack size
an attacker can construct a "too long" css selector to crash nwsapi for example to prevent webscraping with jsdom
i have to admit, for scraping hostile sites i would not use jsdom rather a real browser via selenium / puppeteer / playwright ...
@milahu we have to distinguish reality from fantasy. I believe the tools you mention are perfect for such long selectors. I really hope nwsapi is used with a complete different scope in mind. You are talking about selector length now not by resolver "source" length. In this case I would also set a limit to selector length and return an empty Array. This will increase speed of the selector engine 10.000 time on these kind of sites. ;) Thank you for the head up with the infinite recursion it is now fixed and I will close the issue.
Could you please create a release for this bugfix?
function compileSelector in nwsapi.js
source = 'n=e;while((e=e.parentElement)){' + source + '}e=n;';
source grows to infinite length, untilRangeError: Invalid string length
normal
source.length
are around 1000 bytes with infinite loop,source.length
grows to the maximum string length of 1GBthe selector
_:-ms-fullscreen
comes from deepl.com/css/lmt.$d783c6.css.lmt--web .lmt__side_container--target .lmt__ad_charLimit_container _:-ms-fullscreen
the problem is
which only breaks the
switch (symbol) {
block, but not thewhile (selector) {
loop -> deadloopthe parser should seek to the next comma or to end of selector string
workarounds: