Closed ArtemGolovko closed 5 months ago
I'm not able to reproduce this. I have a test passing that looks like this
#[test]
fn parse_should_handle_multiple_nots_in_predicate() {
// arrange
let xpath_text = r###"//head//script[not(@src) and not(contains(., "gtag"))]"###;
// act
let xpath = crate::xpath::parse(xpath_text).unwrap();
// assert
assert_eq!(xpath.to_string(), xpath_text);
}
What function were you calling exactly to cause the stack overflow? Is the overflow when you are applying the expression maybe?
I'm sorry I did not include bug reproduction.
Here it goes.
use skyscraper::{html, xpath::{self, XpathItemTree}};
fn main() {
let document = html::parse(include_str!("../annupay.htm")).unwrap();
let xpath_item_tree = XpathItemTree::from(&document);
let xpath = xpath::parse(r###"//head//script[not(@src) and not(contains(., "gtag"))]"###).unwrap();
}
You can comment out the last line of main and see overflow go away.
I think instead of my html file (annupay.html), any other html file, of similar size, can be used. In my case my html file is 12 kb. If it is not the case, I will make sure to investigate my html file.
Strange, I created a new rust project and used your exact code with some random html file and it works fine for me.
Does it fail for you with just the third line?
fn main() {
let xpath =
xpath::parse(r###"//head//script[not(@src) and not(contains(., "gtag"))]"###).unwrap();
}
You're right, it's the last line of main, that causes the issue. The bug appears in debug build, but not in release build.
Interesting, I think that means your right on the stack size limit and the release mode is optimizing it enough to fit in your stack. What kind of machine are you running the code on, is it a normal PC or some smaller device?
Try running the code in a thread with a higher stack size like in this stack overflow response https://stackoverflow.com/a/44042122
Thank you, will try increasing the stack size.
And here is my stats.
Stat | Value |
---|---|
CPU | Intel Core i5-6200U |
RAM | 12 GB |
OS | Windows 10 |
rustc version | 1.74.0 |
I don't think that my hardware is the problem. It might be a rustc bug.
I check the stack size required is 1 mb 65 kb. For some reason in debug mode it fails to correctly allocate stack.
I am able to reproduce this on Windows. I'll see what I can do.
I think I have fixed this in #27
Try using version 0.6.1-beta.0
and let me know.
Thank you. I test your fix and it works fine.
I encountered stack overflow when parsing
//head//script[not(@src) and not(contains(., "gtag"))]
;I suspect that the parser might get in infinite loop. Or any long xpath expression can cause stack overflow.