causal-agent / scraper

HTML parsing and querying with CSS selectors
https://docs.rs/scraper
ISC License

Error when selecting something with this syntax div.My(1) #84

Closed: Raduc4 closed this issue 1 year ago

Raduc4 commented 1 year ago

I wanted to scrape something from the Yahoo website, where the HTML has class names like My(8), P(w), and so on. When I use one of them in a selector, I get an error:

thread 'scrape::scraper::tests::parse_titles_test' panicked at 'called `Result::unwrap()` on an `Err` value: ParseError { kind: Custom(ClassNeedsIdent(Function("My"))), location: SourceLocation { line: 0, column: 4 } }', src/scrape/scraper.rs:51:54
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test scrape::scraper::tests::parse_titles_test ... FAILED

failures:

failures:
    scrape::scraper::tests::parse_titles_test

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.97s

How can I fix this?

nathaniel-daniel commented 1 year ago

You can't use '(' as part of a CSS identifier: https://www.w3.org/TR/css-syntax-3/#typedef-ident-token. However, you can escape it with a backslash, so div.My(1) becomes div.My\(1\).
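If the class names come from the page at runtime, the escaping can be automated. Below is a minimal sketch using only the standard library; `escape_css_ident` is a hypothetical helper name, not part of scraper, and it only loosely follows the identifier serialization rules from the CSSOM spec. Also note that in Rust source the backslash itself has to be written as `\\` in a normal string literal, or you can use a raw string such as `r"div.My\(1\)"`.

```rust
/// Hypothetical helper (not part of the scraper crate): escape a raw class
/// name so it can follow `.` in a CSS selector string.
fn escape_css_ident(raw: &str) -> String {
    // A lone "-" is not a valid identifier and must be escaped.
    if raw == "-" {
        return String::from("\\-");
    }

    let starts_with_dash = raw.starts_with('-');
    let mut out = String::with_capacity(raw.len());

    for (i, c) in raw.chars().enumerate() {
        // A digit may not start an identifier, nor directly follow a leading '-'.
        let leading_digit = c.is_ascii_digit() && (i == 0 || (i == 1 && starts_with_dash));

        if c == '\0' {
            // NUL is replaced by the Unicode replacement character.
            out.push('\u{FFFD}');
        } else if c.is_ascii_control() || leading_digit {
            // Hex code point escape, terminated by a single space.
            out.push_str(&format!("\\{:x} ", c as u32));
        } else if c.is_ascii_alphanumeric() || c == '-' || c == '_' || !c.is_ascii() {
            // Characters that are allowed in an identifier as-is.
            out.push(c);
        } else {
            // Everything else, e.g. '(' and ')', gets a backslash prefix.
            out.push('\\');
            out.push(c);
        }
    }
    out
}
```

With this, `format!("div.{}", escape_css_ident("My(1)"))` builds the string `div.My\(1\)`, which can then be passed to `Selector::parse` in place of the unescaped selector.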

Raduc4 commented 1 year ago

Thank you. I managed to scrape it without these classes, but I think your solution works.