Open latot opened 3 months ago
You can continue to use error propagation, you just need to downgrade the to a string, e.g.
let selector = Selector::parse(&query).map_err(|err| err.to_string())?;
but note that for selectors known at compile-time, .unwrap()
is somewhat reasonable as failing to parse such a selector is a logic error instead of a runtime error. (For your example of a dynamically generated selector, one would have to consider how much leeway there is, i.e. if you really only interpolate an u8
into the nth-child
part, then .unwrap()
might still be reasonable as calling code should be unable to produce an invalid selector by choosing the value of the u8
.)
Hi! thx for the answer.
Well, is not ideal change the error type for the error propagation, would be good be able to return the actual one.
unwrap
, I think it would depends on the type of app you are constructing, if is a one that only you will use, is more reasonable to panic, just for a user end product we want to handle the error and do not to panic.
unwrap, I think it would depends on the type of app you are constructing, if is a one that only you will use, is more reasonable to panic, just for a user end product we want to handle the error and do not to panic.
This is not the distinction I am making: A CSS selector that is fixed when the program is built is not a runtime error but a construction error, i.e. a panic (that should never happen once the program is shipped) is entirely appropriate.
(Think of it this way, we could with sufficient effort create a proc macro which parses such a CSS selector during compilation and embeds some already validated form of it into the program so that no parsing at all needs to happen when it runs. Alas, just using .unwrap()
is a convenient way to get almost the same result with very much less implementation effort.)
I agree, this is because there is at least two types of CSS selectors.
I one hand, the ones that can be checked at compile time, here unwrap
is good, and the best solution is a macro as you says, this could be a nice feature :)
The next one is more complex css selectors, where the string is created by some algorithm, which is impossible to be checked by at compile time, can only be checked at runtime, there the error must be propagated, ideally the original not, not a string for a good error handling.
The idea of a macro is indeed interesting, I will leave this issue open in order to remember it. @cfvescovo
Take a look at #53
Since we had more requests for a macro implementation, maybe it's now time to add it to scraper
53 is interesting because it explores the idea of a proc-macro. I think that, if it worked, it could be a cleaner solution than what we have right now. In that discussion, though, concerns had been raised about the necessity of implementing such a macro
At the time, I chose to close #53 because another crate was actively being developed to address the issue. Now the repo of that crate is inactive.
I think we should provide an "official" implementation so that we can maintain and improve it over time. Moreover, it could be useful for beginners who don't want to roll out their own macro
This claude-generated solution looks very clean and idiomatic to me, even much better than the one provided in the crate scraper-proc-macro
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, LitStr};
#[proc_macro]
pub fn selector(input: TokenStream) -> TokenStream {
let selector = parse_macro_input!(input as LitStr).value();
match scraper::Selector::parse(&selector) {
Ok(_) => quote!(
::scraper::Selector::parse(#selector).unwrap()
).into(),
Err(e) => syn::Error::new(
proc_macro2::Span::call_site(),
format!("Failed to parse CSS selector: {}", e)
).to_compile_error().into(),
}
}
If integrated, would you put it inside the selector module or in a dedicated macro module? It could also be integrated as an additional macro
feature
If integrated, would you put it inside the selector module or in a dedicated macro module? It could also be integrated as an additional macro feature
I imagine it would sit behind a feature flag in a module? The implementation would probably be in a separate crate.
I was reading that a proc_macro
needs to live necessarely in a separate crate, so yes, this is true, it would be quite invasive as an addition though, but yet useful
In order to avoid cyclic crate dependencies it would be better to ship scraper_proc_macros
separately. Otherwise scraper
would have to depend on it, which is not possible. Correct me if I'm wrong.
In order to avoid cyclic crate dependencies it would be better to ship scraper_proc_macros separately. Otherwise scraper would have to depend on it, which is not possible. Correct me if I'm wrong.
It is possible, it will just require scraper_proc_macros
to be published before scraper
when doing releases in future?
This claude-generated solution looks very clean and idiomatic to me, even much better than the one provided in the crate scraper-proc-macro
I am not sure if I find that proc macro particularly helpful as while it does verify at compile time, it will just compile the selector again at runtime. If the dependency tree is sufficiently messy, it could even be that the verification at runtime uses a different version of the scraper
crate and compilation at runtime still fails (admittedly a somewhat unlikely scenario).
I think what we should aim for is a proc macro that actually outputs a representation (maybe via serialization) of the already parsed selector so that work is not repeated at runtime.
I also think the span in the error case should tightened to indicate only the literal, not the whole macro invocation though.
The macro should generate a selector representation directly, without requiring parsing at run time. Moreover, while it's true that the basic approach would require cyclic dependencies - which, as you now, are not allowed - the issue can be solved by writing three crates: selectors_core, selectors_macros (which will depend on core) and selectors, which will depend on both. Obviously, as this sounds a bit overcomplicated, feel free to suggest more straightforward approaches
A simpler solution would be writing our macros in a different crate and let our users depend on this new crate in order to use them. However, I would prefer exposing our new macros directly from scraper behind a feature flag
The macro should generate a selector representation directly, without requiring parsing at run time.
I confirm that the macro provided by Claude executes the parser at runtime, but it then re-executes it at compile time, so the concern that @adamreichold is raising is real, even though I personally thing that such a scenario would be highly unlikely, and it would still be an improvement over not having the macro at all.
The selectors_core, selectors_macros solution would eliminate the version coherence problem between the compile time and the runtime versions of Selector::parse()
. Therefore I would prefer this solution over the one adopting indipendent crates.
I still does not know much about macros, but it should only accept strings, not variables as inputs, or will cause the error posted above (first post).
Maybe after this, create a issue for css selectors from variables.
Unfortunately, since SelectorsList
does not implement ToTokens
, we cannot reuse the result computed by parse
at complite time. We would need to submit a PR to the maintainers of the selectors
crate, for this feature to be implemented. As of now, I would still like to have the proc macro which recomputes the selector at runtime, but if we choose not to do that, I would close this issue.
We could also probably implement the parse logic inside a normal macro, but that is beyong the scope of this crate.
I think setting up the crate structure and using the runtime-based implementation would be fine until we can modify selectors
to support an embeddable representation.
To be honest I'm not 100% sure what benefit this macro confers? The syntax of CSS selectors is relatively straightforward and I think most people have to test their selectors anyway (to ensure they're picking thet right elements), something which this doesn't help.
Perhaps a builder might work better here?
The syntax of CSS selectors is relatively straightforward and I think most people have to test their selectors anyway (to ensure they're picking thet right elements), something which this doesn't help.
I agree that most problems we see during actual usage are not due to incorrect selector syntax but rather selectors not working as intended against the target DOM. The feedback for invalid syntax from a compile-time error is slightly faster though.
The macro (in its final form, not the intermediate check-at-compile-time-then-parse-again-at-runtime variant) would additionally provide efficiency gains insofar selectors would not have to parsed at runtime at all.
There is some advantages.
While more clear, better :3 at least is what I think.
Hi, I has been using this library from a while, has been great.
Here a example:
This function will fails, the reason is pretty simple, the function owns the query, but the error triggered by parse seems to have borrowed the string, so it can't return an error when the function owns the data.
The workaround right now is unwrap instead use error propagation.
There is some ways to handle this, one would be a parse with String which owns the data, other one is instead use a borrowed string for the error, convert &str to String and use that instead.
Thx!