[Open] Fishrock123 opened 3 years ago

How feasible would it be to make `Url::parse` a `const fn`?
This morning I had no recollection that this ever existed. I did some archeology but it’s all public on github…
The `Url` struct has private fields and no constructor that doesn't do parsing. If the only thing you're worried about is typing `Url::new("…").unwrap()` a lot, consider adding a `fn url(s: &str) -> Url` function to your own crate that does that. It doesn't even need to be a macro.
If you want to do compile-time parsing, these days you'd use a procedural macro rather than a compiler plugin. But more importantly: is that really useful, compared to `lazy_static` or `once_cell`?
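For context, a minimal sketch of the lazy approach using std's `OnceLock` (the standard-library counterpart of `once_cell`), with a plain `String` standing in for `Url` so the example stays dependency-free:

```rust
use std::sync::OnceLock;

// Lazily-initialized global: the closure runs on first access, not at
// program startup. A `String` stands in for `Url` here.
static API_BASE: OnceLock<String> = OnceLock::new();

fn api_base() -> &'static str {
    API_BASE.get_or_init(|| {
        // In real code this is where Url::parse(...).expect(...) would
        // run, and where a typo would first panic.
        String::from("https://example.com/api")
    })
}

fn main() {
    assert_eq!(api_base(), "https://example.com/api");
}
```

The trade-off discussed below follows directly from this shape: nothing inside the closure runs until the first call to `api_base()`.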
> How feasible would it be to make `Url::parse` a `const fn`?
Not feasible in today's Stable Rust. The first reason that comes to mind is that parsing allocates a `String` and returns it in a field of `Url`, but there are probably a bunch of other blockers.
> But more importantly: is that really useful, compared to `lazy_static` or `once_cell`?
I think it would be useful to check that the URL is valid at compile time rather than at runtime. Although I suppose unit tests could catch that.
> no constructor that doesn't do parsing
Why does no such constructor (or builder pattern) exist?
The set of fields and their types in the `Url` struct are private implementation details that I'd rather not stabilize.
> If the only thing you're worried about is typing `Url::new("…").unwrap()` a lot, consider adding a `fn url(s: &str) -> Url` function to your own crate that does that. It doesn't even need to be a macro.
Ideally, whatever it is should be easy enough for newcomers to type; aside from that I don't care what you actually have to type.
My issue here is when that `.unwrap()` could panic.
> If you want to do compile-time parsing, these days you'd use a procedural macro rather than a compiler plugin. But more importantly: is that really useful, compared to `lazy_static` or `once_cell`?
Yes, because both of those are evaluated lazily, as their names imply (e.g. `once_cell::sync::Lazy`). This means that it could panic at any arbitrary time if someone misspells anything in the URL, particularly if that branch is not used often.
For example, in an internal bot we have something like this:
```rust
lazy_static! {
    // ... more fields
    static ref SLACK_POST_MESSAGE: Url =
        Url::parse("https://slack.com/api/chat.postMessage").expect("Valid Slack Post Message URL");
}
```
This, somewhat unfortunately, is very easy to type. I say unfortunately because I suspect engineers that I will be bringing onto Rust projects internally will not immediately grasp what is happening here, and will assume it works like a statically declared const in a scripting language, i.e. evaluated before `main`, which it is not.
I'm not sure what the best way to do this would be, but to my imagination whatever it is would produce something like `Url::from_parts(Protocol::HTTP, domain_bytes, path_bytes, vec_query_field_bytes)` "under the hood".
The URL parser is rather liberal in what string inputs it accepts. `Url::parse("htttps://slck.com/…")` for example will return `Ok`, but probably won't do what you want. Of all things that can possibly go wrong, the parser returning an error is not particularly interesting in my opinion.
> `Url::from_parts(Protocol::HTTP, domain_bytes, path_bytes, vec_query_field_bytes)`
That is not how it works. A fair amount of normalization is done during parsing (as specified in https://url.spec.whatwg.org/), so you can't just take arbitrary bytes (or rather Unicode strings) and put them in. And the internal memory representation is not separate strings but a single string (that matches the serialization) together with some integer indices.
Just following up on this, I think this is a great candidate to make into a const fn like @tmccombs pointed out. For many SDKs and API client libraries, the library authors know exactly what URLs they plan to support. These URLs are usually represented as `&str`s and hardcoded.
Turning this into a `const fn` would allow the user to validate the URL at compile time.
For example, today I wrote the following code:
```rust
const AccountBalance: &'static str = concat!("https://api.kraken.com/0", "/private/Balance");
const Assets: &'static str = concat!("https://api.kraken.com/0", "/public/Assets");
const SystemStatus: &'static str = concat!("https://api.kraken.com/0", "/public/SystemStatus");
const SystemTime: &'static str = concat!("https://api.kraken.com/0", "/public/Time");
const Ticker: &'static str = concat!("https://api.kraken.com/0", "/public/Ticker");
```
I'd like to take this one step further by parsing the URL at compile time, that way I don't have to call a helper function every time I want to upgrade a `&'static str` to a `Url`.
> The first reason that comes to mind is that parsing allocates a `String`
This is easily solved by changing the payload to a `Cow<'static, str>`, which is probably worth doing anyway, since so many URLs are hardcoded into the program. As for the rest of it: even if there are plenty of post-parse things that can go wrong (a URL that is "valid" but can't be used to make an HTTP request, for instance), there's still value in learning as soon as possible (e.g. at build time, as a compile error) that a hardcoded URL will fail to parse.
I think `Url::from_parts` is probably thinking along the right lines, but we'd instead need a `#[doc(hidden)]` version that takes pre-computed integer indices. These could then be computed by the macro during compilation (failing with a build error if there's a problem) and fed into an infallible `from_parts` constructor.
The `String` stored in `Url` is not necessarily the same one that was originally given to `parse`, so I'm not sure how effective `Cow` would be.
Sure, but it's often the same as the one that was originally given to `parse`.
This is annoying.
The `url!` macro for compile-time validation is now available via the `url-macro` crate. Disclaimer: I'm the author.
Use `Url`s any amount in Rust and you'll end up with a decent bit of `Url::new("string literal").unwrap()`. It would seem a macro of some sort would be better for that, and a search shows that this indeed was a thing: https://github.com/servo/rust-url/pull/137. I couldn't find any issues/PRs for removing it, so, what happened?

I've also ended up writing static URLs with e.g. `lazy_static!` a bunch, which is also not ideal, since they are evaluated at neither compile time nor startup.
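One stopgap for the "neither compile time nor startup" problem is to force initialization at the top of `main`. A sketch using std's `LazyLock` (stable since Rust 1.80), again with a `String` standing in for `Url` to avoid the external dependency:

```rust
use std::sync::LazyLock;

// Lazy global: the closure (and any panic inside it) runs on first
// access. Real code would do Url::parse(...).expect(...) here.
static SLACK_POST_MESSAGE: LazyLock<String> =
    LazyLock::new(|| String::from("https://slack.com/api/chat.postMessage"));

fn main() {
    // Touch the static up front so a bad hardcoded URL panics at
    // startup instead of deep inside a rarely-taken branch.
    let _ = &*SLACK_POST_MESSAGE;
    assert_eq!(
        SLACK_POST_MESSAGE.as_str(),
        "https://slack.com/api/chat.postMessage"
    );
}
```

This doesn't get compile-time checking, but it at least moves the possible panic to a predictable moment.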