`module!`: take strings instead of byte strings

Rust-for-Linux / linux

Adding support for the Rust language to the Linux kernel.

https://rust-for-linux.com


`module!`: take strings instead of byte strings #252

Status: Closed (ojeda closed this 1 month ago)

ojeda commented 3 years ago

Given it is a proc macro, we could take normal strings for the fields and ensure they are ASCII if/where needed.

This would make the interface a bit leaner.

For author, it would be particularly fitting anyway, because we would like to allow names requiring UTF-8, such as non-romanized names. There are kernel modules with MODULE_AUTHORs with non-ASCII characters already (and encoded as UTF-8) e.g.

https://github.com/Rust-for-Linux/linux/blob/428d64a1920fbe6f5e0e609dbd2967d6c4462fbe/drivers/phy/ingenic/phy-ingenic-usb.c#L385-L386
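A minimal sketch of the per-field check described above, assuming the macro already has each field as a `&str`. The `Charset` enum and `check_field` function are hypothetical names for illustration, and which fields must be ASCII is an assumption, not something this issue decides.

```rust
/// Hypothetical per-field policy: which fields need ASCII is an assumption.
#[derive(Clone, Copy)]
enum Charset {
    Ascii,
    Utf8,
}

/// Returns the bytes to embed for `value`, or an error message if the
/// field is required to be ASCII and is not.
fn check_field<'a>(field: &str, value: &'a str, charset: Charset) -> Result<&'a [u8], String> {
    match charset {
        Charset::Ascii if !value.is_ascii() => {
            Err(format!("`{field}` must be ASCII, got {value:?}"))
        }
        // Any `&str` is valid UTF-8 already, so nothing else needs checking.
        _ => Ok(value.as_bytes()),
    }
}
```

With this shape, an `author` such as `"Zoë"` would pass under `Charset::Utf8`, while the same value would be rejected for a field checked with `Charset::Ascii`.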

nbdd0121 commented 3 years ago

This is actually non-trivial, because byte string literals cannot contain non-ASCII characters, so the `module!` proc macro needs to properly parse the string literal and convert it to a byte string.

bjorn3 commented 3 years ago

str::as_bytes gives the raw UTF-8 bytes of a string. You can directly pass this to Literal::byte_string I believe. Rustc automatically escapes the individual bytes: https://github.com/rust-lang/rust/blob/bba8710616e5e4722215c0d6b27abaedca03ebad/compiler/rustc_expand/src/proc_macro_server.rs#L566-L572
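To illustrate the escaping step, here is a standalone sketch that renders the UTF-8 bytes of a string as byte string literal source text. It is not the rustc code linked above: `proc_macro::Literal::byte_string` is only usable inside a proc macro, so this uses `std::ascii::escape_default` to show the same idea outside one. The function name `byte_string_literal` is made up for the example.

```rust
use std::ascii::escape_default;

/// Renders the UTF-8 bytes of `s` as the source text of a byte string literal.
fn byte_string_literal(s: &str) -> String {
    let mut out = String::from("b\"");
    for &byte in s.as_bytes() {
        // `escape_default` keeps most printable ASCII as-is, backslash-escapes
        // quotes, backslashes and common control characters, and emits `\xNN`
        // for every other byte (including all non-ASCII bytes).
        out.extend(escape_default(byte).map(char::from));
    }
    out.push('"');
    out
}
```

For example, `byte_string_literal("é")` gives `b"\xc3\xa9"`, since `é` is encoded as the two bytes `0xC3 0xA9` in UTF-8. Inside the macro itself, passing `s.as_bytes()` to `Literal::byte_string` should make this manual step unnecessary.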

nbdd0121 commented 3 years ago

> str::as_bytes gives the raw UTF-8 bytes of a string. You can directly pass this to Literal::byte_string I believe. Rustc automatically escapes the individual bytes: https://github.com/rust-lang/rust/blob/bba8710616e5e4722215c0d6b27abaedca03ebad/compiler/rustc_expand/src/proc_macro_server.rs#L566-L572

However, there is no way to get a String or Vec<u8> from a Literal other than parsing it yourself. It's easy once it's parsed. If you look at #258 you'll see that's exactly what I do.
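As a rough idea of what "parsing it yourself" involves, the sketch below decodes the source text of a plain string literal (the kind of text `Literal::to_string()` returns) into a `String`. It is an illustration only and not necessarily how #258 does it. The name `parse_str_literal` is hypothetical, and raw strings and line-continuation escapes are deliberately left out.

```rust
/// Decodes the source text of a plain (non-raw) string literal, e.g. `"a\n"`.
/// Returns `None` for anything this sketch does not handle.
fn parse_str_literal(src: &str) -> Option<String> {
    let inner = src.strip_prefix('"')?.strip_suffix('"')?;
    let mut out = String::new();
    let mut chars = inner.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Unescaped characters, including non-ASCII ones, are copied as-is.
            out.push(c);
            continue;
        }
        match chars.next()? {
            'n' => out.push('\n'),
            'r' => out.push('\r'),
            't' => out.push('\t'),
            '0' => out.push('\0'),
            '\\' => out.push('\\'),
            '"' => out.push('"'),
            '\'' => out.push('\''),
            'x' => {
                // `\xNN`: exactly two hex digits, at most 0x7F in a `str` literal.
                let hi = chars.next()?.to_digit(16)?;
                let lo = chars.next()?.to_digit(16)?;
                let value = hi * 16 + lo;
                if value > 0x7F {
                    return None;
                }
                out.push(char::from_u32(value)?);
            }
            'u' => {
                // `\u{...}`: one to six hex digits between braces.
                if chars.next()? != '{' {
                    return None;
                }
                let (mut value, mut digits) = (0u32, 0usize);
                loop {
                    match chars.next()? {
                        '}' => break,
                        '_' => continue,
                        d => {
                            value = value * 16 + d.to_digit(16)?;
                            digits += 1;
                            if digits > 6 {
                                return None;
                            }
                        }
                    }
                }
                if digits == 0 {
                    return None;
                }
                // Rejects surrogates and values above U+10FFFF.
                out.push(char::from_u32(value)?);
            }
            _ => return None,
        }
    }
    Some(out)
}
```

Once the literal is decoded this way, its `as_bytes()` can be handed on as a byte string, which is the easy part mentioned above.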

bjorn3 commented 3 years ago

Ah, I see what you mean. If licensing allows it, copying the relevant code from syn or rustc_ast would be the easiest solution.

nbdd0121 commented 3 years ago

Are there scenarios where we need to take non-UTF-8 strings?

ojeda commented 3 years ago

I do not know, but even if we happen to need it, I do not think we should worry about that for the time being, in particular if it makes things more complex.

y86-dev commented 1 month ago

Stale issue: this is already the case.