Open Darksonn opened 3 days ago
IIRC string constants are always null terminated in the binary anyways.
I don't think they are
it's a constant with known length, I wouldn't consider "computing the length" to be a problem. e.g.:
pub struct Location<'a> {
file_with_nul: &'a [u8],
line: u32,
column: u32,
}
impl<'a> Location<'a> {
pub fn file(&self) -> &'a str {
unsafe { str::from_utf8_unchecked(self.file_with_nul.get_unchecked(..self.file_with_nul.len() - 1)) }
}
pub fn file_cstr(&self) -> &'a CStr {
unsafe { CStr::from_bytes_with_nul_unchecked(self.file_with_nul) }
}
}
one thing that changes though is that if we get an API to set the implicit #[track_caller]
argument, that we'd have to pass a &'static [u8]
in that has both the terminating nul and is utf-8, instead of the much more common &str
it's a constant with known length, I wouldn't consider "computing the length" to be a problem.
But it becomes a problem if we later go through with the size optimization from https://github.com/rust-lang/rust/pull/117431. Then, the length is no longer known, so it really does have to be computed by calling strlen
or similar.
It seems unfortunate if all the track_caller
data in the binary needs to be bigger for everyone just because some people want to pass it to a random C API sometimes.
Could we have file_cstr!()
instead, so only people dealing in C strings need to deal with it? Yes, that's not as nice as track_caller
, but oh well?
if combined with the optimization in Location
's size, it's probably smaller to use nul-terminated strings, since each string is only needed for a whole file and only needs one more byte whereas the size field is duplicated for every tracked location and is either 4 or 8 extra bytes in each one.
Using null terminated strings may also unlock linker string merging size optimizations, which could further decrease binary size. It seems unlikely to me that anyone cares about the tiny size increase - those who really really care about location size are gonna use something like -Zlocation-detail=none anyways, which deletes all this info.
Could we have
file_cstr!()
instead, so only people dealing in C strings need to deal with it? Yes, that's not as nice astrack_caller
, but oh well?
It would make a lot of things that could otherwise be function calls into macros. :(
Proposal
Problem statement
When using
#[track_caller]
in codebases that mix C and Rust, you often wish to pass the caller's filename to a C api. However, this usually requires a nul-terminated string.Motivating examples or use cases
I would like to utilize this in the Linux kernel to implement a Rust equivalent of the following utility:
It's essentially an assertion that crashes the kernel if a function is used in the wrong context. The filename and line number is used in the error message when it fails. Unfortunately, the
__might_sleep
function requires the filename to be a nul-terminated string.Note that unlike with things like the
file!()
macro, it's impossible for us to do this ourselves statically. Copying the filename at runtime into another string to nul-terminate it is also not a great solution because we need to create the string even if the assertion doesn't fail, as the assertion is checked on the C side.Solution sketch
Add a new function
core::panic::Location::file_with_nul
that returns a&CStr
instead of a&str
.This has the implication that the compiler must now always store a nul-byte in the filename when generating the string constants.
Alternatives
It could make sense to return
*const c_char
instead of&CStr
to avoid having to compute the length when all you need is a pointer you can pass into C code. This could be important as possible future work involves reducing the size ofLocation
by removing the length. In this case, the existingcore::panic::Location::file
function would be updated to compute the length using the nul-terminator. Right now, the&CStr
return value forces us to compute the length even when we don't need it.Links and related work
An implementation can be found at https://github.com/rust-lang/rust/pull/131828.
For more context, please see zulip and the Linux kernel mailing list. This is one of RfL's wanted features in core.
Adding a nul-terminator to the
Location
string has been tried before in https://github.com/rust-lang/rust/pull/117431. However, back then, it was motivated by reducing the size ofLocation
, and the previous PR did not actually expose the c string in the API.What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
Second, if there's a concrete solution:
cc @ojeda @Noratrieb