Open chinedufn opened 5 months ago
Also tagging T-lang on this because this involves user-observable changes to the dead_code
lint. Feel free to readjust as appropriate.
Just updated to the latest nightly and came across another similar regression:
// src/lib.rs
pub mod module {
#[repr(C)]
pub struct Foo(u32);
}
No warning on stable 1.79.0
Has a warning on latest nightly (I haven't bisected)
Stable (no warning) -> https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=08aa241f1e1544fae1c50f66d16c94f6 Nightly (warning) -> https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=9cacd97f9ce149c18cfd8f2e5052032a
Here is the output on nightly-2024-07-02
:
Compiling playground v0.0.1 (/playground)
warning: struct `Foo` is never constructed
--> src/main.rs:4:16
|
4 | pub struct Foo(u32);
| ^^^
|
= note: `#[warn(dead_code)]` on by default
warning: `playground` (bin "playground") generated 1 warning
Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.38s
Running `target/debug/playground`
Ah, actually the pub mod my_mod { .. }
wrapper isn't necessary to reproduce the warning.
Here's an illustration of code that we don't want to lead to a warning, but does on latest nightly.
// crate `foo`'s src/lib.rs
#[repr(C)]
pub struct Foo(u32);
impl Foo{
pub fn print(&self) {
println!("{}", self.0)
}
}
It isn't dead code because another crate might use Foo
like this:
// In another crate that depends on crate `foo` above
#[no_mangle)
pub extern "C" fn print_foo(foo: Foo) {
foo.print();
}
Regarding the original report, this doesn't seem related to extern "C"
. E.g., this also warns:
mod private_mod {
pub enum MyEnum {
Variant(u32)
//~^ WARN field `0` is never read
}
pub fn some_function() -> MyEnum {
MyEnum::Variant(5)
}
}
fn main() {
let _ = private_mod::some_function();
}
And the warning is correct. The field is never read. If you read the field, the warning goes away, both here and in your original example.
Regarding your second example, the warning there also seems correct. The Foo
struct has a private field, and it's never constructed. If you make that field public, the warning disappears:
// lib.rs
pub struct Foo(pub u32); //~ No warning
If this analysis understands your report correctly and you agree there's no issue here, then please close the issue.
The analysis misunderstands the problem.
In both my first and second example there is a dead code warning for code that is not dead (note that this is the dead_code
lint emitting the warning).
The reason that the extern "C"
is important is that it's what makes the field alive.
In your example code that removes the #[repr(C)]
the enum's field is never read.
If you add #[repr(C)]
and pass it over FFI then the field might be read by a caller (i.e. a C program) that Rust cannot see.
Recall that the error says:
warning: field `0` is never read
When the #[repr(C)]
is present, Rust cannot know this.
For instance, a C program might be making use of that field 0
.
I'm going to add a #[repr(C)]
to your pub struct Foo(pub u32)
to fully illustrate why it does not solve the problem.
mod foo {
#[repr(C)]
pub struct Foo(pub u32);
impl Foo {
pub fn new(val: u32) -> Self {
Self(val)
}
}
pub extern "C" fn return_foo() -> Foo {
Foo(5);
}
// Doubles the value held inside `Foo`
extern "C" {
fn double_foo(foo: Foo) -> Foo;
}
}
fn main() {
let foo = Foo::new(10);
// Good. We only want Rust callers to be able to double `Foo`.
// We don't want them to be able to do anything else.
let foo = unsafe { foo::double_foo(foo) };
// BAD: We don't want `Foo`'s internal to be accessible outside of `mod foo`.
// To solve this, we need to change `pub struct Foo(pub u32)` to `pub struct Foo(u32)`.
foo.0 = 18;
}
In this example your suggestion changes the semantics of the program.
Foo(pub u32)
has broken our intended invariant (Rust users are only able to double the inner value).
The simplify:
The reason that this bazz
function below is not dead code is that it might be called by a caller that rustc
cannot see.
// src/lib.rs
pub function bazz() {}
Similarly, the reason that the #[repr(C)]
enum variant is alive in my examples above is alive is that it might be used by a caller that rustc
cannot see.
Note that this gives us warning:
mod foo {
#[repr(C)]
pub enum MyEnum {
Variant(u32)
}
#[no_mangle]
pub extern "C" fn make_enum() -> MyEnum {
MyEnum::Variant(10)
}
}
fn main() {
But this does not warn:
// src/lib.rs
pub enum MyEnum {
Variant(u32)
}
In the second example rustc
is properly recognizing that MyEnum::Variant
might be used by a downstream crate.
Similarly, the pub extern "C" fn foo() -> MyEnum {}
with #[repr(C)] enum MyEnum { Variant(u32) }
means that MyEnum::Variant
might be used by some FFI caller.
Hoping that all of the above clearly illustrates the problem.
Regarding your first example, is it correct to say that you believe that this should not warn?:
// lib.rs
#[repr(C, u8)]
enum MyEnum {
Variant { inner: u8 },
//~^ WARN field `inner` is never read
}
#[no_mangle]
extern "C" fn some_function() -> MyEnum {
MyEnum::Variant { inner: 0 }
}
...under the interpretation that the #[no_mangle]
should act like #[used]
and ensure that some_function
ends up in and exported in the object file, and that some program could link this object and some C function could therefore call some_function
, get a value of type MyEnum
, and, under the RFC 2195 semantics, read the field in that variant?
(According to Godbolt, this program has always produced a warning, which isn't to say whether that's correct or not.)
Regarding your second example, is it correct to say that you believe that this should not warn:
// lib.rs
#[repr(transparent)]
pub struct Foo {
//~^ WARN struct `Foo` is never constructed
inner: u8,
}
impl Foo {
pub fn print(&self) {
println!("{}", self.inner)
}
}
...because a downstream crate could do this?:
fn main() {
let x: Foo = unsafe { core::mem::transmute(0u8) };
x.print();
}
Note that this warning is in beta as well as nightly.
You are getting //~^ WARN field inner is never read
because the function isn't public.
The following does not give a warning:
[toolchain]
# arbitrarily chosen nightly from before this regression
channel = "nightly-2024-03-01"
// src/lib.rs
// This DOES NOT give a warning (`nightly-2024-03-01`)
#[repr(C, u8)]
pub enum MyEnum {
Variant { inner: u8 },
}
#[no_mangle]
pub extern "C" fn some_function() -> MyEnum {
MyEnum::Variant { inner: 0 }
}
I also checked nightly-2018-03-01
and the code block above (... Variant { inner: u8 }
...) did not produce a warning.
So, this has all worked properly for at least the last 6 years until a recent regression.
For your second question about the #[repr(transparent)]
struct:
According to the nomicon, "the goal of repr(transparent) is to make it possible to transmute between the single field and the struct/enum".
So, since #[repr(transparent)]
exists to facilitate transmutations, if a type #[repr(transparent)]
type is publicly accessible then its field should be considered alive.
#[repr(transparent)]
signals that we intend for the type's field to be accessed in a way that rustc
cannot see.
More simply:
#[repr(transparent)]
as #[used]
if the type CAN be accessed outside of the crate#[repr(transparent)]
as #[used]
if the type CAN NOT be accessed outside of the crate.Note that your #[repr(transparent)]
example warns in nightly-2024-07-01
but does not warn in nightly-2024-03-01
(I chose those two dates 2024-03-01
arbitrarily).
The following does not warn in nightly-2024-03-01
but does warn on nightly-2024-07-01
.
// lib.rs
pub struct Foo {
inner: u8,
}
impl Foo {
pub fn print(&self) {
println!("{}", self.inner)
}
}
This is an improvement. inner
should be considered dead here.
The following would solve the original issue:
MyType
is used as an argument or return value in a #[no_mangle] pub extern "C" my_func(..) -> ..
, then all of MyType
's fields are aliveThen to fix #[repr(transparent)]
liveness checks:
#[repr(transparent) FooType
is visible outside of the crate then FooType
's field is aliveRegarding the first example, with your revision:
// lib.rs
#[repr(C, u8)]
pub enum MyEnum {
Variant { inner: u8 },
}
#[no_mangle]
pub extern "C" fn some_function() -> MyEnum {
MyEnum::Variant { inner: 0 }
}
This does not warn in the latest nightly either:
So I'm unclear what you're trying to say here, and I'm still curious if you think the version I gave should warn, as that's spiritually similar to your original example that used a private module.
Regarding the second example:
// lib.rs
#[repr(transparent)]
pub struct Foo {
//~^ WARN struct `Foo` is never constructed
inner: u8,
}
impl Foo {
pub fn print(&self) {
println!("{}", self.inner)
}
}
Since inner
is not a public field, it seems valid to me for something within this crate to construct a value of type Foo
with unsafe { core::mem::transmute::<u8, Foo>(0) }
, but it doesn't seem valid for another crate to do that. You suggest that this should be valid:
So, since
#[repr(transparent)]
exists to facilitate transmutations, if a type#[repr(transparent)]
type is publicly accessible then its field should be considered alive.
Can you point us at any documentation or earlier decisions that we've made that would suggest that this should be supported?
Oops, I meant:
// warns on 2024-07-01
// does not warn on 2024-03-01
mod private_mod {
#[repr(C, u8)]
pub enum MyEnum {
Variant { inner: u8 },
}
#[no_mangle]
pub extern "C" fn some_function() -> MyEnum {
MyEnum::Variant { inner: 0 }
}
}
The above warns in 2024-07-01
but not in 2024-03-01
.
Regarding your question about #[no_mangle] extern "C" fn some_function() -> MyEnum
(without the pub keyword)
some_function
is not reachable outside the cratesome_function
is not reachable outside the crateI don't not whether a non-pub #[no_mangle] extern "C" fn some_function() -> MyEnum
is stripped from the final library.
I can't point to earlier documentation or decisions.
Your counterpoint is reasonable. I would have to think more deeply about this to know where I stand.
This is not a problem that I have faced. I do not currently have evidence of real code that would run into this #[repr(transparent)]
warning.
I'd like to remain focused on #[repr(C)]
types that are exposed in a pub extern "C' fn
.
Reasons:
#[allow(unused)]
from my code generation.So, here is exactly what I would prefer this issue to remain focused on:
// This should not warn
mod private_mod {
#[repr(C)]
pub enum MyEnum {
Variant(u8)
}
#[no_mangle]
pub extern "C" fn some_function() -> MyEnum {
MyEnum::Variant(5)
}
}
If we agree that this should not produce a warning, then I'd like to move forward to selecting a solution.
If you believe that this should produce a warning, please state why. I am unclear on where you stand on this specific example.
Here's a comparison of the behaviors of some different cases:
I would expect E1
, E2
, S1
, and S2
to have the same behavior, whatever that is. And similarly, I'd expect E3
, E4
, S3
, and S4
to have the same behavior.
I'd be interested to hear whether there are any reasons that the first set (E1
, E2
, S1
, and S2
) and the last set (E3
, E4
, S3
, and S4
) should have different behaviors from each other (e.g. perhaps related to RFC 2145).
I'm interested to hear what others think here about the situation and the proper behavior.
cc @petrochenkov @shepmaster @WaffleLapkin
cc @mu001999
Before reading the comments deeply, these are my opinions:
enum MyEnum { Variant(u32) }
enum MyEnum { Variant { name: u32 } }
struct MyStruct(u32)
struct MyStruct { name: u32 }
#[repr(C)]
should have no impact on the dead_code
lint. repr(C)
only controls layout. You can have repr(C)
types that are not exported.extern "C"
should have no impact on the dead_code
lint. extern "..."
only controls ABI. You can have extern "C"
functions that are not exported.#[no_mangle]
should probably have an impact on the dead_code
lint. Note that #[no_mangle] fn some_function() {}
causes some_function
to be present in the assembly regardless of (a) debug / release mode, (b) if the function is pub
or not, (c) if the function is extern "C"
or not.There is a related issue (#126169) about the example in https://github.com/rust-lang/rust/issues/126706#issuecomment-2206882374
2. `#[repr(C)]` should have no impact on the `dead_code` lint. `repr(C)` only controls layout. You can have `repr(C)` types that are not exported.
@shepmaster IIUC, this is not true, there is some logic for repr(C)
in the dead_code
lint, you can search unconditionally_treat_fields_as_live
in the code.
In short, fields will be marked live if there is a repr(C)
. But I don't know why this doesn't work for the examples in the comments.
IIUC, this is not true, there is some logic for
repr(C)
in thedead_code
lint
You are right, I should have been more specific — I wish that the lint's logic did not involve repr
at all, but it already did and then we extended that.
@shepmaster oic, I agree with https://github.com/rust-lang/rust/issues/126706#issuecomment-2214137079. And in fact repr(C)
implies more semantics.
It seems like having #[no_mangle]
make a #[repr(C)]
type count as used
is uncontroversial so far?
As in
#[no_mangle] fn some_function(arg: ReprCType1) -> ReprCType2
ReprCType1
and ReprCType2
would never emit a dead code warning@mu001999 would you be willing to mentor me towards implementing this?
I could start by submitting a PR that either gets it working or gets close, and then you could review and answer any questions that I might leave for you in the PR?
Then if this issue reaches a clear consensus on the #[no_mangle]
change being okay, we can merge that PR.
It seems like having
#[no_mangle]
make a#[repr(C)]
type count asused
is uncontroversial so far?
This is not the mindset I would use: repr(C)
should have no bearing here. #[no_mangle]
should make the function it is attached to effectively public[^1]. Everything else should fall out from that[^2] — since we cannot tell that the function is unused, then we cannot tell that the types used by the function are unused and they should not be reported by the lint.
[^1]: Now, should it actually count as truly public? I don't know. For example, a pub
function with a non-pub
return type triggers a private-in-public warning. Should marking the function as #[no_mangle]
instead of pub
also warn that the type is more public than the function?
[^2]: I actually wonder if this would allow us to roll back some of the earlier decisions about how repr(C)
/ repr(transparent)
interact with the dead_code
lint. That is, does treating #[no_mangle]
as a usage site allow us to remove the special-casing of repr
?
Now, should it actually count as truly public? I don't know. For example, a
pub
function with a non-pub
return type triggers a private-in-public warning. Should marking the function as#[no_mangle]
instead ofpub
also warn that the type is more public than the function? ↩
It does seem unfortunate to have #[no_mangle]
act as a separate way (than pub
visibility) to make something crate-public. If I were reviewing code that relied on such a function being crate-public without it being crate-public according to visibility, I'd find that surprising and want that changed. I'm struggling to think of a good reason to do this, and intuitively, I kind of want to lint against it.
Shepmaster I'm not sure that we're on the same page.
Are you suggesting that once a type is exposed by a #[no_mangle]
function, we should always
consider all of that type's fields as publicly accessible?
For instance, are you suggesting that the following should not produce a "field 0
is never read`" warning?
// This SHOULD warn that field 0 is never read
mod private_mod {
pub enum MyEnum {
Variant(Vec<u8>)
}
#[no_mangle]
pub fn new_bar() -> MyEnum {
MyEnum::Variant(vec![5])
}
}
I am suggesting that the above code snippet should still produce a warning (of some sort .. maybe a different one ..),
under the idea that there is no reason to believe that field 0
is read.
If it is being read it's because the user is depending on an unspecified memory layout. Maybe the warning would be about that. Not sure. I see this as a separate issue/topic that would require some design work.
The reason that #[repr(C)]
is relevant is that the following should not give a warning: field
0is never read
:
// This SHOULD NOT warn that field 0 is never read
mod private_mod {
#[repr(C)]
pub enum MyEnumC {
Variant(u32)
}
#[no_mangle]
pub extern "C" fn new_bar() -> MyEnumC {
MyEnumC::Variant(5)
}
}
I am suggesting that in this second code snippet we have given enough of a signal for the linter to treat the u32
field as alive, since it can be read/used by a C caller.
So, it should treat the u32
field as alive.
The reason that #[repr(C)]
is important is that it is a very strong signal that all of the fields are publicly accessible to a C caller.
When you combine that with a #[no_mangle] extern "C" fn some_function
we are essentially saying that the types fields are all readable by a C caller, and visible to a C caller, so all of the fields are alive.
I'm treating #[repr(C)] + #[no_mangle]
as a strong-enough signal of certain usage patterns that it should be a special case where the fields are obviously alive.
The other cases like having no repr at all, having #[repr(something_else)]
seem to draw more debate/considerations, hence me wanting to treat them separately.
Of course I'm a bit biased towards carving out a solution for #[repr(C)]
since it is the real papercut that I am facing in real code, but after seeing all of the discussion in some of the linked issues I do think it's reasonable to separate out the uncontroversial cases from the controversial ones so that some amount of progress can be made.
If you can point to where you disagree with the above then I think we'll be on the same page.
It seems like you've addressed something like this before, so feel free to link me to previous discussions and I'll happily read them.
would you be willing to mentor me towards implementing this?
@chinedufn thanks, I am not the member of compiler team so I don't have the access to approve. But I will have a look if you open such a PR.
I think #[no_mangle]
implies the function may be called by external code, but #[repr(C)]
implies another thing that the type will also be used by external code but not only constructed. Without #[repr(C)]
, the type may only be constructed externally and used in Rust crate, this allows us to lint such never-read fields.
So #[no_mangle] + #[repr(C)]
suppresses the lint for never-read fields looks good to me.
And for current dead code lint, #[no_mangle]
would mark the type live and then #[repr(C)]
would mark all the fields live. I think it's a regression and fixing it for private Enums may be good enough.
I'm not entirely sure, but it seems like https://github.com/rust-lang/rust/pull/128404 will close this issue.
Conversation that led to that PR -> https://github.com/rust-lang/rust/pull/127104#issuecomment-2258493827
I'm not entirely sure, but it seems like https://github.com/rust-lang/rust/pull/128404 will close this issue.
Conversation that led to that PR -> https://github.com/rust-lang/rust/pull/127104#issuecomment-2258493827
that's not true, those changes are not in 1.79
A repository that I maintain recently received a bug report about our proc macro generating a warning after they upgraded to
1.79.0
.I haven't bisected for the exact commit that introduces the warning, but I have been able to reproduce it in
1.79.0
.Minimal Reproduction
In the playground -> https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=3f5c81c0749130a6e46fbea14cbb055f
Potential Candidates
This comment links to some
rustc
commits that may have introduced the issue https://github.com/chinedufn/swift-bridge/pull/278#issuecomment-2179226690As mentioned, I haven't taken the time to bisect since I know that it's possible that a core maintainer already has a good sense of when/how this regression could have been introduced.