rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.55k stars 12.61k forks source link

Tracking issue for future-incompatibility lint `tyvar_behind_raw_pointer` #46906

Open mikeyhew opened 6 years ago

mikeyhew commented 6 years ago

This is the summary issue for the tyvar_behind_raw_pointer future-compatibility warning and other related errors. The goal of this page is describe why this change was made and how you can fix code that is affected by it. It also provides a place to ask questions or register a complaint if you feel the change should not be made. For more information on the policy around future-compatibility warnings, see our breaking change policy guidelines.

What is the warning for?

This warning occurs when you call a method on a value whose type is a raw pointer to an unknown type (aka *const _ or *mut _), or a type that dereferences to one of these.

The most common case for this is casts, for example:

        let s = libc::getenv(k.as_ptr()) as *const _;
        s.is_null()

This can be fixed by giving a type annotation to the cast:

        let s = libc::getenv(k.as_ptr()) as *const libc::c_char;
        s.is_null()

The reason that this is going to become an error is because we are working on enabling arbitrary self types. Using that, you should be able to write functions such as:

#![feature(arbitrary_self_types)]

struct MyType;

impl MyType {
    fn is_null(self: *mut Self) -> bool {
        println!("?");
        true
    }
}

While that case is obviously silly, we can't prevent a sibling crate from implementing it, and such a function would make a call to s.is_null() when the type of s is an *mut _ ambiguous. Therefore, to avoid that potential breakage, you have to annotate the type of your raw pointer before the point of the method call.

After you fix these warnings, if you are working with raw pointers on nightly, you might want to check out #![feature(arbitrary_self_types)] yourself! It even works with trait objects:

#![feature(arbitrary_self_types, dyn_trait)]
trait Foo {
    fn example(self: *mut Self);
}

struct S;
impl Foo for S {
    fn example(self: *mut Self) {
        println!("Hello, I'm at {:?}", self);
    }
}

fn foo(foo: *mut dyn Foo) {
    foo.example();
}

fn main() {
    // This is 100% safe and not UB, even through I am calling `foo`
    // with a null pointer - this is a *raw* null pointer, and these are
    // always ok.
    foo(0 as *mut S);
}

While I'm at it, arbitrary_self_types also works for smart pointers, such as Rc<T> or Arc<T> (however, unfortunately we still have not figured out the best way to make it work with smart pointers to trait objects, so you can't yet create dyn Bar trait objects. We are planning on making it eventually work, so stay tuned!).

#![feature(arbitrary_self_types)]

use std::rc::Rc;

trait Bar {
    fn example(self: &Rc<Self>);
}

struct S;
impl Bar for S {
    fn example(self: &Rc<Self>) {
        println!("Hi I'm called on an Rc.");
        let _x = self.clone(); // You can call Rc::clone on Self!
    }
}

fn main() {
    Rc::new(S).example();
}

When will this warning become a hard error?

At the beginning of each 6-week release cycle, the Rust compiler team will review the set of outstanding future compatibility warnings and nominate some of them for Final Comment Period. Toward the end of the cycle, we will review any comments and make a final determination whether to convert the warning into a hard error or remove it entirely.

mikeyhew commented 6 years ago

@arielb1 thanks for editing with the detailed explanation!

SimonSapin commented 6 years ago

This warning occurs when you call a method on a value whose type is a raw pointer to an unknown type (aka *const _ or *mut _), or a type that dereferences to one of these.

(Emphasis added.)

I feel that this latter case is overly zealous. A pervasive pattern in Servo’s DOM can be simplified to this:

fn main() {
    // T = *mut _
    let mut constructor = RootedGuard::new(std::ptr::null_mut());

    // "warning: the type of this value must be known in this context"
    // But is this really ambiguous? Or does the warning have a false positive?
    let handle = constructor.handle_mut();

    // Type inference establishes `T = *mut JSObject` here.
    get_constructor_object_from_local_name(handle);
}

pub struct RootedGuard<T>(T);
pub struct MutableHandle<T>(T);
pub struct JSObject;

impl<T: Clone> RootedGuard<T> {
    pub fn new(x: T) -> Self { RootedGuard(x) }
    pub fn handle_mut(&mut self) -> MutableHandle<T> { unimplemented!() }
}

impl<T> std::ops::Deref for RootedGuard<T> {
    type Target = T;
    fn deref(&self) -> &T { &self.0 }
}

fn get_constructor_object_from_local_name(_: MutableHandle<*mut JSObject>) {}

As of 0a3761e63 the handle_mut call causes a tyvar_behind_raw_pointer warning which does not seem justified: handle_mut is an inherent method of RootedGuard<T> and so will always be picked before any method of T through auto-deref. arbitrary_self_types won’t change that as far as I can tell.

Please consider not warning in this case. This example is easy enough to fix (for example by replacing null_mut() with null_mut::<JSObject>()), but Servo has 1871 instances of this warning and adding as many type annotation would be a huge pain if they indeed don’t actually remove any ambiguity.

mikeyhew commented 6 years ago

@SimonSapin

handle_mut is an inherent method of RootedGuard and so will always be picked before any method of T through auto-deref

During method lookup, T is *mut _. That _ could theoretically resolve to a type that has a method fn handle_mut(self: &mut RootedGuard<*mut Self>), which would result an an ambiguity on method lookup. Not to say that defining methods like this is very useful, but that's what's currently allowed with arbitrary_self_types.

Type inference establishes `T = *mut JSObject` here.
get_constructor_object_from_local_name(handle);

It's unfortunate that the type information from this function call isn't available during the earlier method lookup. I don't know much about how type inference works in the compiler, but it would be nice if cases like this could be improved.

@arielb1 any thoughts on a possible solution here?

nikomatsakis commented 6 years ago

So, my thoughts are these:

There is really not much we can do to get more type information. We already do basically exhaust what remedies we have.

To get more, we'd have to do some big refactorings on the way that method lookup works. I'd like to do this, but it seems pretty distinct from this PR. We'd basically have to make method dispatch "delayable" until more type information is available.

I am a bit nervous about making this change however. I feel like we've seen a fair amount of code affected, though I don't have a real sample. Were any crater runs done to provide actual measurements?

At the worst, we could do the change only behind a feature-gate, and then use an epoch to enable access to custom self types.

mikeyhew commented 6 years ago

@nikomatsakis no, no crater runs were done. It would definitely be good to do one

nikomatsakis commented 6 years ago

@mikeyhew to that end, maybe you can prepare a PR that sets the "default level" for the lint to Deny, and then we can run a crater run?

mikeyhew commented 6 years ago

@nikomatsakis done, see https://github.com/rust-lang/rust/pull/47227

bluss commented 6 years ago

I've fixed this in 4 crates, (some of them before the crater run), so I'd rate it as something that comes up pretty often. It's important to prolong the warning period if it is so widespread.

In arrayvec and ndarray, the way to go is to replace casts with coercions.. It's a clear win for code quality because the coercion is much stricter about which conversions it allows. Like this:

-                let p = self.get_unchecked_mut(index) as *mut _;
+                let p: *mut _ = self.get_unchecked_mut(index);
                 ptr::copy(p, p.offset(1), len - index);
nikomatsakis commented 6 years ago

@bluss huh. I might rate that as a bug in terms of how casts are handled (I know why they are the way they are, but we might consider changing it)

I'm still having mixed feels about this change, although I do agree it makes for increased consistency. The crater run reports (thanks @mikeyhew!) definitely had a lot of affected code.

mikeyhew commented 6 years ago

@nikomatsakis (comment on the crater run PR)

This merits some discussion and consideration, I think. Of course, the future compatibility warning may lead some people to migrate, but we may want to think about some special method resolution rules -- for example, giving inherent methods defined on *mut T some sort of special precedence.

Giving the inherent pointer methods priority would improves things a lot. Is there anything obviously bad about doing that?

nikomatsakis commented 6 years ago

@mikeyhew

Giving the inherent pointer methods priority would improves things a lot. Is there anything obviously bad about doing that?

Not really. Just that it's an odd special case. We could also make this something that the --epoch switch controls.

nikomatsakis commented 6 years ago

Not that this switch exists yet =)

Manishearth commented 6 years ago

I kinda feel like this should go through epochs given that they exist precisely for this purpose anyway. Old epoch code cannot call methods with arbitrary receivers.

nikomatsakis commented 6 years ago

@Manishearth I think I agree. It's an interesting case though. We need to start doing some work on prototyping epochs, I think (e.g., adding a -Zepoch flag and epoch = 2018 thing for Cargo). I've been meaning to write up some kind of specific proposal there for some time.

Manishearth commented 6 years ago

I actually intended for that to be my next rustc side hacking project once im done with intra doc links and the clone thing. Might be able to get to it in February -- already have an idea of how the impl will go, but if you have thoughts lmk.

This specific bug might be useful as a testing flag for the epoch, i.e. make this warning a hard error in epoch 2018.

SSheldon commented 6 years ago

I've got a case that hits this lint without using any casting and can be minimized down to:

fn make_vec() -> Vec<i32> {
    let mut dest = Vec::with_capacity(1);
    unsafe {
        ptr::write(dest.as_mut_ptr().offset(0), 17);
        dest.set_len(1);
    }
    dest
}

The warning disappears if you don't call offset(0) of if you add an explicit type for dest.

It seems a little weird because apparently the compiler can infer that dest is a Vec<i32>, in which case as_mut_ptr must return *mut i32 and then offset must return *mut i32, so it doesn't seem like a raw pointer to an unknown type.

Is this an error in the lint?

arielb1 commented 6 years ago

@SSheldon

The compiler can't infer that dest is Vec<i32> (as opposed to Vec<_>).

Consider that someone could have implemented this function:

pub struct MyType;

impl MyType {
    pub fn offset(self: *mut Self, _offset: usize) -> *mut i32 {
        /* .. */
        1 as *mut i32
    }
}

That would obviously cause an ambiguity as dest could also be a Vec<MyType>.

Method lookup is conservative when it comes to adding inherent impls to unrelated types (as opposed to adding trait impls to existing traits), and therefore requires clarification.

vext01 commented 4 years ago

I've got a case that hits this lint without using any casting.

I've just encountered the same issue on today's nightly:

    /// Creates an empty vector with capacity for `cap` values.                 
    pub fn with_capacity(cap: usize) -> Self {                                  
        let raw = RawVec::with_capacity(cap);                                   
        for i in 0..raw.capacity() {                                            
            unsafe {                                                            
                ptr::write(raw.ptr().offset(i as isize), T::default());         
            }                                                                   
        }                                                                       
        DefaultVec { raw }                                                      
    }

Gives:

warning: type annotations needed
  --> src/lib.rs:26:38
   |
26 |                 ptr::write(raw.ptr().offset(i as isize), T::default());
   |                                      ^^^^^^
   |
   = note: `#[warn(tyvar_behind_raw_pointer)]` on by default
   = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in the 2018 edition!
   = note: for more information, see issue #46906 <https://github.com/rust-lang/rust/issues/46906>

As @arielb1 says, you can fix it by adding type annotations:

let raw: RawVec<T> = RawVec::with_capacity(cap);

But I think the error message was confusing. I'd have known what to do if it had said:

warning: type annotations needed
  --> src/lib.rs:23:..
   |
23 |                 let raw = RawVec::with_capacity(cap);
   |                     ^^^
   |
nikomatsakis commented 4 years ago

It's probably time to move towards closing this compatibility lint, and maybe we can improve the diagnostics at the same time.

orowith2os commented 11 months ago

It appears I've come across this issue, though I don't get why it's happening despite me providing type annotations:

#[repr(C)]
struct Foo {
    a: u8,
    b: u8,
    c: u8,
    d: u8,
}

fn main() {
    let var = Foo { a: 1, b: 2, c: 3, d: 4 };
    let raw_pointer_array = &var as *const Foo as *const u8;
    let goofy_array: [u8; 4] = unsafe { raw_pointer_array.cast().read() };
    println!("{:?}", goofy_array);
}
lcnr commented 11 months ago

it happens because the type of raw_pointer_array.cast() is *const _, i.e. a type variable behind a raw pointer. The lint is then emitted while selecting the method for read

orowith2os commented 11 months ago

It seems reasonable for it to infer from .read() that I want the type of .cast() to be *const [u8; 4]. If I'm not mistaken, this also happens in some similar capacity in type inference elsewhere too.