rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.98k stars 12.68k forks source link

Tracking issue for non_static_type_id #41875

Closed eddyb closed 4 years ago

eddyb commented 7 years ago

This is a tracking issue for the RFC "1849" (rust-lang/rfcs#1849).

Steps:

eddyb commented 7 years ago

Implementation instructions:

ghost commented 7 years ago

I'd like to participate on this :)

eddyb commented 7 years ago

@z1mvader Great! Don't forget to put "r? @eddyb" in your PR when you open it.

fcofdez commented 7 years ago

Is someone working on this issue? I would like to help with it.

ghost commented 7 years ago

Hi, I'm currently working on it. I'm working on the tests

kennytm commented 7 years ago

Could std::error::Error::{is, downcast_ref, downcast_mut} relax the 'static bound as well?

These currently require the 'static bound due to TypeId, and this makes it impossible to inspect the cause(), which just returns an Option<&Error> without 'static bound.

cc #35943

eddyb commented 7 years ago

@kennytm Those sound like they do the same thing as Any, in which case relaxing them is unsound (see comments on rust-lang/rfcs#1849 for some more details).

kennytm commented 7 years ago

@eddyb Thanks, I misread RFC. Sounds like doing this soundly will require &'a T and &'b T to encode to different TypeId, which is not possible because lifetime is not encoded in trans.

eddyb commented 7 years ago

Not exactly. Lifetimes are a bit... "infinite" in their multitude.

What you want (like I mentioned in the RFC comments) is a scheme to specify where each of N lifetimes go (e.g. Foo<'a> = &'a T for one lifetime), enforce that there are no other lifetimes (e.g. Foo<'static>: 'static), encode that in a TypeId (e.g. for<'a> fn(Foo<'a>) - note that this is 'static and the for<'a> is preserved), then have one Any-like trait for every number of lifetimes (e.g. out of Box<Foo<'a>> you create Box<Any1<'a>>) and a way to downcast by providing the constructor (Foo not Foo<'something>).

This scheme requires HKT/ATC or even messier generic traits with HRTB. The main downside is the manual impls necessary without some kind of type lambda.

Aceeri commented 7 years ago

@z1mvader Are you still working on this?

ghost commented 7 years ago

i thought @kennytm was. I no longer have time sorry :(

kennytm commented 7 years ago

@z1mvader o_O No I did not write any code on this, I was just making a comment in this thread πŸ˜…...

Aceeri commented 7 years ago

I'll be working on this tomorrow, just so anyone who is watching this issue knows.

Aceeri commented 7 years ago

Turns out this doesn't exactly help with my problem so I'm probably not going to be finishing my PR for this. Rustc also doesn't want to build properly on my machine, so I'd rather not waste too much time trying to fix it.

dtolnay commented 7 years ago

Here is a quick workaround for anyone waiting on this to be implemented.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug, Hash)]
pub struct TypeId {
    id: usize,
}

impl TypeId {
    pub fn of<T: ?Sized>() -> Self {
        TypeId {
            id: TypeId::of::<T> as usize,
        }
    }
}

Edit: Better:

use core::any::TypeId;
use core::marker::PhantomData;
use core::mem;

pub fn non_static_type_id<T: ?Sized>() -> TypeId {
    trait NonStaticAny {
        fn get_type_id(&self) -> TypeId where Self: 'static;
    }

    impl<T: ?Sized> NonStaticAny for PhantomData<T> {
        fn get_type_id(&self) -> TypeId where Self: 'static {
            TypeId::of::<T>()
        }
    }

    let phantom_data = PhantomData::<T>;
    NonStaticAny::get_type_id(unsafe {
        mem::transmute::<&dyn NonStaticAny, &(dyn NonStaticAny + 'static)>(&phantom_data)
    })
}
eddyb commented 7 years ago

@dtolnay How... is that guaranteed to work, at all? Not to mention that having TypeId work, instead of just the intrinsic, suggests potential misuse. Do you want it for something Any-like? In that case, you're likely building something unsound.

dtolnay commented 7 years ago

It worked great when I tried it :man_shrugging:. Here is the intrinsic version.

pub fn type_id<T: ?Sized>() -> u64 {
    type_id::<T> as usize as u64
}
kennytm commented 7 years ago

@dtolnay I don't see how that's gonna work besides causing infinite recursion πŸ˜„

eddyb commented 7 years ago

Note that it doesn't call, it casts a function pointer. But I expect mergefunc (-C opt-level=3 enables that optimization, if we still have it) might want to collapse all of those instances into one, and this clearly can't work cross-crate.

dtolnay commented 7 years ago

Seems to work with -C opt-level=3, and I don't need it cross-crate. :confetti_ball:

eddyb commented 7 years ago

@dtolnay Still, there is nothing guaranteeing this won't be optimized away - if mergefunc is enabled right now, I'd even consider its failure to collapse those instances as a bug. I consider it a bit irresponsible to suggest it to people who might write unsafe code off of it.

zackmdavis commented 7 years ago

Hm. I put together a patch for this, but when I actually run the tests, the "check that &'a instead of &'static doesn't change the type id" part fails ("does not fulfill the required lifetime"/"note: type must satisfy the static lifetime"β€”this despite the other test changes involving a type with an explicit non-static lifetime apparently succeeding). Doubtlessly I've misunderstood something.

eddyb commented 7 years ago

The point is to help use cases where the lifetimes simply do no matter. Given how rare this is I am tempted to ask to undo the RFC merge.

KodrAus commented 7 years ago

@zackmdavis The issue there is that we haven't changed the 'static requirement on the TypeId type (which we don't want to do), only on direct calls to the type_id intrinsic. So that test would need to call type_id::<ContainsRef<'a>>() instead of TypeId::of::<ContainsRef<'a>>().

@eddyb I'd like this to be available, but would be open to discussion around reverting so long as we had a clear path forward for getting some kind of type id where lifetimes are erased and/or considered. You've already described a possible scheme at the TypeId layer. I just don't really want us to take a complete step backwards.

zackmdavis commented 7 years ago

I see. But the lifetimes don't actually seem to be ignored? When I call the type_id instrinsic (rather than going through of) my test compiles, but fails.

diff --git a/src/test/run-pass/typeid-intrinsic.rs b/src/test/run-pass/typeid-intrinsic.rs
index 3becb8c..3fc2737 100644
--- a/src/test/run-pass/typeid-intrinsic.rs
+++ b/src/test/run-pass/typeid-intrinsic.rs
@@ -18,6 +18,7 @@ extern crate typeid_intrinsic_aux2 as other2;

 use std::hash::{SipHasher, Hasher, Hash};
 use std::any::TypeId;
+use std::intrinsics::type_id;

 struct A;
 struct Test;
@@ -98,10 +99,10 @@ pub fn main() {
     assert!(TypeId::of::<fn(fn(A) -> A) -> A>() !=
             TypeId::of::<fn(fn() -> A, A) -> A>());

-    // non-static lifetimes are OK but ignored
+    // non-static lifetimes are OK but ignored for the raw type_id intrinsic
     fn non_static<'a>(arg: &'a str) {
-        assert_eq!(TypeId::of::<ContainsRef<'a>>(),
-                   TypeId::of::<ContainsRef<'static>>());
+        assert_eq!(type_id::<ContainsRef<'a>>(),
+                   type_id::<ContainsRef<'static>>());
     }

fails with

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `TypeId { t: 10515927993715687232 }`,
 right: `TypeId { t: 3434146333182277806 }`', /home/ubuntu/rust/src/test/run-pass/typeid-intrinsic.rs:56:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.
eddyb commented 7 years ago

Whoa, that looks bad. cc @michaelwoerister

michaelwoerister commented 7 years ago

Looks like a bug in the hashing code.

zackmdavis commented 7 years ago

Test still fails after rebasing on master including #43743 (zackmdavis/rust@8368e7ab2b).

------------------------------------------
stderr:
------------------------------------------
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `TypeId { t: 10515927993715687232 }`,
 right: `TypeId { t: 3434146333182277806 }`', /home/ubuntu/rust/src/test/run-pass/typeid-intrinsic.rs:56:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.
leoyvens commented 6 years ago

Instead of reverting, we could do a more useful version of this that reveals whether type_id is from a static or not. For example by returning a Result<u64, u64> where Ok means it's static and Err means it's not static. This would allow a library to build a safe version of Any that still works only with static with types, but that can dynamically check whether the type is 'static, like this:

pub struct TypeId(u64);

impl TypeId {
    // No 'static bound, return is Option.
    pub const fn of<T: ?Sized>() -> Option<TypeId> {
        TypeId(unsafe { intrinsics::type_id::<T>().ok() })
    }
}

// No 'static bound, return is Option.
pub trait Any { fn get_type_id(&self) -> Option<TypeId>; }

// No 'static bound.
impl<T: ?Sized> Any for T {
    fn get_type_id(&self) -> Option<TypeId> { TypeId::of::<T>() }
}

impl Any {
    pub fn is<T: Any>(&self) -> bool {
        // Dynamically checks if `Self` and `T` are static and are the same type.
        match (TypeId::of::<T>(), self.get_type_id()) {
            (Some(t_id), Some(this_id)) => t_id == this_id,
            _ => false,
        }
    }
    pub fn downcast_ref<T: Any>(&self) -> Option<&T> { /* Same as std `Any` */ }
    pub fn downcast_mut<T: Any>(&mut self) -> Option<&mut T> { /* Same as std `Any` */ }
}
eddyb commented 6 years ago

@leodasvacas Rust is lifetime-parametric which means you can't "query" the presence of lifetimes, only require relationships that cause errors if unsatisfied. Preserving this is blocking the trait impl specialization feature. See also @nikomatsakis's blog post.

leoyvens commented 6 years ago

@eddyb You mean that can't be implemented because lifetimes are erased before monomorphization? Not even a syntatic check that would give the monomorphized type the information of whether it contains any lifetimes (and whether they are all 'static)? It's a coarse check but can cover most of the cases.

eddyb commented 6 years ago

@leodasvacas I'm not talking about implementation details, but rather language semantics.

It's unsound to do anything with lifetime parameters or lifetimes in type parameters in any way not guaranteed by bounds in scope (and therefore required at all use sites). We erase lifetimes because we have to be parametric, not the other way around. There have been bugs where we accidentally weren't erasing lifetimes correctly, for example.

As for "why", that's a tougher question, since a subset of safe Rust could potentially work fine with some sort of "concrete" model for lifetimes (e.g. runtime dependent (stack_depth, generation)), instead of a parametric one, but abstractions built on unsafe code generally can't. One such example of an abstraction relying on parametricity is the indexing crate.

brunoczim commented 6 years ago

Would it be feasible generating changing the code generation for functions which call this intrinsic? rustc generates a compiled version of some function for each version of its type parameter. Could we make the compiler also generate a compiled version also for each lifetime parameter version of some function ONLY in function which this intrinsic is used?

The first issue that comes in my mind is that a function f which calls function g would have to fall into this rule if g does.

eddyb commented 6 years ago

@brunoczim No, this is not feasible, nor sound, see https://github.com/rust-lang/rust/issues/41875#issuecomment-373702647.

Mark-Simulacrum commented 6 years ago

Unmarking as E-easy/mentor since it seems like this is blocked on something? @eddyb might know more, I'm not sure.

eddyb commented 6 years ago

@Mark-Simulacrum Note https://github.com/rust-lang/rust/issues/41875#issuecomment-319558222 followed by all the comments about unsoundness.

mikeyhew commented 6 years ago

From my understanding after reading this thread, the only thing that this is actually blocked on is the bug in the hashing code that is causing @zackmdavis's test to fail.

mikeyhew commented 6 years ago

I just rebased @zackmdavis' PR, and the assertion is still failing. Is there an issue open for the hashing bug?

rodrimati1992 commented 5 years ago

i have a use case for this,where I do runtime layout checking,and I already checked that the lifetimes in type definitions are the same (referencing the same lifetime parameters/'static). I need this to more efficiently identify types when recursively checking that the layout of a pair of types is the same,to check that I haven't visited the same pair of types already. The way I do it right now is with the address of the const LAYOUT:&'static TypeLayout that I pass in,but because this is not guaranteed to always be the same memory location for every instanciation of the constant,it can result in rechecking the same pair of types multiple times.

rodrimati1992 commented 5 years ago

Nevermind,I was able to find a solution in stable Rust,it involved requiring every type to specify a type StaticEquivalent:'static; associated type.

Avi-D-coder commented 5 years ago

Will TypeId::of() be marked as a const fn?

Mark-Simulacrum commented 4 years ago

This was discussed in last Thursday's lang team meeting, though briefly, and the general consensus in the room (we did not have full lang quorum, though) was that the tracking issue here should be closed. Since the RFC was accepted, several implementation PRs arose, but no concrete and sound use cases were given. Those PRs have been closed as a result, and @eddyb has commented in several places that they're not sure we should have accepted the RFC originally.

The (only?) use case suggested on the RFC thread is using it for storing the type of a struct in a type-erased context. I will admit that it is not entirely clear to me why this is permissible. I believe the hand-wavy explanation is that the lifetimes in question are always tied to the arena (and these types aren't observable without said arena), so essentially it's like 'static in that way.

The original RFC was accepted on the basis that we would only land a change to the intrinsic itself (which is unstable). This was meant as a guard-rail against uses, but since we've seen people suggesting unsound uses from time to time at a sufficient rate that we're concerned that removing the guard rails even that much would be unhelpful.

Furthermore, the (sound) use cases for this are extremely rare, and workarounds to the problem do exist β€” and those work on stable; see the motivation section of the RFC.

ping @joshtriplett for the actual FCP

Avi-D-coder commented 4 years ago

I currently using a workaround, the tuple of (drop ptr, size, alignment). This works, but is not as unique and is not a general purpose solution.

With TypeId an arena could contain a single type. In my GC this would enable giving out an iterator over all live arenas of a type. Since arenas do not strictly contain a single type, I cannot currently do this.

Avi-D-coder commented 4 years ago

It's very hard to give multiple use cases until the feature is implemented.

I'm unclear about the failing test on the implementation PR, but if it's not intended to pass and the implementation is basically complete, I'd advocate it land on nightly and we let people try it out, before it's axed.

joshtriplett commented 4 years ago

@rfcbot close

rfcbot commented 4 years ago

Team member @joshtriplett has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

bluss commented 4 years ago

The use case I'd have in mind is that: Given arbitrary T: 'a, be able to verify (type equality) that it is a particular 'static type, for example f64.

Here's some example code for that. If you excuse that I'm using dtolnay's type id hack as a stand-in for the real solution: (playground link)

The use case is: Given a generic &[T], dispatch to special cases for a few cases, for example when T is exactly f32 or f64, both of which are 'static.

This is sound because: type id should be able to positively identify a non-reference, 'static, type, among arbitrary types(?). This could also be solved by: specialization. This can be worked around by: restricting the inputs to 'static types and using Any. It would be cool if I was missing something and that this was achievable by other means, but I'm not sure.

Even here - yes, it would be wrong to compare &f64 and &'static f64 for "type equality", but that's not a needed feature in this use case.

Mark-Simulacrum commented 4 years ago

@bluss I agree that if we can support such a thing, I would be in favor of doing it (as it solves the vast majority of specialization cases I personally am aware of without actually using specialization). But I think that's a separate request -- maybe you can file an issue?

To be honest, I myself am not 100% clear on the details of whether what you suggest could be supported today (without just implementing specialization) but if it can be I'm all for it :)

Mark-Simulacrum commented 4 years ago

@joshtriplett can you cancel and re-start FCP? It looks like T-compiler was tagged here which isn't necessary for FCP

RalfJung commented 4 years ago

@bluss

The use case I'd have in mind is that: Given arbitrary T: 'a, be able to verify (type equality) that it is a particular 'static type, for example f64.

You have to (at least) restrict this further to types that don't have any lifetimes anywhere. With T = &'static i32, we do have T: 'static, but type_id(T) == type_id(&'a i32), so it would not properly tell T apart from some non-'static types.