Open alexcrichton opened 5 years ago
Here's an unbaked thought: Can we make an extension unsafe trait ErrorTypeIdExt
to Error
, seal that extension trait (meaning that users cannot implement it), and then provide a blanket implementation for Error
?
I think my unbaked idea is not implementable in current Rust:
trait Error {
...
fn as_dyn_any(&self) -> &dyn Any where Self: 'static { self as _ }
fn as_mut_dyn_any(&mut self) -> &mut dyn Any where Self: 'static { self as _ }
}
The only problem is we can't add a "where Self:Sized" bound to the "{ self as _ }" part.
It’s tempting to make Any
a super-trait of Error
, and rely on Any::type_id
. This would be sound because Any
already has a blanket impl that covers every possible impl, so it cannot be overridden.
However Any
requires 'static
but Error
doesn’t (only its TypeId
-related methods do), so this plan doesn’t work as-is.
@SimonSapin Wasn't relating Any's bound discussed at some point?
On unstable we have #[marker]
traits which cannot override anything in their impls -- if they were allowed to define associated items with defaults in their trait definition, it would be another way to do this, though that considered too large a change to make with just a PR in https://github.com/rust-lang/rust/pull/53693#issuecomment-416384949.
I'm not sure why this functionality should be left to the library at all. Maybe we really need a type_id_val
paired to the existing intrinsics::type_id()
? (Similar to size_of
and size_of_val
.)
@skade Yes, this was discussed to say that it would be unsound: https://github.com/rust-lang/rust/issues/41875#issuecomment-304518580
@cuviper This sounds doable, but would require a TypeId
value to be stored in every trait object vtable. This has some code size cost. The size and alignment of the underlying type (and pointer to drop_in_place<U>
) are already special-cased like this, to allow Box<dyn Trait>
to exist for any trait and be dropped/deallocated correctly. Having the type_id
method be part of the Error
trait puts (a way to get) the TypeId
in vtables for dyn Error
today.
If we add fn type_id_of_val<T: ?Sized>(x: &T) -> TypeId
we’d also need to carefully define its semantics. With foo: &dyn SomeTrait
coerced from &U
where U: SomeTrait
, what’s interesting is here TypeId::of<T>()
. But dyn SomeTrait
is also a type, so TypeId::of<dyn SomeTrait>()
also exists. If type_id_of_val(foo)
returns the former, what should type_id_of_val(bar)
return when T: Sized
type? When T
is a [X]
slice?
If https://github.com/rust-lang/rfcs/pull/2580 is accepted and implemented we could have T: std::ptr::Pointee<Metadata = &'static VTable>
to restrict a type parameter to dyn SomeTrait
trait objects. Or more simply for this case, a type_id
method on VTable
(which in that RFC can be accessed through the std::ptr::metadata
function.)
@SimonSapin
This sounds doable, but would require a
TypeId
value to be stored in every trait object vtable.
Yeah, I see -- I guess I'm basically proposing the equivalent of C++ RTTI and typeid
keyword. It could be an indirect pointer to TypeId::of
, saving a little on 32-bit targets, but that's still some growth.
But
dyn SomeTrait
is also a type, soTypeId::of<dyn SomeTrait>()
also exists.
In the context of replacing Error::type_id
, would this use case have some double-dyn
indirection? Like a dyn Error
object which has the vtable of dyn Error
itself? (which eventually forwards to a sized Error
.) If so, then I think a type_id_of_val
could peel back just the first layer.
If
type_id_of_val(foo)
returns the former, what shouldtype_id_of_val(bar)
return whenT: Sized
type? WhenT
is a[X]
slice?
T: Sized
would return TypeId::of<T>()
, and [X]
slices would return TypeId::of<[X]>()
. It's not really the Sized
-ness that's of concern here, just the trait object indirection.
Double dyn Trait
indirection is possible but that’s not what my previous comment was about. For trait objects we want a way to peel back one layer as you say. This is as opposed to peeling back zero layer, which for example for &dyn Error
would give TypeId::of<dyn Error>()
. But if type_id_of_val
is a function defined for all types, do we want special-cased semantics of peeling back one layer for trait objects and zero layer for other types (which don’t have such layers in the first place)?
I think this inconsistency in the proposed semantics make this not a good design. For an operation that only makes sense on trait objects, I’d prefer to have a type signature that requires a trait object. Which goes back to https://github.com/rust-lang/rfcs/pull/2580 again.
Other idea: can we replace the 'static
bound in the definition of the Any
trait (pub trait Any: 'static { … }
) with where Self: 'static
bounds on its methods, and implement it for all types? That way we could add Any
as a super-trait of Error
and have Any::type_id
in the vtables for dyn Error
.
This might be a breaking change because generic code with a T: Any
bound could not anymore assume T: 'static
.
This might be a breaking change because generic code with a T: Any bound could not anymore assume T: 'static.
I believe that'd break code like this:
fn box_me<T: Debug + Any>(t: T) -> Box<dyn Debug> {
Box::new(t)
}
But if
type_id_of_val
is a function defined for all types, do we want special-cased semantics of peeling back one layer for trait objects and zero layer for other types (which don’t have such layers in the first place)?
Isn't that the exact same variability as size_of_val
? It peels back zero layers for sized types, or one layer otherwise -- or so it seems here:
use std::mem::size_of_val;
fn main() {
let mut range = 0..10;
println!("range: {}", size_of_val(&range));
let mut iter: &mut dyn Iterator<Item = i32> = &mut range;
println!("iter: {}", size_of_val(iter));
let iter2: &mut dyn Iterator<Item = i32> = &mut iter;
println!("iter2: {}", size_of_val(iter2));
}
range: 8
iter: 8
iter2: 16
Let’s take a foo: &dyn Error
coerced from &mpsc::RecvError
.
size_of_val
has a consistent definition: always “peels” exactly one layer of &_
indirection, and does not care about type-erasure indirection. dyn Error
is a dynamically-sized type: is has a size that can be computed at run-time. Just like [_]
is dynamically-sized type.
“Dynamically-sized” is a notion that exists in the language, so the definition of size_of_val(foo)
does not need to involve RecvError
at all.
In fact this is the only possible answer for “the size of the value behind the &
reference”. For type_id
however, TypeId::of<dyn Error>
and TypeId::of<RecvError>
are both results that exist, but the simple “ID of the type behind the &
reference” definition yields the former.
This might be a breaking change because generic code with a T: Any bound could not anymore assume T: 'static.
Unfortunately, this is hardly practical, from a quick Rust source code search, there is a lot code using T: Any
bounds (without explicit 'static
), and then does TypeId::of::<T>()
or calls methods of Any
. This isn't isolated to a single crate, there are a lot of crates doing this.
Would be cool if it could be done, but unfortunately it cannot be. I guess what can be done is deprecating std::any::Any
and creating a new version of it (maybe call it Unknown
, inspired by TypeScript here, which essentially created unknown
as a replacement of its any
type) that doesn't have a 'static
bound, but I don't know if that should be done.
I was wondering why Error::type_id
needs to be a trait method at all. Generally, if you want to keep a trait method from being overriden, you have to make it a function instead. But in this case, it turns out we do need it to be a trait method, because we want to take an &dyn Error
and get the type id of the erased Self
type. An implementation returning TypeId::of<dyn Error>()
would be useless in this regard.
It might seem like a bit of a stretch to add new syntax, but since final
is already a reserved word maybe we could use it to indicate that a default trait method cannot be overridden?
trait Error {
final fn type_id() -> TypeId {
TypeId::of<Self>()
}
}
There's a similar idea for #[no_override]
in https://github.com/rust-lang/rfcs/issues/291.
This might be a breaking change because generic code with a T: Any bound could not anymore assume T: 'static.
Unfortunately, this is hardly practical, from a quick Rust source code search, there is a lot code using
T: Any
bounds (without explicit'static
), and then doesTypeId::of::<T>()
or calls methods ofAny
. This isn't isolated to a single crate, there are a lot of crates doing this.Would be cool if it could be done, but unfortunately it cannot be. I guess what can be done is deprecating
std::any::Any
and creating a new version of it (maybe call itUnknown
, inspired by TypeScript here, which essentially createdunknown
as a replacement of itsany
type) that doesn't have a'static
bound, but I don't know if that should be done.
I think Identifiable
might be a better name, since it describes the operation it supports (type_id
).
It helped me to think about the reason for stabilizing this at all. There is already a stable way to get the TypeId
or any static type, via TypeId::of::<T>()
. The only time I can think of that you'd want the result of this method instead is if the type is behind a trait object, since then TypeId::of::<&dyn Error>()
would be wrong (or couldn't work if not static). If that's the only time, then here's my suggestion:
Error::__type_id()
. Keep it perma-unstable. If it really needed to be stabilized anyways, it could be made unsafe
.type_id()
to impl dyn Error
which calls self.__type_id()
. Users can call type_id
whenever they have some &dyn Error
.The soundness issue is present whether or not the API is stable or not - plenty of people develop against nightly, and they shouldn't be able to cause memory unsafety by using a safe API either.
Then the __type_id
method could be made unsafe
anyways. The only one ever calling it would be inside this same module.
An alternative to making it unsafe
but still possible to implement (on nightly) is to use some privacy tricks to make it impossible to implement the __type_id
method outside of the module:
use self::internal::Internal;
trait Error {
/// Since you can never import the `Internal` type, you can never override this method.
fn __type_id(&self, _: Internal) -> TypeId where Self: 'static {
TypeId::of::<Self>()
}
}
impl dyn Error + 'static {
fn type_id(&self) -> TypeId {
self.__type_id(Internal)
}
}
mod internal {
pub struct Internal;
}
Making type_id
unsafe won't work because that's the wrong direction of unsafety - an unsafe method requires the caller to uphold some invariant rather than the implementor.
The private type approach should definitely work, though!
I opened up a PR using @seanmonstar's private type approach to patch the soundness hole for now: #60902
It helped me to think about the reason for stabilizing this at all.
This one is partly on me. After we did FCP to stabilize in https://github.com/rust-lang/rust/issues/27745, I grepped for unstable attributes that pointed there. I didn’t pay attention to the #[doc(hidden)]
attribute on Error::get_type_id
.
Perhaps this method was never intended to be user-visible, but it’s hard to say since this was four years ago (https://github.com/rust-lang/rust/pull/24793).
given that https://github.com/rust-lang/rust/pull/60902#issuecomment-493480171 says
This sounds like a good stopgap approach for mitigating the impact on nightly. In the meantime discussion can continue on the issue for how to stabilize long-term.
should this not be reopened?
Yes I think it's fine to reopen this as a tracking issue for stability.
I wouldn't want to stabilize this with the private type hack it currently uses, so I think a blocker on stabilizing this API is one of these two things:
what about something like:
pub unsafe trait ErrorTypeId {
fn type_id(&self) -> TypeId where Self: 'static {
TypeId::of::<Self>()
}
}
unsafe impl<T: ?Sized> ErrorTypeId for T {}
pub trait Error: Debug + Display + ErrorTypeId {
// ...
}
@programmerjake I've slightly extended this by making it not specific to Error and wrote up an RFC. https://github.com/rust-lang/rfcs/pull/2738
@programmerjake I've slightly extended this by making it not specific to Error and wrote up an RFC. rust-lang/rfcs#2738
@leo60228 Thanks!
What's going on on this?
trait method impl
restrictions rust-lang/rfcs#3678 would allow stabilizing this with a nice interface:
pub impl(self) fn type_id(&self) -> TypeId {
...
}
Updated Issue
This is a tracking issue for stabilizing the functionality of
Error::type_id
somehow. The subject of a historical security advisory the API was recently changed to prevent memory unsafety issues on all channels including nightly. The functionality, however, is still unstable, so we should stabilize it at some point!Original issue.
Reported by @seanmonstar to the security mailing list recently, it was discovered that the recent stabilization of
Error::type_id
in Rust 1.34.0 is actually not memory safe. Described in a recent security announcement the stabilization ofError::type_id
has been reverted for stable, beta, and master.This leaves us, however, with the question of what to do about this API?
Error::type_id
has been present since the inception of theError
trait, all the way back to 1.0.0. It's unstable, however, and is pretty rare as well to have a manual implementation of thetype_id
function. Despite this we would ideally still like a path to stability which includes safety at some point.This tracking issue is intended to serve as a location to discuss this issue and determine the best way forward to fully removing
Error::type_id
(so even nightly users are not affected by this memory safety issue) and having a stable mechanism for the functionality.