Closed jkotas closed 4 years ago
@mellinoe @nietras @mikedn @VSadov @KrzysFR @terrajobst
I propose changing the API surface to:
public static class Unsafe
{
    public static ref U As<T, U>(ref T source);
    public static ref T Add<T>(ref T source, int elementOffset);
    public static bool Equals<T>(ref T a, ref T b);
}
My initial thoughts are centered around two things:
As far as I can tell, these new API additions are there for convenience only, since they can be expressed via the existing unsafe API surface.
ref int r = ref Unsafe.AsRef<byte, int>(ref b[0]);
can be expressed via:
ref int r = Unsafe.AsRef<int>(Unsafe.AsPointer(ref b[0]));
Note how the existing API does not need to specify the source type byte.
ref int r1 = ref Unsafe.RefAdd(ref a[0], 1);
can be expressed via:
ref int r = Unsafe.AsRef<int>(Unsafe.AsPointer(ref a[0]) + Unsafe.SizeOf<int>() * 1);
Clearly, RefAdd is shorter, although the naming is somewhat unclear; see below.
Unsafe.RefEquals(ref a[0], ref a[0]);
can be expressed via:
Unsafe.AsPointer(ref a[0]) == Unsafe.AsPointer(ref a[0]);
Personally, I think these additions are worth making, as they make a lot of scenarios easier and allow using these operations outside an unsafe context, which I assume is a goal here. I would suggest different naming, though...
What struck me first is that Ref is superfluous: with ref we are in a "type safe" context, so there really is no need for Ref in the naming.
That is, the API can be expressed instead simply as:
public static class Unsafe
{
    public static ref U As<T, U>(ref T source);
    public static ref T Add<T>(ref T source, int elementOffset); // Note I changed offset to elementOffset to make it clear
    public static bool Equals<T>(ref T a, ref T b);
}
This will allow writing the following:
ref int r = ref Unsafe.AsRef<byte, int>(ref b[0]);
ref int a1 = ref Unsafe.RefAdd(ref a[0], 1);
var sameAddress = Unsafe.RefEquals(ref a[0], ref a[0]);
instead as:
ref int r = ref Unsafe.As<byte, int>(ref b[0]);
ref int a1 = ref Unsafe.Add(ref a[0], 1);
var sameAddress = Unsafe.Equals(ref a[0], ref a[0]);
Since ref keywords are littered all over, Ref in the method names is redundant in my view. As @mikedn commented, with RefAdd/Add it is not particularly clear whether the offset is in bytes or elements. Other names for Add could be Index, At, Offset, AddOffset, etc.
I think Add is probably best in terms of succinctness and intent, but the parameter should be named explicitly to clearly indicate that the offset is in elements.
As far as I can tell these new API additions are there for convenience only, since they can be expressed via the existing unsafe API surface.
Nope, they cannot. The equivalents that you show are incorrect because the use of AsPointer introduces an intermediary unmanaged pointer.
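To make the difference concrete, here is a minimal sketch of the ref-only form. It uses Unsafe.As with two type arguments, the name this thread eventually settles on (an assumption relative to this comment, which still discusses AsRef), as exposed by System.Runtime.CompilerServices.Unsafe. The forced GC.Collect() is for illustration only; it does not guarantee that the array is actually relocated.

```csharp
using System;
using System.Runtime.CompilerServices;

byte[] b = new byte[8];

// A managed ref into the array: the GC tracks it and updates it if the
// array is relocated, so no pinning is required.
ref int r = ref Unsafe.As<byte, int>(ref b[0]);

// With a void* obtained from Unsafe.AsPointer and no fixed statement, a
// compacting collection at this point could leave the pointer dangling.
GC.Collect();

r = 0x01020304;                               // writes through to b[0..3]
int roundTrip = BitConverter.ToInt32(b, 0);   // read back in the same native byte order
Console.WriteLine(roundTrip == 0x01020304);
```

The pointer-based equivalents are only safe while the array is pinned, which is exactly what the ref-based methods avoid.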
public static ref U As<T, U>(ref T source);
We already have a ref-returning method, AsRef. It seems to me that it would make sense for the new method to be called AsRef as well.
public static bool Equals<T>(ref T a, ref T b);
Seems confusing. I can imagine one asking "Does this method compare references or does it compare the referenced values?".
I think the biggest problem with this API is that AsRef requires specifying both the source and the destination types. Unfortunately we'll have to live with that, as there's no way to avoid it given the current language possibilities.
That said, I wonder if it wouldn't be better to reverse the generic arguments:
public static ref U AsRef<U, T>(ref T source);
in the hope that a future language version could do some sort of partial type inference so AsRef<int>(ref floatVar) is treated as AsRef<int, float>(ref floatVar).
Nope, they cannot. The equivalents that you show are incorrect because the use of AsPointer introduces an intermediary unmanaged pointer.
Ah yes, I made the incorrect assumption that things would be fixed (pinned), which they don't need to be. Never mind the necessity argument then.
We already have a ref returning method - AsRef. It seems to me that it would make sense that the new method is also called AsRef.
For me, the Ref part of the existing AsRef implies a change in "reference type", i.e. from pointer to ref, not that it is working on ref values. AsRef<T,U> does not change from a pointer to a ref or similar; it simply changes the type from T to U.
What if in the future generic pointers are allowed, but changing type of the pointer is not allowed via casting for example, would the following API addition make sense?
public static U* As<T, U>(T* source);
Would this then have to be called AsPointer? To me this seems wrong. To me As<T,U> is in a closer relationship to As<T> than AsRef<T>.
Seems confusing. I can imagine one asking "Does this method compare references or does it compare the referenced values?"
Yes, I agree with that. Not sure RefEquals is the best name then, though. Why not use the existing name found on object, i.e. ReferenceEquals, or would that also be confusing?
public static ref U AsRef<U, T>(ref T source);
This just seems counterintuitive to all other conventions in .NET. Perhaps, the compiler might as well infer it from the target e.g.
ref byte b = ref a[0];
ref int i = ref Unsafe.As(ref b); // Infer from assignment? Although it is almost too magical.
I suggested a fluent API for something like this when we first discussed the Unsafe API, but the consensus was this would be bad from a perf perspective, since it introduces an intermediate "closure" type (although it should be JIT'ed away).
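The other two IL operations on refs that I have seen used are:
IsOnStack - used mostly in asserts
RefSubtract - atomic subtraction between two refs, useful to get the distance between refs to elements of the same array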
I do not claim that these are good names, but functionality could be useful.
@vsadov good suggestions. Out of interest, would you elaborate on what IsOnStack (or just OnStack) is used or could be used for?
This just seems counterintuitive to all other conventions in .NET. Perhaps, the compiler might as well infer it from the target e.g.
Yeah, I suppose it doesn't make sense to do that in the hope that C# will ever do partial type inference like C++ does.
Why not use existing name found on object i.e. ReferenceEquals or would that also be confusing?
Hmm, I suppose it's fine. Whatever name suggests to the user that the references are compared and not the values :smile:.
IsOnStack - used mostly in asserts
To add to @nietras question: how would this be implemented?!
Thank you for the great feedback! I like the suggestions.
IsOnStack - used mostly in asserts
Byrefs on the stack are implicitly pinned, so it can be used to assert that it is safe to convert to a raw pointer without pinning (example from CoreCLR). Unfortunately, there is no way to implement it in a portable way. It would have to be runtime or platform specific, which is not pretty given the current shape of the library. If it keeps showing up as a needed API, it should be looked into as a separate issue.
RefSubtract - atomic subtraction between two refs useful to get distance between refs to elements of the same array
I agree that it is a useful operation to have. BTW: it is less useful with the current byref locals and returns than one may think, because of the single-assignment limitation of byref locals. I have noticed in my experiments that one tends to operate on indices and then convert to a byref as the last step and never go back; the style of the pointer math is different from unmanaged pointers.
It may also be nice to have a Subtract variant that takes elementOffset, for convenience and symmetry with Add.
So the updated proposal is:
public static class Unsafe
{
    public static ref U As<T, U>(ref T source);
    public static ref T Add<T>(ref T source, int elementOffset);
    public static ref T Subtract<T>(ref T source, int elementOffset);
    public static int Subtract<T>(ref T a, ref T b);
    public static bool ReferenceEquals<T>(ref T a, ref T b);
}
More suggestions for refinements are welcomed.
Including @jamesqo @KrzysztofCwalina that I forgot to include yesterday.
public static int Subtract<T>(ref T a, ref T b);
Since we're taking 64-bit platforms into account, the return type should probably be long instead.
public static ref U As<T, U>(ref T source);
This overload is bothering me a little. It doesn't read very smoothly in my brain; if I saw something like var y = Unsafe.As<Foo, Bar>(ref x) in code, I would go 'OK, so this is converting something to a Foo and... what's that second type parameter doing there?', when in fact it was converting to a ref Bar and the first type is what's being converted from. The fact that the existing As overload has only one type parameter only makes things more confusing; for example, if As<int> returns an int I would expect As<int, ...> to also return something related to an int.
I think we should instead name this ConvertRef, which reads better and makes it clearer we're dealing with refs, e.g.
Unsafe.ConvertRef<Foo, Bar>(ref x); // convert *from* ref Foo *to* ref Bar
It would also work smoothly if a future version of C# adds type inference based on return type assignment, e.g. as mentioned above
ref byte b = ref a[0];
ref int i = ref Unsafe.ConvertRef(ref b); // We converted a ref, and now we have a ref int
ref int i = ref Unsafe.As(ref b); // As... what? And why is the parameter passed as ref? :(
Your thoughts @nietras @mikedn?
return type should probably be long instead
If this is used on array elements, the 32-bit return type is sufficient because array indices can only be 32-bit integers in mainstream .NET runtimes. The problem only exists if somebody uses it for general pointer math on unmanaged pointers cast to refs, or on .NET runtime variants that allow arrays with more than 2^31 (about 2 billion) elements. A similar problem exists for Add as well: a 32-bit offset argument is not ideal for general pointer math on 64-bit platforms. Changing the offsets to a 64-bit type would lower performance on 32-bit platforms and make them harder to work with. Another option is to make the offsets native int (IntPtr in C#), but that would again make them harder to work with because one cannot do much with IntPtr in C# directly today.
I think the best way out is to document that these operations only work well on arrays with fewer than 2^31 elements.
Unsafe.ConvertRef
Convert suggests a significant change of representation in my mind, e.g. going from bool to string. I agree with @nietras' observation that it should be called As because it is just like the existing casting method. The need to specify an extra generic argument is unfortunate, but there is not much that can be done about it without a language change.
@jkotas Agree with your points on Subtract; it should probably be documented that it won't work well for pointers whose difference can't fit in an int. Users who really need to get a long difference can just convert both refs to byte*s and take the difference of those divided by Unsafe.SizeOf<T>.
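For what it is worth, a wide element distance can also be sketched without leaving managed refs. This assumes a version of System.Runtime.CompilerServices.Unsafe that exposes ByteOffset, which is not part of the proposal discussed in this thread; ElementDistance is a hypothetical helper name used only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

int[] a = new int[16];

long forward = ElementDistance(ref a[2], ref a[10]);    // target is 8 elements after origin
long backward = ElementDistance(ref a[10], ref a[2]);   // target is 8 elements before origin
Console.WriteLine($"{forward} {backward}");

// Hypothetical helper: distance from origin to target, measured in elements.
// The byte distance is widened to long before dividing by the element size.
static long ElementDistance<T>(ref T origin, ref T target)
{
    long byteDistance = (long)Unsafe.ByteOffset(ref origin, ref target);
    return byteDistance / Unsafe.SizeOf<T>();
}
```

The result is only meaningful when both refs point into the same array or memory block.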
I agree with @nietras' observation that it should be called As because it is just like the existing casting method.
Unfortunate that we didn't name the existing overload Cast instead of As earlier; then we could have CastRef and CastPointer with no confusion between the AsRef / AsPointer methods. 😞
If we are to keep it as As, I would at least suggest switching the type parameters. I know @nietras said earlier it was 'counterintuitive to all other conventions in .NET', but every existing method in the Unsafe class takes the destination parameter before the source parameter. I think we should keep it that way for symmetry.
Also regarding the other Add and Subtract overloads, maybe they should accept uint instead of a regular int? This will prevent people from writing redundant code like Add(ref source, -6) or (God forbid) Subtract(ref source, -10). Or perhaps the other Subtract could be omitted entirely, since Add with int is enough to express up to 2 ^ 31 elements in both directions (barring Subtract(ref source, int.MinValue), but I don't think that's going to be a common case).
perhaps the other Subtract could be omitted entirely
I think ref T Subtract<T>(ref T source, int elementOffset) should be omitted too; it seems redundant. Instead we could consider adding a second overload for Add which takes IntPtr. This still leaves Subtract(ref, ref) without a method suitable for 64-bit. I do not think that is the biggest problem with Subtract(ref, ref), though. Does it return an element offset or a byte offset? There is no way to tell from the method signature. I believe it should be analogous to Add and return an element offset; will that be the case?
I agree overall that the common use case is probably gonna be with 32-bit offsets and that these will be signed integers.
ref TTo As<TFrom, TTo>(ref TFrom source) vs ref TTo Convert<TFrom, TTo>(ref TFrom source)
@jamesqo I do agree that the readability of As in this case is not ideal. Perhaps there is another way, though. I remember we had the same discussion for pointers initially, that is, we discussed something like TTo* Cast<TFrom, TTo>(TFrom* source), but this was quickly resolved since implicit conversions from T* to void* are allowed.
Should this not be allowed for refs as well? Can ref T not be implicitly converted to ref void? As far as I understand, C# allows implicit conversions that "lose" type information, that is from class T to object and from T* to void* etc. Would it then not make sense to allow ref T to ref void? That is, it should be possible to write:
ref int i = ref a[3];
ref void v = ref i; // Implicit conversion OK
I do not know if there is any reason for ref void not being allowed at all currently (compilation fails in C# with CS1536 "Invalid parameter type void", C++/CLI shows an "a reference to void not allowed" IntelliSense warning), other than that it makes sense since we cannot assign a value to it. However, with ref locals and returns, would ref void not make sense? Of course, it cannot be used for anything as such other than to hold an address, but is that not a valid purpose?
If ref void were allowed, we could define the conversion simply as:
ref T As<T>(ref void source)
And be able to write:
ref int i = ref a[3];
ref byte b = ref Unsafe.As<byte>(ref i);
I do understand this requires language design changes not only for C# but perhaps also for C++/CLI, VB.NET and F#, but I assume this is needed in some way for ref locals and returns anyway. So is allowing ref void a possibility? Are there any problems with this that I have not thought of?
Would it then make sense to allow something like:
void* p = ....;
ref void v = ref Unsafe.AsRef<void>(p);
Would it then not make sense to allow ref T to ref void?
I think it would make sense. And you can actually create a ref void in IL, but you won't be able to call the method from C# exactly because it can't convert from, say, ref int to ref void.
The funny thing about this is that if the language allows such a conversion then it should probably allow many other conversions between ref types, and that may make this particular As overload mostly useless. Granted, such conversions would be explicit and that would require additional design and implementation work.
ref void v = Unsafe.AsRef<void>(p);
That wouldn't work unless the language is changed to allow void to be used as type argument. I think there were some discussions about that but nothing happened.
allow void to be used as type argument
Yes, it is a problem in making C# more functional. void as a proper type is very useful, e.g. as a return type.
may make this particular As overload mostly useless
Not sure this would be true; with pointers there is still the issue that generic casts are not allowed. I would assume this would be the same for possible ref conversions. That is, an explicit one like (ref byte)ref i where ref int i will work, for example, but in a generic context you cannot write (ref T)ref i or the equivalent for pointers (T*)p. This is why Unsafe is so useful: we can circumvent the restrictions imposed by C# (restrictions that are there for a good reason, usually ;))
Although this is mainly due to C# not supporting generic pointers; if generic refs are supported and casting between them is allowed in generic code, then yes, As would be useless, I think.
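As an illustration of the generic case, here is a minimal sketch using the two-type-argument Unsafe.As from System.Runtime.CompilerServices.Unsafe. AsInt32 is a hypothetical helper, not something proposed in this thread; the size check is an extra guard added for the example.

```csharp
using System;
using System.Runtime.CompilerServices;

float f = 1.0f;
int floatBits = AsInt32(ref f);   // 0x3F800000, the IEEE 754 encoding of 1.0f

uint u = uint.MaxValue;
int uintBits = AsInt32(ref u);    // -1: the same 32 bits read as a signed int

Console.WriteLine($"0x{floatBits:X8} {uintBits}");

// Hypothetical helper: views any 4-byte struct as an int. C# has no cast
// syntax for this in generic code, so Unsafe.As does the reinterpretation.
static ref int AsInt32<T>(ref T value) where T : struct
{
    if (Unsafe.SizeOf<T>() != sizeof(int))
        throw new ArgumentException("T must be exactly 4 bytes wide.");
    return ref Unsafe.As<T, int>(ref value);
}
```

No bits are converted here; the same storage location is simply read through a differently typed ref.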
but in a generic context you cannot write (ref T)ref i or the equivalent for pointers (T*)p
I don't see any reason why (ref T)ref i would not work in a generic context. Unlike (T)i, ref conversions are no-ops, so they don't suffer from the usual generic conversion problems. That said, such conversions are inherently unsafe and would probably require an unsafe context. Probably it's best to leave them out of the language because of that.
At least conversions to ref void are safe. You can't do anything with the resulting reference except converting it back to some ref X, and that needs to be done via Unsafe.As.
@nietras @mikedn Maybe we could go back to the earlier suggestion using a fluent API? It could look something like this (translated to IL of course):
public static Interpreter<T> Interpret<T>(ref T source) => new Interpreter<T>(ref source);
public struct Interpreter<T>
{
    private IntPtr _ptr;
    public Interpreter(ref T source)
    {
        _ptr = (IntPtr)source;
    }
    public ref U As<U>() => (ref U)_ptr;
}
That way we could use it like
ref int i = ref Unsafe.Interpret(ref b).As<int>();
@nietras mentioned earlier this could be 'bad from a perf perspective', but since the JIT should basically eliminate these copies in Release mode I don't see why not (even if it's a little more IL to write). It's able to do partial type inference and looks much better than the other prototype, which requires you to specify both types.
Maybe we could go back to the earlier suggestion using a fluent API?
Ha ha, no thanks. I think I'll start hating fluent APIs with a passion. They have their uses but these days they're more like abuses.
@mikedn OK then, I'm going back to my earlier position about switching the type parameters. :)
Regarding the ref void / builtin ref cast discussion, I agree with you it's probably unlikely that those features will be added anytime soon; ref is used all the time for 'normal', non-unsafe code, e.g. Array.Resize and Monitor.Exit. Type safety would likely be more of a concern in that area.
Unsafe.Interpret(ref b).As<int>();
The implementation you have suggested would not even work. It has a GC hole because of the intermediate unmanaged pointer.
switching the type parameters
The prevalent order in the .NET APIs is "source, destination". The Copy methods on the Unsafe class intentionally violate it to be in sync with the low-level order used in C and IL (discussion in dotnet/corefx#7966). It is hard to come up with the "right" answer.
The different order between .NET and C is an endless source of mistakes when one is using both. E.g. @omariom @GSPP just fell into this trap in https://github.com/dotnet/coreclr/issues/6541.
@KrzysztofCwalina
btw, CoreFxLab's Span has a Cast method. It is basically As but for slices.
I think a single name should be selected for both because it is just scalar vs sequence.
imo, As better expresses "the same location, different interpretation".
public static bool ReferenceEquals<T>(ref T a, ref T b);
string s1 = "foo";
string s2 = s1;
Unsafe.ReferenceEquals(ref s1, ref s2);
// vs
object.ReferenceEquals(s1, s2);
It confuses me. refs are not references.
May be better to keep RefEquals? It aligns well with AsRef.
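The distinction can be made concrete with a small sketch. It uses Unsafe.AreSame, the name that the thread settles on later (so it is an assumption relative to this comment), from System.Runtime.CompilerServices.Unsafe.

```csharp
using System;
using System.Runtime.CompilerServices;

string s1 = "foo";
string s2 = s1;

// Compares the objects the two variables refer to: one and the same string.
bool sameObject = object.ReferenceEquals(s1, s2);

// Compares the locations of the variables themselves: two distinct locals.
bool sameVariable = Unsafe.AreSame(ref s1, ref s2);

// A variable trivially shares its location with itself.
bool sameAsItself = Unsafe.AreSame(ref s1, ref s1);

Console.WriteLine($"{sameObject} {sameVariable} {sameAsItself}");
```

The two checks answer different questions, which is why reusing the ReferenceEquals name for the ref comparison would be misleading.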
Regarding int elementOffset, I think IntPtr is appropriate to be forward compatible with future runtimes. Somewhere in the next 10 years we will likely need mainstream support for large arrays, since big-memory scenarios are gradually becoming more common.
IntPtr is the "correct" type anyway. For such low-level code that's probably alright from a convenience standpoint (lowered convenience is OK).
I don't think there should be int overloads at all. That gets awkward because there is no overloading on return type.
I'd split Subtract into GetElementDifference and GetByteDifference. That clarifies the meaning and provides a new, useful method as well.
ReferenceEquals should not have plural Equals. It should be ReferencesEqual. I'd call it AreReferencesEqual.
As is a very unspecific name. It also collides with the C# keyword as, which means something else entirely. I'd call it UncheckedCast or Cast or ConvertReference.
@jkotas
It has GC hole because of intermediate unmanaged pointer.
Ah, you seem to be right. I don't know if IL allows you to store a T& as a field...
The Copy methods on unsafe class are intentionally violating it to be in sync with the low-level order used in C and IL (discussion in dotnet/corefx#7966). Hard to come up with the "right" answer.
Actually, in the end I don't think the order of the type parameters really matters; when people start typing the method into VSCode / Visual Studio, they should see the name of the type parameter pop up (e.g. TSource, TDestination), which should prevent confusion. Even if they still slip up, the compiler will catch it for them (unlike the methods w/ regular parameters), since you can't implicitly convert a ref byte, say, to a ref int. I think whatever decision is made should be the best one for readability, and As<TDestination, TSource> seems to be better in that regard (assuming we don't change it to Cast / CastRef).
@omariom
imo, As better expresses "the same location , different interpretation".
If someone wrote
enumerable.As<int>();
I would think that they were somehow converting an IEnumerable to an int, whereas if someone wrote
enumerable.Cast<int>();
I think I'd better understand each element of the enumerable was being cast, although maybe that's just because it's the API we have today.
It confuses me. refs are not references.
That's actually a good point; I agree too that calling it RefEquals may be a good idea.
May be better to keep RefEquals? It aligns well with AsRef.
AsRef was shelved earlier since there's an existing overload AsRef that converts from pointer to ref. I'm still kinda hoping that the new method can be named something like CastRef rather than As though... :confused:
@GSPP
I think IntPtr is appropriate to be forward compatible with future runtimes.
I'm not too sure about the idea of returning/accepting an IntPtr. First of all, as @jkotas mentioned, there's not a lot you can do with it (e.g. multiply and divide are missing); it also seems to imply that it points to a valid memory location when in fact it's just a number. Plus, the pointer size is not always guaranteed to be the same as the pointer difference size for a given platform; C for example differentiates between size_t, ptrdiff_t, intptr_t, etc. and there are real cases where they differ.
I'd split Subtract into GetElementDifference and GetByteDifference.
Redundant, you can just do this for the byte difference:
var byteDifference = Unsafe.Subtract(ref a, ref b) * Unsafe.SizeOf<T>();
Guaranteed to not overflow (I believe) since the maximum number of bytes 2 pointers can be apart is 2^63 / 64 or somewhere around there. I also think the redundant div (in Subtract)/mul (in SizeOf) should be eliminated by the JIT. If it's not, it should be.
Maybe if the method names were a little shorter then it would be OK... SubtractElements, SubtractBytes maybe? Seems kinda verbose though.
@nietras Regarding Subtract, even though it's redundant (I said so myself earlier) I still think it may be worth including. Unsafe.Subtract(ref a, 6) is more readable than Unsafe.Add(ref a, -6), and C# doesn't force you to write ptr + -6. Plus, we already have another overload of Subtract.
Lots of great input and food for thought. Naming is hard and my comment got pretty long again :|
As, Convert, Cast etc. naming
All these have existing meanings in .NET; none are without prior baggage.
As - already in use via as. A "referential" cast if possible, or returns null. Only reference types are supported.
Convert - used throughout .NET to indicate conversion from one type to another, in most cases (in my view) with copying of state, e.g. TypeConverter.ConvertTo, BitConverter etc.
Cast - or () used as both referential cast (e.g. (string)obj) and value casting (e.g. (int)float); in all cases a cast is checked and will fail if not appropriate.
I lack a proper terminology definition for .NET for talking about these in a consistent manner (does any exist?), so I hope it is clear from context. Other possible wordings could be Reinterpret, Change, and of course one can add a suffix, prefix or other "fix"es to these, e.g. Ref, Reference, Unchecked, all of which seem superfluous or redundant given the context Unsafe and how code would look; as I previously mentioned, code will be littered with refs.
For the first version of Unsafe we chose As. In my opinion this was because As for Unsafe is closest in relationship to the as keyword (we unsafely reinterpret a pointer, object, or ref as something else) and because of its brevity and readability, see below. All of these read naturally and are succinct: we are reinterpreting a value or object as a different type; we are not doing actual type conversions or value castings.
T As<T>(object o)
void* AsPointer<T>(ref T value)
ref T AsRef<T>(void* source)
That is why I still do not like Convert, although it admittedly reads better, i.e.
ref byte b = ...;
ref int i = ref Unsafe.As<byte, int>(ref b);
ref int i = ref Unsafe.Convert<byte, int>(ref b);
Yes, Convert reads more naturally, but is it converting the value as well? That leads to adding Ref to make it clearer, although there are refs all over. So it gets long.
Cast has the same issues. @jamesqo even gives an example of it with enumerable.Cast<int>(), which does value casting, not referential casting, so Cast has too much baggage for me in C#.
I am sure there are inconsistencies in my argumentation here :) However, I would much rather stick with the existing verb As, which has a clearer meaning in my view, even though it is not perfect ;)
I would then define the method as (if ref void is not possible):
ref TTo As<TFrom, TTo>(ref TFrom source)
which leads to the other suggestions for type parameter names and order. @jamesqo suggested TSource, TDestination; good suggestions, but they are too long in my view. TFrom, TTo are better just due to brevity.
For the order of parameters I think one has to look to Func<>, a type used all over .NET and by most people. This has TResult last. This alone is reason enough for me to have it last for As as well, since I think anything else would be counterintuitive for new users. I do agree, though, that we definitely need the type parameters to be explicitly worded, as T, U does not give enough meaning.
Wouldn't it be pertinent to ask the Roslyn team what their thoughts are on ref void? They surely have thought about this and it seems like a good addition.
ReferenceEquals, RefEquals, AreRefsEqual, AreEqual etc.
I agree ReferenceEquals is perhaps not the best choice anyway. As @omariom pointed out, "refs are not references", which is pretty much to the point and is why I miss a properly defined terminology. I could live with AreEqual since it is short, and with ref in the code and under Unsafe the usage should be clear, but a good alternative is:
bool AreRefsEqual<T>(ref T a, ref T b)
Subtract, Offset, Difference, Distance, Index for ref to ref etc.
I think all "iterator" operations should return element offsets (I can't help feeling that we are pretty much implementing C++ iterator behaviour for refs, so perhaps inspiration can be found there? std uses distance as indicative of the number of elements between first and last). If a byte offset is needed, use SizeOf.
I think Offset works better than Subtract, see next section.
Whether the offset should be IntPtr or not I am not sure, but it is definitely a problem that IntPtr does not support arithmetic. A sore point for C#: there is no "native" integer type... an oversight in my view.
Subtract(ref, int)
I can live with adding this as well as Add, but this would give stronger support for not naming Subtract(ref, ref) well... Subtract, but instead DistanceTo, Offset or similar. Currently, I prefer Offset as it is short, and we constantly keep saying element offset when talking about add and subtract, so should the offset between two refs not be found with the Offset method? Alternatively OffsetBetween.
Maybe just add * and / to IntPtr? I think IntPtr should behave like any other integer type as much as possible. I saw proposed C# language changes about that on the Roslyn Github presence.
The pointer difference representation can safely be IntPtr on all platforms. The CLR can just promise to make that work. I see no issues with that.
public static ref U AsRef<T, U>(ref T source);
This is the same format used for the Vector reinterpretation methods:
public static Vector<Byte> AsVectorByte<T>(Vector<T> value) where T : struct
public static Vector<Single> AsVectorSingle<T>(Vector<T> value) where T : struct
Maybe just add * and / to IntPtr? I think IntPtr should behave like any other integer type as much as possible.
https://github.com/dotnet/corefx/issues/10457 Operators should be exposed for System.IntPtr and System.UIntPtr
@jkotas considering that stack always grows towards the heap and on the vast majority of current systems stack grows downwards, I think IsOnStack could be pretty portable.
IsOnStack could just take a ref of a dummy local and pointer-compare with the given ref. If the given ref points to a higher location, its referent cannot be on the heap.
We can ignore cases where the given ref points to a stack frame of a different thread or to a dead frame in the current stack, or to a kernel mode segment. Having such refs is a bug by itself and IsOnStack would not make much sense in those scenarios.
Note that with two dummy locals in two frames the direction of stack can be detected dynamically and the whole thing could be made insensitive to the "downwards" part and would only require that stack continuously grows towards the heap. However, I think it is an overkill and "downwards" is a safe assumption.
considering that stack always grows towards the heap
Stacks of different threads can be interleaved with GC heap segments. It is actually pretty common to have this situation in large workloads (on Windows at least).
Stacks of different threads can be interleaved with GC heap segment
I did not know about this. I always assumed that OS allocates stack segments in the higher addresses separately from heaps. I think I might have seen code in the past that relies on such assumptions. Interesting...
Here is the updated proposal with feedback incorporated:
public static class Unsafe
{
    public static ref TTo As<TFrom,TTo>(ref TFrom source);
    public static ref T Add<T>(ref T source, int elementOffset);
    public static ref T Subtract<T>(ref T source, int elementOffset);
    public static bool AreSame<T>(ref T a, ref T b);
}
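A short usage sketch of this shape, assuming it is available as System.Runtime.CompilerServices.Unsafe with these signatures; the array and variable names are illustrative only.

```csharp
using System;
using System.Runtime.CompilerServices;

int[] a = { 10, 20, 30, 40 };

// Add and Subtract take offsets in elements, not bytes.
ref int third = ref Unsafe.Add(ref a[0], 2);
ref int second = ref Unsafe.Subtract(ref third, 1);

// AreSame compares the locations, not the values stored there.
bool sameLocation = Unsafe.AreSame(ref second, ref a[1]);
bool differentLocation = Unsafe.AreSame(ref a[0], ref a[1]);

// As reinterprets the location; the stored bits are not converted.
float f = 1.0f;
ref int bits = ref Unsafe.As<float, int>(ref f);

// prints "30 20 True False 0x3F800000"
Console.WriteLine($"{third} {second} {sameLocation} {differentLocation} 0x{bits:X8}");
```

None of this needs an unsafe context or pinning, since every intermediate value is a managed ref.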
Kept the <TFrom,TTo> order for As, as it is preferred by more people. Renamed the generic arguments for clarity.
int overloads for Add/Subtract are needed to make this reasonably usable today. IntPtr overloads for Add/Subtract can be added later without any harm if/once native int becomes better supported in C#.
Kept Subtract(ref,int) for convenience, even though it is redundant.
Dropped Subtract(ref,ref) for now because it is not very useful with byref locals and returns anyway, and there are naming and design issues around it.
Renamed ReferenceEquals to AreSame. I have checked about a good name with @KrzysztofCwalina and he suggested this name. It is being used for similar concepts in other places and I like it the most out of all the options discussed.
@nietras
Wouldn't it be pertinent to ask the Roslyn team what their thoughts are on ref void? They surely have thought about this and it seems like a good addition.
You can probably ask, but I am 99% sure the answer will be no; I don't think they're going to be very keen on adding further unsafe features to C#, e.g. they chose ref returns over generic pointers. Besides this contrived use case, what other uses could ref void possibly have in safe code? void* is only useful (mostly) since we don't have generic pointers.
ReferenceEquals, RefEquals, AreRefsEqual, AreEqual etc.
I think RefEquals may be best here. If Object has ReferenceEquals, we should have RefEquals. If Object had AreReferencesEqual, we should have AreRefsEqual.
Subtract, Offset, Difference, Distance, Index for ref to ref etc.
I think Difference is best. C# allows you to subtract a pointer from a pointer, as well as subtract an integer from a pointer. The name should be closely related to subtraction.
It would also avoid confusion, since the parameter order of other methods in the class is dest before src. Therefore, with a name like Offset
Was writing this post when I saw @jkotas' update.... :) Everything in the updated proposal looks good.
@jkotas I really like AreSame. Reminds me of xUnit's Assert.Same... so :+1: from me.
To avoid specifying two type parameters in As, it might be possible to use the Cast.From trick.
public class Unsafe
{
public static class As<TTo>
{
public static ref TTo From<TFrom>(ref TFrom source)
{
//magic
}
}
. . .
}
int x = 1;
ref uint y = ref Unsafe.As<uint>.From(ref x); //TFrom is inferred from the arg
@VSadov This appears to be giving a compiler error: http://tryroslyn.azurewebsites.net/#K4Zwlgdg5gBAygTxAFwKYFsDcAoADsAIwBswBjGUogQxBBgGEYBvbGNmfYsmANwHswAExgBBEAB4AKgD4AFHwIArVKWQwAZnz4BKZq3YBffW2MdCJcpRp0xU6aZbsYRg0A==
@nietras 'ref' in C# does not apply to the type, it applies to the signature of the method. ref int Foo() is still considered as having a type of 'int'; instead of returning the value of some variable, it returns an alias of the variable itself.
This is observable in type-specific scenarios such as overload resolution or type inference. If you have overloaded methods Test(int) and Test(char), you can do Test(Foo()) and the method that takes int will be called. That is because the return type of Foo is int and for all purposes it works as a method that returns an int. The 'ref' part just makes the result a variable/LValue, so you can do some extra stuff with it, like passing it by reference or assigning to it.
From that perspective, 'ref void M()' would not make much sense. A void method does not return anything. And 'ref void' does that by reference?
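To make the overload-resolution point concrete, here is a small sketch. Foo, Test(int) and Test(char) follow the names used above; the backing field, the wrapper class and the Resolve/AssignThroughRef helpers are assumptions added only for this illustration.

```csharp
using System;

// Hypothetical wrapper class for the Foo/Test example described above.
public static class RefReturnOverloads
{
    private static int _value = 42;

    // Returns an alias to _value; the type used for overload resolution is still int.
    public static ref int Foo() => ref _value;

    public static string Test(int x) => "int";
    public static string Test(char x) => "char";

    // Overload resolution sees an int argument and picks Test(int).
    public static string Resolve() => Test(Foo());

    // Because the result is a variable, it can be assigned to directly.
    public static int AssignThroughRef()
    {
        Foo() = 7;
        return _value;
    }
}
```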
class Program
{
static void Main(string[] args)
{
int x = 1;
ref uint y = ref Unsafe.As<uint>.From(ref x);
}
}
public class Unsafe
{
public static class As<TTo>
{
public static ref TTo From<TFrom>(ref TFrom source)
{
//dummy implementation
return ref (new TTo[1])[0];
}
}
}
@VSadov There is already an existing method Unsafe.As<T> that converts from an object to a T.
@jamesqo - I did not realize that Unsafe is an already existing class and has something in it. Anyways, it is just a suggestion. It could be implemented with a different name or not at all. The updated proposal seems good enough actually.
Not having to specify two types in 'As' would have mostly aesthetic value. In reality you are still supplying two type arguments; by splitting them between the type and the method, you let the compiler infer TFrom from the argument.
@nietras Since it's possible to elide specifying both types if we have an inner class, do you still stick to your earlier position of using As?
AreSame
I really like this too, much better. MSTest uses it as well. In fact object.ReferenceEquals should have been named AreSame too. :+1:
do you still stick to your earlier position of using As?
@jamesqo good question. I did think a bit about it before, but didn't think there was precedent for doing something like that. I would specify it as:
public class Unsafe
{
public static class To<TTo>
{
public static ref TTo From<TFrom>(ref TFrom source) { ... }
}
}
The question then is, whether:
int x = 1;
ref uint y = ref Unsafe.To<uint>.From(ref x);
is better than:
int x = 1;
ref uint y = ref Unsafe.As<int, uint>(ref x);
? In this case, we don't really save much regarding typing. To/From does read better and has the benefit of not having to be explicit about the TFrom type, which makes it more flexible. However, I also like that all reinterpretations start with As, since this makes discovery easier. I could live with both, but am not sure I prefer one over the other. :neutral_face:
Void method does not return anything. And ref void does that by reference?
@VSadov couldn't the same argument be made with void*? We return a void by pointer?
void* Foo(int*)
This seems contrary to the logic that void should mean nothing; in this case it rather means "typeless", while it should really mean of type void, as in F#. ref void is then just a typeless ref. Not sure I follow completely: "ref int Foo() is still considered as having a type of int" is hard to grasp for me. Is this just for overload resolution or other specific scenarios? Does this exclude ref void as an input parameter? I understand that return void is treated as having no return, but should ref void really be treated the same way? Isn't ref void closer to void* than void itself?
I would say a ref void method primarily returns a ref, but with no type or with type void. I probably do not understand all the issues here...
int x = 1;
ref uint y = ref Unsafe.To<uint>.From(ref x);
Doesn't suggest reinterpret "in-place", but transfer and change.
void* is a pointer to an unknown type. You can't do pointer arithmetic on it due to its unknown size, and it must be cast to a type before dereferencing (C++).
void* ptr;
void* ptr2 = ptr + 1; // nope
void* ptr3 = (void*)((char*)ptr + 1); // ok
void thing1 = *ptr; // nope
auto thing2 = *ptr; // nope
auto thing3 = *(char*)ptr; // ok
Is what you are asking the ability to cast a ref to a ptr? That is already an operator in C#: &
int x = 1;
ref uint y = ref Unsafe.As<int, uint>(ref x);
uint* pY = &y; // cast to pointer
void* pV = (void*)&y; // cast to void pointer
I like this proposal. One small question: is there anything actually unsafe about AreSame<T>(ref T, ref T)? Should we consider putting it somewhere less scary?
ValueType.ReferencesEqual<T>(ref T, ref T) :trollface:
Doesn't suggest reinterpret "in-place", but transfer and change.
That is true. I guess it could be improved by calling it ToRef and perhaps even FromRef, but then it gets even more verbose than As.
Is what you are asking the ability to cast a ref to a ptr?
No, and it wouldn't support the scenarios that ref can either, since generic pointers are not supported. In addition, converting to pointers may cause a GC hole, since pointers are not "tracked" by the GC, but refs are, as far as I understand.
anything actually unsafe about AreSame<T>(ref T, ref T)
How would you write it using normal C# code? Although, that of course is not the same as it needing to be unsafe...
Omitting Subtract(ref,ref) for now because it is not very useful with byref locals and returns anyway, and there are naming and design issues around it.
@jkotas Been thinking about this. Can't ref be used for both managed and unmanaged memory? How is that handled in regards to the GC?
I think the parameters to AreSame should be called "left" and "right". This is to mimic conventions we use for operator== overloads.
How would you write it using normal C# code?
For ints
unsafe bool AreSame(ref int left, ref int right)
{
fixed (int* pLeft = &left)
fixed (int* pRight = &right)
{
return pLeft == pRight;
}
}
Can't really do it for generic types?
Though it might be more of a normal thing to test (than an unsafe one); e.g. if you were given two structs from an array, you might want to check if they are the same one.
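As a sketch of that "two structs from an array" case for generic types, the following assumes an AreSame<T>(ref T, ref T) that behaves like the one in System.Runtime.CompilerServices.Unsafe. The Point struct and the AliasCheck class are hypothetical names used only here.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical value type for the example.
public struct Point
{
    public int X;
    public int Y;
}

public static class AliasCheck
{
    // True only when both refs denote the same storage location.
    // Two different elements with equal field values still yield false.
    public static bool IsSameElement<T>(ref T left, ref T right)
        => Unsafe.AreSame(ref left, ref right);
}
```

No `unsafe` block or `fixed` statement is needed, and it works for any T.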
Can refs be zero (null)?
Can't ref be used for both managed and unmanaged memory? How is that handled in regards to the GC?
Yes, refs can be used for both managed and unmanaged memory. The GC does not touch them if they point to unmanaged memory. Basically, the algorithm for refs during the GC root scanning is: if (does pointer point into GC heap) { track the pointer }. It is also the reason why it is problematic to have refs stored as fields of GC heap allocated objects: the "does pointer point into GC heap" check is an expensive operation. Stacks are relatively small, so having refs on the stack only is acceptable.
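The check described above can be pictured with a toy model. This is only a sketch of the idea, not how the runtime actually implements it: the Segment type, the address values and ShouldTrack are all hypothetical, and real GC bookkeeping is far more involved.

```csharp
using System;
using System.Collections.Generic;

// Toy model only; every name here is an assumption made for illustration.
public static class ByRefScanModel
{
    // A heap segment as a half-open address range [Start, End).
    public struct Segment
    {
        public ulong Start;
        public ulong End;

        public Segment(ulong start, ulong end)
        {
            Start = start;
            End = end;
        }

        public bool Contains(ulong address) => address >= Start && address < End;
    }

    // "if (does pointer point into GC heap) { track the pointer }"
    // Segments may be interleaved with stacks and native memory,
    // so each one has to be considered rather than a single range.
    public static bool ShouldTrack(ulong address, IReadOnlyList<Segment> heapSegments)
    {
        foreach (Segment segment in heapSegments)
        {
            if (segment.Contains(address))
            {
                return true;
            }
        }
        return false;
    }
}
```

Even in this toy form, the cost grows with the number of segments, which hints at why doing the check for every ref stored in a heap object would be expensive.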
parameters to AreSame should be called "left" and "right"
Fixed.
Can refs be zero?
Yes.
In unsafe context only?
Would be interesting to have Unsafe.IsNullRef(ref valueRef).
Is it too much? )
Update: Maybe it is just easier to use pointers then.
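For what it is worth, a helper of this shape did eventually ship: as far as I know, .NET 5 and later include Unsafe.IsNullRef<T> and Unsafe.NullRef<T> in System.Runtime.CompilerServices. A minimal sketch, assuming that runtime; the NullRefSketch class and its method names are invented for the example.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class; requires a runtime that ships Unsafe.NullRef/IsNullRef.
public static class NullRefSketch
{
    // A ref produced by NullRef<T>() is reported as null.
    public static bool NullRefIsNull()
    {
        ref int nothing = ref Unsafe.NullRef<int>();
        return Unsafe.IsNullRef(ref nothing);
    }

    // A ref to an ordinary local is never null.
    public static bool LocalRefIsNull()
    {
        int local = 0;
        return Unsafe.IsNullRef(ref local);
    }
}
```

Reading or writing through a null ref still faults, so the check has to come before any access.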
Roslyn is adding support for ref returns and locals (https://github.com/dotnet/roslyn/issues/118). S.R.CS.Unsafe should provide operations that allow taking advantage of ref returns and locals in unsafe code.
Edit: Updated with the revised proposal