Closed Sergio0694 closed 2 years ago
We would prefer this to be exposed as ref
fields in C# language: https://github.com/dotnet/csharplang/issues/1147 . This would be even faster, leaner, and less error prone than ByReference<T>
.
Hi @jkotas - thanks for linking that issue, I had missed that! π
Is there an ETA for that though? By the looks of it, and the fact it has been opened for over two years already, it looks like it still has a way to go, plus all the additional difficulties caused by the fact that that would require at least some Roslyn updates to support that, if not some CLR changes too.
My rationale here was that making the ByReference<T>
type public wouldn't allow new scenarios not achievable today, but simply make them more convenient for devs that were interested in using them (with the assumption that those devs would already be aware of all the various implications of using unsafe APIs in that case). Even if a ref T
field feature was to be implemented one day, wouldn't this still be a nice and easy to implement stopgap for the time being? Again, given that the same exact feature can still be achieved today anyway, just less efficiently and with more room for mistakes (and the additional verbosity of having to reimplement that fake ByReference<T>
type).
We are here for the long haul and want to do things properly. We try hard to avoid exposing temporary workarounds that we are stuck forever even once the more appropriate solution is available.
Sure thing, that's fair enough π Also, just to be clear, it wasn't my intention to suggest this change as a quick and sloppy solution instead of doing things properly, as I said my rationale was more of an "iterative" improvement for a scenario that's technically supported already, but with an even worse implementation. But I can see how it'd be preferable to jump straight to the optimal and built-in solution, absolutely π
Is there any approximate ETA on that proposal? I see the issue has been open for over two years already and it's not currently assigned to any milestone, not even X.0
or X.X
. Is that a feature that's still being evaluated and not yet decided on, or is it something the various teams have already decided to implement, just not with an exact timeframe for it?
Thanks again for your time! π
It did not make it into the list of C# features we are looking at for .NET 5.
Wouldn't exposing this type make it possible to create ref
fields in languages other than C#?
To me that sounds like something worth considering.
This type as it exists today is not safe construct. It is very easy to create dangling pointers with it, without guardrails in the language. If we were to expose it, we would likely mark it with the special ObsoleteAttribute to block its use in languages that are not aware of it (e.g. similar to how ref
structs are blocked).
On a related note, Jan, you and I previously talked about a syntax that would allow operating with gcrefs as naturally as operating with pointers. Things that would allow natural syntax like the + operator without bouncing through Unsafe.Add
. Would something like that ever make sense to add as a language feature, or as a ByRef-like type, or as something else?
Hey @jkotas - I have a follow up question on this, specifically regarding the general criteria being used to expose APIs or add explicit support for "unsafe" operations. You mentioned how with the ByReference<T>
type is't possible to create dangling pointers, which is certainly true. My question though is about the seemingly arbitrary decision not to make the type public, despite the BCL having a ton of potentially dangerous APIs already. Consider this snippet:
public static Span<T> GetFaultySpan<T>()
{
T value = default;
return MemoryMarshal.CreateSpan(ref value, 1);
}
This compiles fine, but results in a dangling pointer just like the one you mentioned. It's not perfectly clear to me why is this API public, but ByReference<T>
is not. Don't get me wrong, this API is super useful (and I love the fact that it's available!), but I don't get why the same rationale doesn't allow to ByReference<T>
as well: "here's a potentially dangerous but useful tool, use at your own risk!".
Or, just like the BCL already includes dedicated code to properly handle devs doing stuff that's technically not valid (eg. Memory<T>
wrapping strings as mutable), then why not just use the same rationale here as well, and provide a tool that enables devs to do exactly the same thing they can already do with MemoryMarshal.CreateSpan
, but just in a more efficient and less error prone way?
I just can't stress enough how useful the ByReference<T>
type would be, and with the native support for ref T
fields not making the cut for .NET 5 I'd say it'll be at least another year (if not more) until that eventually makes it to a public release? As you said, you could also mark it as [Obsolete]
just like Span<T>
, to avoid issues with other .NET languages.
Just though I'd share my thoughts on this, thank you in advance for your time! π
P.S. Just so I'm perfectly clear, I'm not criticizing your work in any way, the .NET team is doing an incredible job and I really love where the whole platform is going! Just trying to make my case here as to why I think making ByReference<T>
public could be a win-win for everyone. π
@jaredpar had some reasons on why this wasn't safe from the language perspective and would be extremely likely to cause bugs without proper language support for tracking the lifetime.
This compiles fine, but results in a dangling pointer just like the one you mentioned
Methods in System.Runtime.InteropServices are unsafe. They can produce dangling pointers, corrupt memory, etc. You have to know what you are doing when using them and they should be avoided where possible.
extremely likely to cause bugs without proper language support for tracking the lifetime.
Correct. We would like ref fields to have proper language that guarantees its safety.
Ah, yeah that's a fair point, having ByReference<T>
under System.*
would definitely be a problem - I hadn't noticed that before and I had just assumed it was under System.Runtime.*
somewhere like Unsafe
or MemoryMarshal
. I guess it could very well be moved there since right now it's internal, but I can see how that would require more work/refactoring than just making it public π
And yeah I agree that just having language-level support for ref
fields would be even better, absolutely no dispute there. I was just hoping for a better workaround for the time being, seeing as that feature at least for now seems to be pretty far from completion, is all.
Thank you both again for your time, I really appreciate the conversation here!
@tannergooding
had some reasons on why this wasn't safe from the language perspective and would be extremely likely to cause bugs without proper language support for tracking the lifetime.
The reasons are pretty straight forward: the lifetime safety rules, as written today, are predicated on such a constructor not existing. It's explicitly called out in the design.
That's not to say we can't do ref
fields, just that the current design was written explicitly assuming they didn't exist. The initial designs of ref struct
actually had support for them but we removed them because it introduced some heavy complexity into our lifetime rules. Now we think we have a simpler way of doing this and hope to bring ref
fields into a future version of C#.
+1 on this. While the span workaround, well, works, it isn't ideal. Particularly where you have an interop scenario where you are using ByReference<T>
in place of a T*
in the native code, and want it to be blittable so it can be passed when pinned
On a related note, Jan, you and I previously talked about a syntax that would allow operating with gcrefs as naturally as operating with pointers. Things that would allow natural syntax like the + operator without bouncing through Unsafe.Add. Would something like that ever make sense to add as a language feature, or as a ByRef-like type, or as something else?
Having it actually added to ref
s would be a huge breaking change wouldn't it. As currently, +
refers to the underlying object the ref points to
Personally I'd already be more than happy just to have the ref
field C# feature that was mentioned, especially if not having to also work out a new syntax to allow offsetting over ref
-s could make the implementation of these ref
fields quicker. Even just with that, we could write something like:
public readonly ref struct Ref<T>
{
public readonly ref T Value;
public Ref(ref T value)
{
Value = value;
}
public ref T this[int i]
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
get => Unsafe.Add(ref Value, i);
}
}
Which personally I'm very looking forward to, as it'd be a nice and fast abstraction over having to manually use Unsafe.Add
every time you need to offset to some location π
Personally I'd already be more than happy just to have the ref field C# feature that was mentioned,
well yeah, that is just better, but it is far away, will take a lot of compiler work, new language rules, etc. i just want the dangerous scary thing exposed that if you f#@! up then it doesn't help you
~~If you reeeeeeeally wanted ByReference<T>
, an enterprising individual could write a fully out-of-band implementation. π You'd have to write it in IL instead of C#, but there's nothing stopping anybody from doing it. Said assembly would need to target netcoreapp2.1+ and wouldn't be compatible with netstandard or netfx.
I don't think such an out-of-band type would be blittable or would play well with the interop system. I don't even think the internal in-box type fulfills these characteristics. You'd need to pin the T& to a T* if you wanted it to be blittable.~~
This idea is shot. See below.
@GrabYourPitchforks I am not sure what you have in mind, but I do not think it would work.
@GrabYourPitchforks ByReference<T>
is not an S.R.CS.Unsafe
kind of type which is written in IL. It is an intrinsic'y one that has a special meaning to the runtime afaik
@jkotas I thought as of netcoreapp2.1 the runtime itself had support for byref members of structs, even though the C# compiler doesn't? I was thinking you could write a class entirely in IL that looked something like:
// pseudo-code
public ref struct ByReference<T>
{
private readonly T& _value;
public ByReference(ref T value) { _value = ref value; }
public ref T Value => ldfld(_value);
}
The public API surface uses only constructs that the C# compiler can handle. Admittedly I haven't actually tried this. Just spitballing.
I thought as of netcoreapp2.1 the runtime itself had support for byref members of structs
That is not correct. You should see type load exception if you try to run your example.
@GrabYourPitchforks nope, invalid IL. I believe the current way it works is by having a dummy IntPtr
in the ByReference<T>
type (see here), and specially instructing the runtime that in reality is a tracked ref
Won't even assemble, iirc, @jkotas - FieldSig
has no BYREF
attribute (unlike LocalVarSig
/Param
blob) so it can't be encoded.
Well, there goes that idea. :)
@jkotas
I been scratching my head for for a very long while trying to create a struct that holds reference to a parent.
I am using it in Unity container to build dependency graphs during resolution. I had to use unsafe code to do so but it places certain limitations on where it could be executed. A solution that doesnβt require unsafe methods would be a dream come through for the project but every time I see potential fix it is either internal, or does not compile.
Perhaps you could recommend something that I missed?
You have to use unsafe code for that today.
@jkotas
Any way I could build my own ByReference<>
? If I can build entire framework locally, why not just that single type?
NET is full of dangerous code and could be used badly, I am baffled why that particular type became such a guarded 'golden fleece'? For library authors it could be life saver. The closest thing C# could have to C++.
Write "Don't use under no circumstances!" in big letters in documentation and that will scare away 99% of developers, just these who have no other choice will dare to proceed...
Any way I could build my own ByReference<>? If I can build entire framework locally, why not just that single type?
There is an example of this in the OP, using a private Span<T>
field.
@DaZombieKiller Thank you, but it is rather expensive to create 3 different types (ByReference, Span, FakeByReference) to hold just one reference.
In the IoC I am working on even difference in one reference field in the context structure brings performance down 15% from 14ns to 16.5ns. Multiplied by couple of million resolutions during startup of application adds up a lot.
Any way I could build my own ByReference<>?
I would just use unmanaged pointers. Without the language safety guardrails, ByReference<T>
and unmanaged pointers are equally unsafe.
@ENikS I've added a Ref<T>
type (and readonly version, and Nullable ones too) to the Microsoft.Toolkit.HighPerformance
package, if you're interested. You can find the docs for that here. I also have a PR open (should be merged shortly) that makes it use the actual ByReference<T>
type from CoreCLR when running on .NET Core, so you also get exactly the same codegen/performance as using ref T
fields in that case (if they actually existed). I find that type does the job quite well working around the current lack of support for this feature in C#.
@jkotas I can't use pointer because structs contain managed references.
To give you more 'context', when IoC advances from dependency to dependency it creates a local Context on the stack frame. But unfortunately it also require a reference to parent Context living on stack frame as well. The lifetime is guaranteed to be shorter than the parent because parent is always calling method and survives longer.
Implementing this in unsafe code places restrictions on execution in certain environments. I have to use class instance (I have a class and a struct with the same signature) in place of struct to mitigate but doing so degrades performance due to heap allocations. As you can see, I would sell my next unborn child to have fast native support for ByReference
@Sergio0694
Thank you, I will check it out!
I can't use pointer because structs contain managed references.
You can use System.Runtime.CompilerServices.Unsafe to get around this limitation.
That pull request (https://github.com/windows-toolkit/WindowsCommunityToolkit/pull/3367) has me not just impressed but concerned, as it's exposing a type that is (to my knowledge) considered by the .NET team a temporary workaround and may be removed when a more appropriate ref
field solution comes along. If several projects start relying on it, then wouldn't that place the removal of ByReference<T>
in a bit of a tough spot? Or is this going to be considered along the same lines as reflection (you choose to use it, you choose to accept breakage) for now?
@jkotas
You can use System.Runtime.CompilerServices.Unsafe to get around this limitation.
I can't! It will not run in some of UWP scenarios. Also, Unity is used by a lot of big financial organizations and some place restrictions on Unsafe code in libraries.
It will not run in some of UWP scenarios
Then you need a slower implementation for those. Even if the proper support for byref fields gets added to the language and runtime, this support won't be backported to existing runtimes.
Unity is used by a lot of big financial organizations and some place restrictions on Unsafe code in libraries.
There is no type and memory safe way to do byref-fields today. If your requirement is to use type and memory safe code only, you cannot use byref-fields today.
is this going to be considered along the same lines as reflection (you choose to use it, you choose to accept breakage)
Yes. https://github.com/windows-toolkit/WindowsCommunityToolkit/pull/3367#issuecomment-676557220
@jkotas You are not helping!
The decision seems to be rather arbitrary as there are workarounds that could provide same behavior but with higher costs. Keeping that type internal does not provide any additional measure of safety, it just prevents proper optimization.
Would it be possible to bring in more decision makers into this discussion to weigh pros and cons? Since net5 release is still some time in the future, we have time to properly discuss it...
I've read a lot of @VSadov comments related to this and other issues, perhaps he can provide any suggestions?
Lets start with a short list of people who could make this decision. @jkotas do you know who are these people, could you provide their names?
You can find list of area owners at https://github.com/dotnet/runtime/blob/master/docs/area-owners.md
net5 is feature complete. We are switching master branch to net 6. The changes going into net 5 release branch at this point are fit and finish bug fixes only.
In my view this type being internal is a bug :)
@ENikS About UWP, if you're tracking references to fields within objects on the heap, you can use the same pattern that is used by both the portable Span<T>
type and the Ref<T>
type from the HighPerformance package when running on runtimes without runtime support for Span<T>
(ie. without the internal ByReference<T>
magic type): just have your type contain an object
field to the parent, and an IntPtr
offset for that field. Then you can get a ref T
to that field by just shifting the reference.
You can use the object.DangerousGetObjectDataByteOffset<T>(object)
and object.DangerousGetObjectDataReferenceAt<T>(object, IntPtr)
extensions (still from that HighPerformance package) to do that. They'll take care of all the weird offsetting for you.
This approach works on even .NET Standard 1.4 (so also on eg. Mono, etc.), it's mostly just slower and it can only work for interior references to objects, so it's not quite as flexible. It does the job though and it's still pretty useful in some cases π
@Sergio0694 Excellent suggestion, thank you
@ENikS FYI that's actually exactly how Ref<T>
works too when running on UWP (or .NET Standard <= 2.0 in general).
It offers a Ref<T>(object, ref T)
constructor that does exactly that behind the scenes.
Just in case you don't want to reimplement that from scratch (or, you can look at the source as a reference).
The decision seems to be rather arbitrary as there are workarounds that could provide same behavior but with higher costs. Keeping that type internal does not provide any additional measure of safety, it just prevents proper optimization.
I don't think it's arbitrary. https://github.com/dotnet/runtime/issues/32060#issuecomment-596624746 gave some examples of where the compiler's safety rails fall apart were this type to be publicly exposed.
The comments in this thread from the runtime and libraries and language team are all pretty consistent that we'd like to expose this feature the correct way - via proper ref fields - rather than exposing the internal ByRef<T>
type. Once that change comes online, the ByRef<T>
type will probably be deleted from the runtime. (This also means that the Windows Toolkit PR mentioned earlier will stop working on this new runtime.)
@GrabYourPitchforks I mentioned future support plans in my comment here, that's one of the reasons why we didn't just expose ByReference<T>
directly, but still abstracted it through our Ref<T>
types (and related types), to give us more flexibility.
If with eg. .NET 6 (or some other future runtime) the ByReference<T>
was removed and proper support for ref T
fields was added, we'd just tweak the code accordingly - the HighPerformance package is already multi-targeting a number of different runtimes/profiles specifically to be able to deal with runtime-specific quirks too when needed (eg. we offer fast indexers for arrays on .NET Core that do the unsafe cast through an internal RawArrayData
type mapping the array layout on CoreCLR) π
It is a pervasive assumption in GC that byref fields do not exist. That is not going to change. A lot of things are substantially simpler/faster due to this assumption.
ByReference<T>
gets around this limitation by colluding with JIT - formally there is no ref field, but JIT reports roots as if there is one.
Basically this type works as expected only with a JIT/AOT compiler that knows about it and only in a few scenarios where JIT has a chance to do its thing. That is enough to implement span types.
Supporting this type more generally will at least require a lot of testing on a range of platforms and commitment to keep that working in the future.
@GrabYourPitchforks
The protection provided by keeping ByReference
internal is easily circumvented as shown in OP. But instead of doing it properly, it is done as a slow hack. In a way it is like an abortion, when outlawed, it is still done, but instead of going to a doctor, it is done by charlatans (no relation to anyone :).
@VSadov The type is being used left and right in framework, I don't know how much more testing should be done.
With a push to use value tapes in .NET it is becoming real impedance when it could not be referenced
The type is being used left and right in framework,
ByReference<T>
is as an implementation detail in 3 low level types in CoreLib (Span, ReadOnlySpan, TypeReference). I would not call it used left and right.
Considering that Span is being added to each and every high performance API (http, stream, buffer, pipeline, and etc.) I would say it is.
The protection provided by keeping ByReference internal is easily circumvented as shown in OP.
Keeping things internal isn't meant to be a "protection" per se, so there's nothing to circumvent. It's internal because per the runtime team (of whom many have commented on this thread), we don't feel it's a good type to expose. When we expose a type publicly, at that point we sit down and hash out the full design, including any necessary language support or how the type is exposed by our API surface.
There's lots of other internal functionality that can be accessed via things like private reflection. For the most part we don't go out of our way to block callers from using them - doing that would just lead to an arms race. But we also don't support these scenarios at all, and apps who do this need to be prepared for things for things to break unexpectedly. It would be perfectly valid for the implementation of SomeInternalFrameworkMethod to change from "please multiply two numbers together" on Tuesday to "please reformat my hard drive and delete all my cloud backups" on Wednesday.
The type is being used left and right in framework ... Considering that Span is being added to each and every high performance API (http, stream, buffer, pipeline, and etc.) I would say it is.
I think you've drawn an incorrect conclusion here. You're correct that span is a fundamental building block of high-performance APIs, but I think it's a leap to say that span's internal implementation details are fundamental building blocks. The point of an implementation detail is that we can rip it out and replace it in the future without affecting any callers or consumers. Once ref fields come online and we rewrite spans to use that feature instead of ByRef<T>
, then poof! suddenly ByRef<T>
has exactly zero direct or indirect consumers. Clearly an implementation detail that can be wholesale replaced without affecting any consumers is not a fundamental building block.
The current plan is to add ref fields as first class construct to the type system. See proposal at https://github.com/dotnet/csharplang/pull/3936 for more details.
Original proposal
Overview
Make the
System.ByReference<T>
type (here) public instead of internal.Rationale
It's already possible to achieve the same behavior of
ByReference<T>
, namely to store aref T
to some variable as a field in aref struct
, by leveraging theSpan<T>
andMemoryMarshal
types, like so:This can either be used as is, as a field in some other
ref struct
type, or the same technique (storing a 1-lengthSpan<T>
in place of the missingByReference<T>
type) can be used in otherref struct
types to reproduce this same behavior: storing a managed reference as a field in aref struct
type.Given that this is already possible with existing APIs, as shown above, but it's less performant than the original
ByReference<T>
, more verbose and more error prone, why not just make the originalByReference<T>
type public so everyone can use it with no need to reinvent the wheel?It'd be a neat API to have, and at basically no cost, since it's already in the .NET Core BCL anyway. π