Support for passing objects by ref to OnNext

ScottKane commented 2 years ago

I have recently been using Rx for a simple event bus to publish/subscribe to events throughout my application. The system needed to allow events to be marked as handled and in that case, they wouldn't propagate any further. I realised that with Rx, everything is a copy so there is no way to update the source event.

I created a RefSubject<T> and IRefObserver/IRefObservable using public delegate void RefAction<T>(ref T item) and passing everything by ref, which works nicely. The problem is that I would have to implement every operator manually. That's not so bad right now, as I'm only using Where, but it would be nice if passing by ref to OnNext were officially supported.
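A minimal sketch of the shape described above. Only RefAction<T>, RefSubject<T> and IRefObserver are named in this comment; the members shown here, the InputEvent struct and the RefSubjectDemo class are assumptions for illustration and are not the actual code from the linked repositories.

```csharp
using System;
using System.Collections.Generic;

// Delegate shape quoted in the comment above.
public delegate void RefAction<T>(ref T item);

// Hypothetical minimal interface; the real port may look different.
public interface IRefObserver<T>
{
    void OnNext(ref T item);
}

// Hypothetical subject: hands the same storage location to every subscriber.
public sealed class RefSubject<T> : IRefObserver<T>
{
    private readonly List<RefAction<T>> _handlers = new();

    public void Subscribe(RefAction<T> handler) => _handlers.Add(handler);

    public void OnNext(ref T item)
    {
        foreach (var handler in _handlers)
            handler(ref item); // no copy: handlers mutate the caller's variable
    }
}

// Hypothetical event type used only for this illustration.
public struct InputEvent
{
    public int Key;
    public bool Handled;
}

public static class RefSubjectDemo
{
    public static (bool Handled, int SecondHandlerCalls) Run()
    {
        var subject = new RefSubject<InputEvent>();
        int secondCalls = 0;

        // First subscriber marks the event as handled in place.
        subject.Subscribe((ref InputEvent e) => e.Handled = true);
        // Second subscriber sees that change and skips its own work.
        subject.Subscribe((ref InputEvent e) => { if (!e.Handled) secondCalls++; });

        var evt = new InputEvent { Key = 13 };
        subject.OnNext(ref evt);
        return (evt.Handled, secondCalls);
    }
}
```

Because the struct is never copied, the publisher can inspect evt.Handled after OnNext returns, which is the "mark as handled" behaviour the event bus needs.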

Is there a major reason that it's not in the library right now? Thanks

ScottKane commented 2 years ago

Simple (rough draft) demo can be found here: https://github.com/ScottKane/RxTest

ScottKane commented 2 years ago

I converted the whole of Rx to work on events by ref and published it to NuGet.

Source: https://github.com/ScottKane/System.Refactive
NuGet: https://www.nuget.org/packages/Rx.Refactive

glopesdev commented 2 years ago

> the system needed to allow events to be marked as handled and in that case, they wouldn't propagate any further. I realised that with Rx, everything is a copy so there is no way to update the source event.

I'm not sure I understand the need for ref in the use case you presented. Wouldn't it be enough to simply make your event type a class? For instance, CancelEventArgs seems to have this exact functionality.
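A minimal sketch of that suggestion, using the standard IObserver<T> interface and System.ComponentModel.CancelEventArgs. ActionObserver<T> and ClassEventDemo are hypothetical helpers written for this illustration; with System.Reactive, a Subject<CancelEventArgs> would presumably play the same role.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper: wraps a delegate as a standard IObserver<T>.
public sealed class ActionObserver<T> : IObserver<T>
{
    private readonly Action<T> _onNext;

    public ActionObserver(Action<T> onNext) => _onNext = onNext;

    public void OnNext(T value) => _onNext(value);
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

public static class ClassEventDemo
{
    public static (bool Cancelled, int LateHandlerRuns) Run()
    {
        int lateRuns = 0;
        IObserver<CancelEventArgs>[] observers =
        {
            // First observer marks the event as handled.
            new ActionObserver<CancelEventArgs>(e => e.Cancel = true),
            // Second observer checks the flag before doing any work.
            new ActionObserver<CancelEventArgs>(e => { if (!e.Cancel) lateRuns++; }),
        };

        // A class instance is shared by reference, so every observer and the
        // publisher see the same Cancel flag without any ref parameters.
        var args = new CancelEventArgs();
        foreach (var o in observers) o.OnNext(args);
        return (args.Cancel, lateRuns);
    }
}
```

The trade-off is one heap allocation per event, unless instances are pooled or reused.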

akarnokd commented 2 years ago

I don't understand the use for this. If I'd need to modify a field, I'd simply pass around the parent object (in some form), i.e., indirection.

ScottKane commented 2 years ago

I'm using structs for performance reasons as there will be thousands of events firing that have no need to live on the heap. I would be using ref structs to ensure they stay stack allocated if I didn't need an interface to determine if the event has been handled. It's the reason I have a concurrent dictionary of type and object instead of just a Subject<IEvent> so that when I call OnNext the struct isn't boxed.
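A small sketch of the boxing concern, under the assumption that the events implement an interface with a Handled property. IEvent is mentioned above, but its members are not shown, so the property, KeyEvent and BoxingDemo are hypothetical.

```csharp
using System;

// Assumed shape of the event interface; the real IEvent may differ.
public interface IEvent
{
    bool Handled { get; set; }
}

// Hypothetical struct event.
public struct KeyEvent : IEvent
{
    public bool Handled { get; set; }
}

public static class BoxingDemo
{
    // Generic + ref: the call is made on the caller's struct, with no box.
    public static void MarkHandled<T>(ref T evt) where T : struct, IEvent
        => evt.Handled = true;

    public static (bool AfterBoxedWrite, bool AfterRefWrite) Run()
    {
        var evt = new KeyEvent();

        IEvent boxed = evt;     // boxing: allocates a heap object holding a copy
        boxed.Handled = true;   // mutates the boxed copy only
        bool afterBoxedWrite = evt.Handled;

        MarkHandled(ref evt);   // mutates the original in place
        return (afterBoxedWrite, evt.Handled);
    }
}
```

Run() returns (false, true): the write through the interface never reaches the original struct, while the generic by-ref call does.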

idg10 commented 1 year ago

> Is there a major reason that it's not in the library right now? Thanks

There are two main problems with supporting this throughout all of Rx:

A lot of activity in Rx is deferred via schedulers.
Many operators retain values as an intrinsic part of their operation.

In these cases, values destined to be passed into OnNext need to be retained until the deferred activity occurs. That causes a problem for what you said here:

> I would be using ref structs to ensure they stay stack allocated

How do you know when it's safe to allow the relevant stack frame to go away? When you pass a ref argument, the .NET runtime guarantees only that it will be valid until the method you pass it to returns. So your IRefObserver<T> essentially requires OnNext to have fully finished all processing of its input by the time it returns. But it is a basic assumption in Rx today that OnNext might return long before it is done with its input, for the two reasons stated above.

Let's look at some of the consequences of this.

There are some obvious cases that can't work if an observable source supplies items by ref, such as Delay. A Delay operator configured for a 1 second delay has to hold onto any items it receives for a second before forwarding them. The problem with ref is that in the general case, you can't hold onto the things after you've returned to your caller.

Of course, things have changed slightly with C# 11.0/.NET 7.0 introducing support for ref fields, but these are highly constrained, with the effect that this doesn't really help. With these (and in fact with spans before that) we have a slightly broader concept of a ref-like type. And the thing about ref-like types is that they are constrained in similar ways to ref arguments: since it's always possible that a ref is pointing to some stack frame above you on the call stack, a method can't safely hold onto any ref-like thing after it returns unless that method is part of a type that is also ref-like, in which case you've just pushed that constraint up one level to the code that's using you. (E.g., spans are allowed to hold onto refs, but that then means that use of a span is effectively subject to the same constraints as use of a ref.)

Fundamentally, if you have a feature such as you are proposing, anyone could write this:

public int FlowStackRef(IRefObserver<int> o)
{
    int value = 42;
    o.OnNext(ref value);
    return value;
}

This method passes a reference to a variable that lives on its stack frame. That variable will no longer exist once it returns, so whatever OnNext does with that ref value, it has to do so before it returns. This rules out any deferred activity.
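To make the constraint concrete, here is a hedged sketch of what happens when an observer tries to defer work on a ref argument. DeferringObserver, its queue and DeferDemo are hypothetical and stand in for a scheduler; they are not part of Rx.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical observer that postpones its work, as a scheduler-based operator would.
public sealed class DeferringObserver
{
    private readonly Queue<Action> _pending = new();

    public List<int> Seen { get; } = new();

    public void OnNext(ref int value)
    {
        // The compiler rejects capturing the ref itself:
        // _pending.Enqueue(() => Seen.Add(value));   // error CS1628
        int copy = value;                       // only a copy may outlive this call
        _pending.Enqueue(() => Seen.Add(copy));
    }

    // Runs the postponed work, long after OnNext has returned.
    public void Flush()
    {
        while (_pending.Count > 0) _pending.Dequeue()();
    }
}

public static class DeferDemo
{
    public static int Run()
    {
        var observer = new DeferringObserver();
        int value = 42;
        observer.OnNext(ref value);
        value = 100;        // later changes are invisible to the deferred work
        observer.Flush();
        return observer.Seen[0];
    }
}
```

The deferred work sees the snapshot taken inside OnNext, so Run() returns 42. Once a copy is taken, nothing the deferred work does can flow back to the caller's variable, which defeats the purpose of passing by ref.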

Several operators perform work via the scheduler. I've just done a quick search to get a rough idea, and found the following (note: I did this quickly, so this might be incomplete):

AppendPrepend
Generate
Range
Repeat
TakeLast
Timer

OK, that's not a huge list, and it's certainly possible to imagine a version of Rx that doesn't have these but which is still useful. But there are also operators which (at least in certain forms) retain values of either their input or output types, and these include:

Aggregate
Buffer
CombineLatest
Delay
Distinct
DistinctUntilChanged
First, FirstOrDefault (blocking form)
Last, LastOrDefault (both blocking and async forms)
Latest
Materialize
Max
MaxBy
Min
MinBy
MostRecent
Next
Repeat
Return
Sample
SequenceEqual
Single and SingleAsync
SingleOrDefault and SingleOrDefaultAsync
SkipLast
TakeLast
TakeLastBuffer
Throttle
ToArray
ToDictionary
ToList
WithLatestFrom
Zip

There are also some internal mechanisms such as PushToPullAdapter that retain values, and which are used internally by a few other operators. (I've not done the analysis to work out the full extent to which that would cause problems for your proposed feature.) Some of the subject types also retain values.

Because all of these have to hold onto one or more of their inputs indefinitely from time to time to be able to perform their function, they are fundamentally incompatible with the code snippet shown above, in which the input (a ref int) ceases to be valid once OnNext returns. And since such a code snippet would always be permissible with your definition of IRefObserver<int> I don't think it would be possible to have any of these operators (except for maybe a handful which offer specialized forms that can get away without retaining values) in a ref-supporting version of Rx.

There are also scenarios where the element type gets used as a type argument for other types (e.g., in TimeInterval) which could end up being problematic, so the list above describes only one set of problems, and there will be other issues for some other operators.

And of course anything where the application explicitly uses schedulers such as ObserveOn will be either limited or impossible.

This is not to say that you couldn't identify some subset of Rx that could be made to work with ref. In fact you already have: if you only want Where then it is possible. No doubt there are some other operators that could also work in this world. But there are many that couldn't. And based on the cursory analysis I've just done to produce this reply, I think you'd be looking at a pretty anaemic subset of Rx.
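As an illustration of why Where fits, here is a minimal sketch of a by-ref filter. IRefObserver<T> follows the shape discussed in this thread; RefPredicate<T>, WhereRefObserver<T>, SumSink and WhereRefDemo are hypothetical names introduced for this example.

```csharp
using System;

// Hypothetical delegate: a predicate that inspects the item in place.
public delegate bool RefPredicate<T>(ref T item);

public interface IRefObserver<T>
{
    void OnNext(ref T item);
}

// Where keeps no state and finishes all its work before OnNext returns,
// so it never needs the ref to outlive the call.
public sealed class WhereRefObserver<T> : IRefObserver<T>
{
    private readonly RefPredicate<T> _predicate;
    private readonly IRefObserver<T> _downstream;

    public WhereRefObserver(RefPredicate<T> predicate, IRefObserver<T> downstream)
    {
        _predicate = predicate;
        _downstream = downstream;
    }

    public void OnNext(ref T item)
    {
        if (_predicate(ref item))
            _downstream.OnNext(ref item);
    }
}

// Hypothetical sink that consumes the item in place by zeroing it.
public sealed class SumSink : IRefObserver<int>
{
    public int Sum;

    public void OnNext(ref int item)
    {
        Sum += item;
        item = 0;
    }
}

public static class WhereRefDemo
{
    public static (int Sum, int LastValue) Run()
    {
        var sink = new SumSink();
        var where = new WhereRefObserver<int>((ref int x) => x % 2 == 0, sink);

        int current = 0;
        for (int i = 1; i <= 4; i++)
        {
            current = i;
            where.OnNext(ref current); // even values reach the sink and are zeroed
        }
        return (sink.Sum, current);
    }
}
```

Run() returns (6, 0): only 2 and 4 pass the filter, and the sink's write to the last item is visible to the caller. An operator like Delay or Buffer has no equivalent, because it would have to store the item past the end of the call.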

I'm not mad about the idea of introducing a new dimension to Rx in which so many things are unavailable. Is it really Rx if you don't have any of the above? (I'm struggling to think of any non-trivial project in which I used Rx that didn't use at least one of the features that would be unavailable in a ref world.) So I think I would want to see considerable evidence that a lot of people really want this. So I'll leave this work item open for people to comment or vote on, but unless there is a lot of demand, I don't think we'll be adding this in the foreseeable future.

ScottKane commented 1 year ago

Thanks for the great write-up, it's definitely illuminating. I created the ref port as a quick test to see if the usage I needed would be viable/work as intended. I was writing a game engine in C# in which the internal event systems were running on Rx. The problem was that these constant events powering all aspects of the engine fanning out from a main loop were causing insane GC allocations/collections.

I liked the elegance of using Rx for this; it felt cool, but it came with obvious drawbacks. I switched to using structs and ported the ref version of Rx to see if I could create/query/submit events without any GC allocations (granted, my use case only required simple operator usage). It turned the whole Rx-based engine thing into a viable product. I'm not saying my mad 6-hour rampage through the Rx library spamming ref and ref Unsafe.AsRef throughout is by any means an elegant solution, but it did provide the ability to use Rx in a performance-critical core of the engine without the previous downsides. From my testing, working with the Scheduler did seem to work, although a lot of the parallel systems were working off a custom thread pool, so maybe that's why I didn't have too many issues.

I certainly see the concerns and challenges associated with trying any of this stuff inside a very public and very well-adopted library, but I do think there could be some value here.

dotnet / reactive

Support for passing objects by ref to OnNext #1814