joebone opened this issue 4 years ago
The notion of reusing the object instance has merit, but overall this is unlikely to be a significant overhead in any scenario that involves data access. I'd want to talk about perceived benefit before jumping in, because then we also get into issues of lifetime management, disposal/recycling, etc. I suspect the complexity and risk outweigh any expected benefit.
We can't take `Span<T>` / `Span<T>.Enumerator`, as they are both `ref struct` and cannot be stored anywhere in this context (since we need to implement an interface here, we can't consider a custom `ref struct` implementation). In particular, re:

> or at least be able to pass an `IEnumerator` such as `Span<T>.GetEnumerator()`

you'll notice that `Span<T>.Enumerator` is not in fact an `IEnumerator`, for this same reason.
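To make the `ref struct` restriction concrete, here is a minimal sketch of what the compiler rejects (the class and member names are illustrative):

```csharp
using System;
using System.Collections.Generic;

class ReaderState
{
    // Illegal: a ref struct cannot be a field of a class (compiler error
    // CS8345), so the enumerator cannot be stored across Read() calls:
    // private Span<int>.Enumerator _cursor;

    void Demo()
    {
        Span<int> span = stackalloc int[3];
        // Illegal: Span<int>.Enumerator does not implement IEnumerator<int>,
        // so it cannot be used where the interface is required:
        // IEnumerator<int> e = span.GetEnumerator();
    }
}
```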
`Memory<T>` is possible, but has significant issues here, as the `.Span` access should be considered relatively expensive - plus we'd need to slice or index into the span each time we progress the thing.

Overall, I'd say that it is unlikely that we can add huge benefit here. Is there a specific scenario that you're thinking of? Also, if there is something we want to do here, then in any scenario that is "bulk", I'd also be tempted to look past `Span<T>` / `Memory<T>`, and think more about `ReadOnlySequence<T>`, since most "bulk" scenarios usually benefit from discontiguous data support.
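For readers unfamiliar with it, `ReadOnlySequence<T>` chains discontiguous buffers into one logical sequence without copying. A minimal sketch (the `Chunk<T>` helper is illustrative; real code usually gets sequences from `System.IO.Pipelines` or pooled buffer chains):

```csharp
using System;
using System.Buffers;

// Minimal segment type to chain two arrays into one ReadOnlySequence<T>.
sealed class Chunk<T> : ReadOnlySequenceSegment<T>
{
    public Chunk(ReadOnlyMemory<T> memory) => Memory = memory;

    public Chunk<T> Append(ReadOnlyMemory<T> memory)
    {
        var next = new Chunk<T>(memory)
        {
            RunningIndex = RunningIndex + Memory.Length
        };
        Next = next;
        return next;
    }
}

class Demo
{
    static void Main()
    {
        var first = new Chunk<int>(new[] { 1, 2, 3 });
        var last = first.Append(new[] { 4, 5 });
        var seq = new ReadOnlySequence<int>(first, 0, last, 2);

        // Walks both chunks as one logical sequence, with no copying:
        foreach (ReadOnlyMemory<int> segment in seq)
            foreach (int value in segment.Span)
                Console.WriteLine(value);
    }
}
```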
First off, thanks for the quick response, wasn't expecting you to be quite so on-the-ball. And to be fair, you may well have a point on the perception of benefit vs the real.
Our particular use case would probably benefit more from the object reuse/pooling on type than from the saved allocations.
The code sample I provided above is not far from the actual implementation we are using, with the distinction that the fantasy line:

```csharp
var reader = ObjectReader.Create<T>(slice.Span.GetEnumerator(), members)
```

is actually just implemented as a `slice.ToArray()`.
Yea.
However, I had up until this moment completely forgotten about `ArraySegment<T>`, which is basically what I was looking for, and does implement the required interfaces.
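A minimal sketch of the `ArraySegment<T>` approach (array sizes are placeholders): it wraps a window over the existing array and implements `IEnumerable<T>`/`IList<T>`, so no second array is allocated:

```csharp
using System;
using System.Collections.Generic;

int[] bigBatch = new int[1_000_000];   // the large source array

// Wrap a 100K window without copying; ArraySegment<T> implements IList<T>,
// so it can be handed to any IEnumerable<T>-taking API:
var slice = new ArraySegment<int>(bigBatch, offset: 0, count: 100_000);
IEnumerable<int> view = slice;   // no LOH copy, no new backing array
```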
So I guess the saving would be not having to recreate the `TypeAccessor` and iterate the members each time - admittedly not a lot compared to the catastrophic copies of 100K elements to the LOH, but still, I'm cheap. If I can avoid paying twice for the same work, I will.
Perhaps a change in scope to the original request, then - a `Recreate(newSource)` or equivalent API that would set the private `IEnumerator` source?
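A hedged sketch of what that `Recreate` API might look like (this method does not exist in fast-member; the `source` field name is illustrative):

```csharp
// Hypothetical addition to an ObjectReader<T>: keep the TypeAccessor and
// member metadata that were built in Create, and just swap the data source.
public void Recreate(IEnumerable<T> newSource)
{
    if (newSource is null) throw new ArgumentNullException(nameof(newSource));
    source?.Dispose();                    // discard the exhausted enumerator
    source = newSource.GetEnumerator();   // start reading from the new data
}
```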
That still allocates, though. What is worth considering, then: this is all pretty doable. Consuming `Memory<T>`-based code can always use `TryGetArray`, and if unsuccessful, use the array pool to flatten to an array. Useful?
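The pattern described could look roughly like this (a sketch, assuming `MemoryMarshal.TryGetArray` for the fast path and `ArrayPool<T>` for the fallback; the `MemoryFlattener` name is illustrative):

```csharp
using System;
using System.Buffers;
using System.Runtime.InteropServices;

static class MemoryFlattener
{
    // Returns a segment over the memory's backing array when one is visible;
    // otherwise copies into a pooled array, which the caller must hand back
    // via ArrayPool<T>.Shared.Return(rented) when rented is non-null.
    public static ArraySegment<T> Flatten<T>(ReadOnlyMemory<T> memory, out T[] rented)
    {
        if (MemoryMarshal.TryGetArray(memory, out ArraySegment<T> segment))
        {
            rented = null;
            return segment;          // fast path: no copy at all
        }
        rented = ArrayPool<T>.Shared.Rent(memory.Length);
        memory.CopyTo(rented);       // slow path: one copy into a pooled array
        return new ArraySegment<T>(rented, 0, memory.Length);
    }
}
```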
To be sure I'm understanding correctly, are you suggesting a subclass of `ObjectReader`?
I hadn't actually considered that, but yeah, that would allow more control at the cost of flexibility. Following the whole "pit of success" axiom, it would be useful then to include an analyzer/fixer with the NuGet package which advises when the more optimal path can be used (for discoverability purposes).
Or perhaps an internal fast path, like `Count()` on `IEnumerable<T>` when it is an `IList<T>`?
Either way, just having the class would definitely benefit us.
I do feel a bit guilty about just crying for features without at least contributing though so maybe I'll branch and have a stab at an implementation myself, if for no other reason than practice.
Well, since the Create method is static, I was assuming we could make the existing implementation code into one sub-class, the array-specific code into another, add a Create overload that takes an array and optional offset/count, and change the existing Create to check for arrays. With that, even people already using arrays get the new code for free.
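A sketch of that dispatch (the subclass names `ArrayObjectReader`/`EnumeratorObjectReader` are illustrative, not existing fast-member types):

```csharp
// Existing callers keep using this overload; arrays are sniffed out and
// routed to the specialised implementation for free.
public static ObjectReader Create<T>(IEnumerable<T> source, params string[] members)
{
    if (source is T[] array)
        return new ArrayObjectReader<T>(array, 0, array.Length, members);
    return new EnumeratorObjectReader<T>(source.GetEnumerator(), members);
}

// New overload for explicit slices of an existing array:
public static ObjectReader Create<T>(T[] source, int offset, int count,
                                     params string[] members)
    => new ArrayObjectReader<T>(source, offset, count, members);
```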
Yeah, whatever I was thinking was far more over-engineered than that :P Awesome, sounds great!
In the case where we have massive bulk inserts pending (which seems like a very common use case for this library ;) ), we often have to break those into smaller batches, often 100K or less, from lists or arrays of a couple of million elements.
Rather than allocating those intermediate arrays, it would be great to be able to pass a sliced `Span<T>`, or at the very least a `Memory<T>`. Internally, I see the private source is an `IEnumerator` rather than the `IEnumerable` that is required by the method, and `GetEnumerator` is called at the end.
It would be beneficial to be able to either pass a sliced `Span<T>` directly, or at least be able to pass an `IEnumerator` such as `Span<T>.GetEnumerator()`.
Basic example of the intended use case:
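The original sample was not included here; the following is a hedged reconstruction of the kind of batching scenario described above (`MyRow`, `connectionString`, and the column names are placeholders):

```csharp
using System;
using System.Data.SqlClient;
using FastMember;

var rows = new MyRow[2_000_000];   // the full in-memory data set
const int batchSize = 100_000;

using var bcp = new SqlBulkCopy(connectionString)
{
    DestinationTableName = "MyTable"
};

for (int offset = 0; offset < rows.Length; offset += batchSize)
{
    int count = Math.Min(batchSize, rows.Length - offset);

    // Today this window is materialised as a fresh array (a slice.ToArray()),
    // copying 100K elements per batch; the request is to avoid that copy.
    var slice = new ArraySegment<MyRow>(rows, offset, count);

    using var reader = ObjectReader.Create(slice, "Id", "Name");
    bcp.WriteToServer(reader);
}
```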