pseabury / morelinq

Automatically exported from code.google.com/p/morelinq
Apache License 2.0

New Method: Accumulate / Integrate #82

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Sometimes it is necessary to calculate partial aggregates of a query for each 
item in the sequence, for example the cumulative change over a series of 
transactions, or the cumulative distribution function from a probability 
density function. The name is hard to pin down: I've used Accumulate in my 
projects so far, but for math types Integrate could be a better name.

Implementation is relatively simple:

        public static IEnumerable<TResult> Accumulate<TIn, TResult>(this IEnumerable<TIn> src, TResult seed, Func<TIn, TResult> accumulator) {
            TResult prev = seed;
            foreach (TIn item in src) {
                TResult cur = accumulator(prev, item);
                yield return cur;
                prev = cur;
            }
        }

The seed value is never returned in the resulting sequence (otherwise the 
result sequence would be longer than the original).

Sample usage:

    var pdf = new [] { 0.1, 0.2, 0.3, 0.4 };
    var cdf = pdf.Accumulate(0d, (prev, d) => prev + d);
    cdf.Dump(); // Prints 0.1, 0.3, 0.6, 1

Using Accumulate + Last should be equivalent to using Aggregate.
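
A minimal sketch of that equivalence, using the corrected accumulator signature (TResult first, then TIn). The class name AccumulateSketch and the transaction amounts are made up for illustration, and integers are used so the printed values do not depend on floating-point formatting:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AccumulateSketch
{
    // Yields every partial result; the seed itself is never yielded.
    public static IEnumerable<TResult> Accumulate<TIn, TResult>(
        this IEnumerable<TIn> src, TResult seed, Func<TResult, TIn, TResult> accumulator)
    {
        TResult prev = seed;
        foreach (TIn item in src)
        {
            prev = accumulator(prev, item);
            yield return prev;
        }
    }
}

static class Program
{
    static void Main()
    {
        var amounts = new[] { 10, -3, 7 }; // hypothetical transaction amounts

        // Running balance after each transaction.
        var balances = amounts.Accumulate(0, (total, x) => total + x).ToList();
        Console.WriteLine(string.Join(", ", balances)); // 10, 7, 14

        // The last partial result matches Aggregate over the same input.
        int viaAggregate = amounts.Aggregate(0, (total, x) => total + x);
        Console.WriteLine(balances.Last() == viaAggregate); // True
    }
}
```

One caveat: for an empty source Accumulate yields nothing, so Last throws InvalidOperationException, whereas Aggregate returns the seed.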

Original issue reported on code.google.com by fsate...@gmail.com on 6 Aug 2013 at 8:57

GoogleCodeExporter commented 9 years ago
Oops, the sample implementation is wrong. The accumulator function is missing 
the TResult first parameter:

        public static IEnumerable<TResult> Accumulate<TIn, TResult>(this IEnumerable<TIn> src, TResult seed, Func<TResult, TIn, TResult> accumulator) {
            TResult prev = seed;
            foreach (TIn item in src) {
                TResult cur = accumulator(prev, item);
                yield return cur;
                prev = cur;
            }
        }

Original comment by fsate...@gmail.com on 6 Aug 2013 at 8:58

GoogleCodeExporter commented 9 years ago
This already exists and is called Scan[1]:

new [] { 0.1, 0.2, 0.3, 0.4 }.Scan((prev, d) => prev + d)

[1] https://www.nuget.org/packages/MoreLinq.Source.MoreEnumerable.Scan/

Original comment by azizatif on 6 Aug 2013 at 9:58

GoogleCodeExporter commented 9 years ago
Ah, indeed. However, the released versions so far miss the most interesting 
version, the one where TAccumulate != TSource. The released ones only sport the 
seedless overload.

Original comment by fsate...@gmail.com on 6 Aug 2013 at 10:17

GoogleCodeExporter commented 9 years ago
That's true, but the issue is being closed because the overload accepting a 
seed has been added to the source[1] and will be part of release 2.0. 
Meanwhile, you can build the seeded version with some gymnastics from existing, 
released operators, including Scan:

new [] { 0.1, 0.2, 0.3, 0.4 }.Prepend(5.5).Scan((prev, d) => prev + d).Skip(1)
// prints 5.6, 5.8, 6.1, 6.5

Above, 5.5 is prepended as the seed and then skipped in the final result. And 
for the case where TAccumulate != TSource:

new [] { 0.1, 0.2, 0.3, 0.4 }
    .Select((d, i) => new { s = string.Empty, i, d })
    .Prepend(new { s = "acc = ", i = -1, d = default(double) })
    .Scan((a, e) => new { s = a.s + (e.i > 0 ? "," : null) + e.d, e.i, e.d })
    .Select(e => e.s)
    .Skip(1)

// prints
// acc = 0.1
// acc = 0.1,0.2
// acc = 0.1,0.2,0.3
// acc = 0.1,0.2,0.3,0.4

The only drawback is that it doesn't read as simply and always requires 
several operators. First the Select projects every element into a common shape, 
so that TSource and TAccumulate become one type; then Prepend makes the seed 
the head element, Scan does the accumulation, and the final Select and Skip 
return just the intermediate accumulator states, without the seed.
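
For illustration, the same steps can be packaged once into a helper. This is only a sketch under stated assumptions: ScanSeedless is a hypothetical stand-in for a seedless scan (it is not MoreLinq's code), ScanWithSeed is a made-up name, tuples replace the anonymous type (so C# 7 or later is assumed), and Prepend here is the one from System.Linq, which assumes a runtime that ships it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SeededScanSketch
{
    // Hypothetical stand-in for a seedless scan: the first element is the
    // initial state and is itself yielded first.
    public static IEnumerable<T> ScanSeedless<T>(this IEnumerable<T> source, Func<T, T, T> f)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;
            T acc = e.Current;
            yield return acc;
            while (e.MoveNext())
            {
                acc = f(acc, e.Current);
                yield return acc;
            }
        }
    }

    // Hypothetical helper: project, prepend the seed, scan, unwrap, skip the seed.
    public static IEnumerable<TAcc> ScanWithSeed<TSrc, TAcc>(
        this IEnumerable<TSrc> source, TAcc seed, Func<TAcc, TSrc, TAcc> f)
    {
        return source
            .Select(x => (Acc: default(TAcc), Item: x))   // one shape for both types
            .Prepend((Acc: seed, Item: default(TSrc)))    // seed becomes the head
            .ScanSeedless((a, e) => (Acc: f(a.Acc, e.Item), Item: e.Item))
            .Select(e => e.Acc)                           // keep accumulator states only
            .Skip(1);                                     // drop the seed
    }
}

static class Program
{
    static void Main()
    {
        // TSrc is int, TAcc is string.
        var states = new[] { 1, 2, 3 }.ScanWithSeed("acc =", (s, n) => s + " " + n);
        foreach (var s in states) Console.WriteLine(s);
        // acc = 1
        // acc = 1 2
        // acc = 1 2 3
    }
}
```

The wrapper hides the gymnastics from callers, but it still allocates a tuple per element, so a dedicated seeded overload remains the simpler option.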

[1] https://code.google.com/p/morelinq/source/browse/MoreLinq/Scan.cs#81

Original comment by azizatif on 6 Aug 2013 at 11:09

GoogleCodeExporter commented 9 years ago
I agree, the functionality is already there, so closing it is OK
(although the Fixed or Done status would have been more appropriate, I
guess).

<snip>

I ended up copying the code with a note to remove when MoreLinq is updated ;).

Thanks!

Original comment by fsate...@gmail.com on 6 Aug 2013 at 11:39

GoogleCodeExporter commented 9 years ago
@fsateler: So we'll just pretend that you opened the issue before it was 
addressed and I'll change the status to Fixed. :)

Addressed as Scan overload added in changeset 
4aa10fd53ef7aff11856cecdf354220f69a33045

Original comment by azizatif on 7 Aug 2013 at 9:09