morelinq / MoreLINQ

Extensions to LINQ to Objects
https://morelinq.github.io/
Apache License 2.0
3.69k stars 413 forks source link

new Acquire overload #405

Closed leandromoh closed 6 years ago

leandromoh commented 6 years ago

in a similar scenario that #404, sometimes some elements of a sequence can not be access, but we must do something with all avaible elements.

scenario :

foreach (var element in source) // if some element can not be reach, we must skip it
{ 
       // do something with element 
}

example of error :

Enumerable.Range(1, 10).Select(x =>
{
    if (x % 5 == 0)
        throw new InvalidOperationException();
    else
        return x;
}).ForEach(p => Console.WriteLine(p));

here we can not access elements 5 and 10, but the others can be printed.

So I propose a new operator that do this handling and return a sequence with all avaible elements.

Propose signature:

public static IEnumerable<T> Available<T>(this IEnumerable<T> source)

example of usage:

Enumerable.Range(1, 10).Select(x =>
{
    if (x % 5 == 0)
        throw new InvalidOperationException();
    else
        return x;
})
.Available()
.ForEach(Console.WriteLine);

now all elements except 5 and 10 are printed.

Draft:

public static IEnumerable<T> Available<T>(this IEnumerable<T> source)
{
    if (source == null) throw new ArgumentNullException(nameof(source));

    var e = source.GetEnumerator();
    bool hasValue;
    bool dispose = false;

    while (true)
    {
        hasValue = true;
        try
        {
            if (!e.MoveNext())
            {
                e.Dispose();
                dispose = true;
            }
        }
        catch
        {
            hasValue = false;
        }

        if (dispose)
            yield break;

        if (hasValue)
            yield return e.Current;
    }
}

I would like to submit a PR for this. What do you think?

PS: the operator name is not defined yet;

atifaziz commented 6 years ago

Generally speaking, I don't think exceptions should be dealt with exceptionally. They should be part of the query:

var xs =
    from x in Enumerable.Range(1, 10)
    select (Item: x, Error: x % 5 == 0 ? new InvalidOperationException() : null) into r
    where r.Error == null
    select r.Item;

q.ForEach(x => Console.WriteLine(x));

However, I can imagine that if the exception is thrown by some third-party code that's not under your control then such an operator can be useful.

I would call it IgnoreErrors but it should allow you to specify the specific errors to ignore, e.g.:

public static IEnumerable<T> IgnoreErrors<T, TException>(
    this IEnumerable<T> source,
    Func<T, TException, bool> errorPredicate) { /* ... */ }

public static IEnumerable<T> IgnoreErrors<T, TException1, TException2>(
    this IEnumerable<T> source,
    Func<T, TException1, bool> error1Predicate,
    Func<T, TException2, bool> error2Predicate) { /* ... */ }

public static IEnumerable<T> IgnoreErrors<T, TException1, TException2, TException3>(
    this IEnumerable<T> source,
    Func<T, TException1, bool> error1Predicate,
    Func<T, TException2, bool> error2Predicate,
    Func<T, TException3, bool> error2Predicate) { /* ... */ }

Up to 3 exception types should be sufficient.

leandromoh commented 6 years ago

However, I can imagine that if the exception is thrown by some third-party code that's not under your control then such an operator can be useful.

Exactly.

I would call it IgnoreErrors but it should allow you to specify the specific errors to ignore, e.g.:

We can not pass a T for predicate because the element are inaccessible or could not be projected (and therefore dont exist).

That said, I think a better signature would be the following:

first overload:

public static IEnumerable<T> IgnoreErrors<T>(
    this IEnumerable<T> source,
    Action<Exception> exceptionHandler)

second overload:

public static IEnumerable<TResult> IgnoreErrors<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, Exception, TResult> selector)

in the second signature, either a TSourceor Exceptionwill be passed to selector. Never both at the same time.

atifaziz commented 6 years ago

We can not pass a T for predicate because the element are inaccessible or could not be projected (and therefore dont exist).

Sorry, that was a mistake. You are correct and what I meant what this:

public static IEnumerable<T> IgnoreErrors<T, TException>(
    this IEnumerable<T> source,
    Func<TException, bool> errorPredicate) { /* ... */ }

public static IEnumerable<T> IgnoreErrors<T, TException1, TException2>(
    this IEnumerable<T> source,
    Func<TException1, bool> error1Predicate,
    Func<TException2, bool> error2Predicate) { /* ... */ }

public static IEnumerable<T> IgnoreErrors<T, TException1, TException2, TException3>(
    this IEnumerable<T> source,
    Func<TException1, bool> error1Predicate,
    Func<TException2, bool> error2Predicate,
    Func<TException3, bool> error2Predicate) { /* ... */ }

Why an action for exceptionHandler? It implies a side-effect, which should be avoided if it can be helped. What if the exception type is not good enough? The predicate-based signatures are open to more scenarios. For example, you have a WebException and you want to filter further on its Status.

Both overloads you propose are hard to reason about when you look at the name of the method or what it's intended to do, which is ignore errors. The predicates-based overloads allow the user to say which errors should be ignored.

leandromoh commented 6 years ago

Both overloads you propose are hard to reason about when you look at the name of the method or what it's intended to do, which is ignore errors.

At this first moment I am not worried about the method name, but about its behavior. What I really want/need is a method that give to user the control of the situation. that is, give me all the available elements, and for those which could not be processed, tell me what happens. The second operator i suggest give me this.

Why an action for exceptionHandler?

for the first overload I thought something like Pipe:

var availableElements= elements.Available(ex => 
{
   if (ex is WebException) // do something
   else if (ex is ArgumentException) // do something else
   //others types of excception ignore
}); 

However I think my second overload is the most "powerfull" and versatile, as the following example:

sequence
    .Available((item, exception) => new { item, exception })
    .ForEach((x, index) =>
    {
        if (x.exception == null)
            Console.WriteLine($"index {index} give me {x.item}");
        else
            Console.WriteLine($"index {index} throws an {x.exception.GetType().Name}");
    });

outputs:

index 0 give me 1
index 1 give me 2
index 2 give me 3
index 3 give me 4
index 4 throws an InvalidOperationException
index 5 give me 6
index 6 give me 7
index 7 give me 8
index 8 give me 9
index 9 throws an InvalidOperationException

Since the second overload give us all the available information, we can create the other overloads based on it. For example:

The "exceptionHandler" overload in terms of the second.

public static IEnumerable<TSource> Available<TSource>(
this IEnumerable<TSource> source, 
Action<Exception> exceptionHandler)
{
    return source.Available(Tuple.Create)
                    .Pipe(x =>
                    {
                        if (x.Item2 != null)
                            exceptionHandler(x.Item2);
                    })
                    .Where(x => x.Item2 == null)
                    .Select(x => x.Item1);
}

my initial draft in terms of second (when user is not worried about exceptions).

public static IEnumerable<T> Available<T>(this IEnumerable<T> source)
{
    return source
       .Available(ValueTuple.Create)
       .Where(x => x.Item2 == null)
       .Select(x => x.Item1);
}

your predicates-based onverloads in terms of second overload:

public static IEnumerable<T> Available<T, TException>(
this IEnumerable<T> source, 
Func<TException, bool> errorPredicate)
{
    return source.Available(Tuple.Create)
        .Pipe(x =>
        {
            if (x.Item2 is TException e1 && !errorPredicate(e1))
                throw x.Item2;
        })
        .Where(x => x.Item2 == null)
        .Select(x => x.Item1);
}

The second overload looks to me the MVP, since others overloads can be created using it.

atifaziz commented 6 years ago

At this first moment I am not worried about the method name, but about its behavior.

In API design, the method name can bring a lot of clarity about behavior. For example, whenever I've had a hard time coming up with a good name, I've realized that the method is perhaps doing too much and needs to be broken down into smaller parts, each of which is then easier to name more meaningfully.

However I think my second overload is the most "powerfull" and versatile, as the following example:

Your second overload? I think you only proposed one for Available. I'm confused now.

I'm convinced that what you're asking here too is accomplished simply by a filter (Where) over SelectWithError as I suggested in #404.

leandromoh commented 6 years ago

In API design, the method name can bring a lot of clarity about behavior. For example, whenever I've had a hard time coming up with a good name, I've realized that the method is perhaps doing too much and needs to be broken down into smaller parts, each of which is then easier to name more meaningfully.

I agree.

Your second overload? I think you only proposed one for Available. I'm confused now.

Here is the second overload.

I'm convinced that what you're asking here too is accomplished simply by a simple filter (Where) over SelectWithError as I suggested in #404.

SelectWithError is great, but remember it uses source.Select, that is, there is not a try catch on the source's enumerator MoveNext call. So in my example of usage, it would break the enumeration. Also, beside the Func<T, Exception, TResult> selector, should have on without the T argument, since it could not be projected. Adding this two changes, I think we can deal with all error scenarios.

I would like submit a PR for this. What do you think?

atifaziz commented 6 years ago

but remember it uses source.Select, that is, there is not a try catch on the source's enumerator MoveNext call

True but it was just a quick prototype. I was more concerned with API design and signature than implementation correctness at this stage.

second overload:

public static IEnumerable<TResult> IgnoreErrors<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, Exception, TResult> selector)

in the second signature, either a TSourceor Exceptionwill be passed to selector. Never both at the same time.

I would like submit a PR for this. What do you think?

I don't think we're ready with this yet and needs more discussion on the following items:

leandromoh commented 6 years ago

What about this signature?

        public static IEnumerable<TResult> XXXXX<TSource, TResult>(
            this IEnumerable<TSource> source,
            Func<TSource, TResult> elementSelector,
            Func<Exception, TResult> exceptionSelector,
            Func<Exception, bool> errorPredicate)

[Edited]

I was thinking about this operator name and the name Acquire explain great its behavior. In fact, the current Acquire operator can be written in terms of this new method, what demonstrate a near relationship between them. So I propose the above signature as a new overload of Acquire . This new overload is just a more tolerant version of the current Acquire, letting user decide which exception can be acceptable

Current Acquire written with this new overload:

public static TSource[] Acquire<TSource>(this IEnumerable<TSource> source)
    where TSource : IDisposable
{
    if (source == null) throw new ArgumentNullException(nameof(source));

    var disposables = new List<TSource>();

    return source.Acquire(
                         x => x, 
                          _ => default(TSource), 
                         ex => {
                             disposables.ForEach(d => d.Dispose());
                             return true;
                         })
                  .Pipe(x => disposables.Add(x))
                  .ToArray();
    }

@atifaziz I would like to submit a PR for this new Acquire overload. What do you think?

leandromoh commented 6 years ago

closed unintentionally.

leandromoh commented 6 years ago

@atifaziz What about this signature? I think it is okay with the items you pointed.

public static IEnumerable<TResult> Acquire<TSource, TResult>(this 
    IEnumerable<TSource> source,
    Func<TSource, TResult> elementSelector,
    Func<Exception, TResult> exceptionSelector,
    Func<Exception, bool> errorPredicate)

We can add the behavior that ignore/skip errors as a new operator implemented based on the above.

public static IEnumerable<T> IgnoreErrors<T>(this IEnumerable<T> source)
{
    return source.Acquire(x => (Item: x, Error: (Exception) null),
                          e => (Item: default, Error: e),
                          e => false)
                 .Where(t => t.Error == null)
                 .Select(t => t.Item);
}
leandromoh commented 6 years ago

@atifaziz How about the above comment?

atifaziz commented 6 years ago

This is still unaddressed:

  • It handles all exceptions; there's no way to say which ones should be ignored
leandromoh commented 6 years ago

There's a predicate where user say which ones should throw.

atifaziz commented 6 years ago

Right, but I meant in terms of exception types, like shown earlier:

public static IEnumerable<T> IgnoreErrors<T, TException1, TException2>(
    this IEnumerable<T> source,
    Func<TException1, bool> error1Predicate,
    Func<TException2, bool> error2Predicate)
    where TException1 : Exception
    where TException2 : Exception
    { /* ... */ }

So yes, while the predicate is there, your signature requires catching any type of exception, which is something I believe should be avoided. If the predicate is just handed the base Exception type then down-casting will be required if the predicate depends on some value on a specific exception type (like WebException.Status to illustrate one example). In the above case, each predicate will naturally deal with the exception type it is designed for.

leandromoh commented 6 years ago

I think maybe we can reach a consensus more easily if we separete the proposals. Let's discuss IgnoreErrors in another thread/issue and new Acquireoverload in this one. The proposal Acquire overload are okay for you? or your previous comment applies to it too?

atifaziz commented 6 years ago

I thought they were one and the same, all stemming from discussion of the original proposal? May be they are forks of different ideas? I'm confused as to where we are headed with this now. 😕

leandromoh commented 6 years ago

I am interested in the new Acquire overload proposed here. Acquire permissive and IgnoreErrors are different ideas. Let's discuss Acquire overload here,

atifaziz commented 6 years ago

Okay, the name needs some thought. Apart from that, I still find exceptions should be handled like proposed for IgnoreErrors.

leandromoh commented 6 years ago

your signature requires catching any type of exception, which is something I believe should be avoided

why?

I also think the order that exceptions will be catch is not obvious when we have a predicate for each exception type.

atifaziz commented 6 years ago

why?

Forcing the practice of catch-all exceptions when it can be avoided.

I also think the order that exceptions will be catch is not obvious when we have a predicate for each exception type.

It should be obvious that they are caught in order: TException1, TException2, etc. This would be no different than writing out the catch blocks yourself.

leandromoh commented 6 years ago

It should be obvious that they are caught in order: TException1, TException2, etc. This would be no different than writing out the catch blocks yourself.

I see. We can add a remark for this to avoid uncertainties.

Forcing the practice of catch-all exceptions when it can be avoided.

Make sense. We can follow this way, at all.

Any other point before I open the PR @atifaziz ?

atifaziz commented 6 years ago

Any other point before I open the PR @atifaziz ?

Name and signature, and I still can't say off the top of my head what this does. :grimacing:

leandromoh commented 6 years ago

I was thinking about operators design and thought this:

1) the objective of the first proposed method (one we are calling as Acquire) is acquire unconditionally all avaible elements of a sequence, that is, even If some MoveNext throws the iteration on source must continue. So API should be

static IEnumerable<TResult> Acquire<TSource, TResult>(this 
    IEnumerable<TSource> source,
    Func<TSource, TResult> elementSelector,
    Func<Exception, TResult> exceptionSelector)

2) the objective of the Second proposed method (one we are calling as IgnoreErrors) is ignore (some cases of) errors. So API should be:

static IEnumerable<T> IgnoreErrors<T, TException1, TException2>(
    this IEnumerable<T> source,
    Func<TException1, bool> error1Predicate,
    Func<TException2, bool> error2Predicate)
    where TException1 : Exception
    where TException2 : Exception
    { /* ... */ }

Item 1 or 2 makes sense to you @atifaziz?

leandromoh commented 6 years ago

Any thought about previous comment @atifaziz ?

leandromoh commented 6 years ago

I see that Rambda library have a function bit similar to item 1, called TryCatch (which is a good name at all): https://bitsrc.io/bit/ramda/utils/try-catch?version=1.0.0

atifaziz commented 6 years ago

Closing on the assumption that #515 supersedes this.