fsprojects / FSharpx.Collections

FSharpx.Collections is a collection of datastructures for use with F# and C#.
http://fsprojects.github.io/FSharpx.Collections/
Apache License 2.0
247 stars 78 forks source link

Implement zip on NonEmptyList #40

Closed ErikSchierboom closed 9 years ago

ErikSchierboom commented 9 years ago

The NonEmptyList type currently does not have a zip implementation. This comes in handy quite often.

lulu-berlin commented 9 years ago

Hi. I've made a pull request implementing zip.

I'm wondering, wouldn't it be also nice to have some more conversion functions, such as ofSeq, ofList, ofArray; maybe also toSeq (besides the implicit cast)?

mausch commented 9 years ago

About ofSeq, ofList, ofArray : the only way to implement them safely would be as e.g. 'a list -> 'a NonEmptyList option. toSeq (and also toArray and toList) are already available.

lulu-berlin commented 9 years ago

Hi @mausch,

I've responded to your review of my pull request there.

I mean something like:

let l = [1;2;3]

let nel = NonEmptyList.ofList l

rather than:

let nel = NonEmptyList.create (List.head l) (List.tail l)

(Note that the second, verbose version will raise an exception just as well if the list is empty.)

mausch commented 9 years ago

The way I recommend to create a NonEmptyList from a list is:

let nel =
  match l with
  | x::xs -> NonEmptyList.create x xs
  | [] -> // handle that however you like

Which is basically equivalent to 'a list -> 'a NonEmptyList option`. That is, it's the client code who controls explicitly whether to throw or not, and this forces you to consider what happens if the source list is empty.

These partial functions are the cause of lots of bugs... Many people even suggest removing them altogether from standard libraries. [1] [2] [3]

Of course, as you say F# and even FSharpx already has lots of partial functions. But shouldn't we try to avoid adding even more, especially to a foundational library like FSharpx?

lulu-berlin commented 9 years ago

I share the sentiment against partial functions. It's just a matter of being consistent with the standard, but leaving out such functions is a valid option. I do think that functions that return an option should be named differently (like safeHead in the first link you gave). So in this case, I guess something like tryGetList.

But this thread started with zip, which is a partial function (unlike Seq.zip). It seems that following your logic, we can either:

I think that the choice here will also reflect on the other functions that were mentioned.

lulu-berlin commented 9 years ago

Thank you, @forki, for merging the pull request. Now, what about the conversation I had with @mausch?

There's the question of ofSeq, ofList and ofArray: should they not be implemented (because the would be partial functions), be implemented returning an option type, and in this case, should they be called tryGetSeq, tryGetList and tryGetArray?

This can get into a new issue, but it seems to me that the a similar logic should be applied to zip. In other words, should zip return an option type, or alternatively, be implemented like Seq.zip (perhaps actually calling it), zipping only as many items as the shortest NonEmptyList?

For compatibility, the two solutions can be merged: with zip zipping items until any of the two lists is exhausted and tryZip assuming the two lists have equal lengths and returning an option type.

mausch commented 9 years ago

Let's just follow the pattern set by FSharp.Core. Which means that this is ok as zip with this signature, just like List.zip. Seq.zip is different because it's lazy, which is not the case of NonEmptyList, which is just a special case of a regular list. Constructors returning options are not worth it IMHO, instead I'd just pattern match on the original collection and check if it's empty. Also after our last discussion I think it's too late for FSharpx to try to be free of partial functions. It was never an explicit goal anyway. I'd love to see such a library (I've fantasized about rewriting LINQ without any partial functions, TTBOMK there is nothing like that yet), but FSharpx is not it.

lulu-berlin commented 9 years ago

OK. So we'll leave it at that. Thanks for your answer.

I'd also be happy to have such a library, if you ever come around to it.

mausch commented 9 years ago

Bottom line: I think it's "ok" to have partial ofXXX constructors in FSharpx.

lulu-berlin commented 9 years ago

Oh, I thought you wanted to pass the burden of writing them to the client. So, shall I implement the partial ones for NonEmptyList?

mausch commented 9 years ago

Yes

gusty commented 9 years ago

A bit off topic, I know ;)

Just wanted to point out that total functions returning options have also their limitation in real world scenarios and now I notice they are starting to populate the F# core library, typically named tryFunctionName and also there are many already in the .NET framework returning an output parameter which in the end is isomorphic to the option type.

The problem is that they tell you that something went wrong but not what, whereas those that fail tell you exactly what and where the problem was.

A better approach would be to return a Choice<'t, string>, the other two functions can be derived from this one.

I worked in a project where we started with exceptions but because the calculations were very likely to fail we moved to options but the users wanted to know what was wrong, so we went for the Choice<'t, Exception> approach. The Exception type has even more information than the string.

Fortunately I see also this pattern emerging in F# code and the convention seems to be to use a type alias of Choice as Result of Success or Failure, though no convention on the name of the functions (would be nice to have one).

lulu-berlin commented 9 years ago

Interesting point.

Well, I was looking again at Scott Wlaschin's "railway oriented programming" that you seem to be tacitly referring to. At first I wanted to say that with the railway approach you would name the functions normally and treat the Choice<'TSuccess, 'TFailure> (where 'TFailure can be a string, an exception or what-have-you) as the functional way for exception handling. But then, at a closer look, I noticed that all the functions he was talking about are "validateThisOrThat".

So basically, a canonical Success/Failure function (trying to do something that might fail) is a "validateF", whereas any other function that is prone to raise an exception (coming from the impure non-functional world) can be called with tryCatch, i.e.,

let tryCatch f = try f |> Success with ex -> Failure ex

used as an adapter.

I guess that in the parsing example, you could have a validateStringToParse function that returns a Success stringToParse or a Failure ExceptionWithInformation, that would be followed by the actual parse only on success using the railway method. If parse is still exception-prone for non-validatable reasons (stack overflow, the zombie apocalypse), it could still be called with tryCatch as a last resort.

So there seems to be a naming convention emerging here: "try-" = Option, "validate-" = Choice<'TSuccess, 'TFailure>, and no prefix is often a sign of a partial function that would require wrapping.

gusty commented 9 years ago

Indeed I wasn't referring to that article, but just had a look and it seems to be aligned with my thoughts. Also found interesting there the discussion in the comments about using a DU for the failure and then transforming it as it travels different layers.

But I think the validate convention is a particular case. In my case I was working with math functions which calculate-or-fail but they didn't validate anything at all.

Take for instance the usual example of the total division, in the Haskell community seems to be an implicit agreement in using the word 'safe', here in .NET probably would be named tryDivide but then would be expected to return an option, so how would you call the (most interesting) version that returns a wrapped error in a Result?

validateDivide, tryDivide', attemptDivide, safeDivide, divideOrException?

Hopefully I'm wrong but I think there is no naming convention at all and I think is unfortunate that in .NET world the tryF naming convention is being used for the (less interesting) option resulting functions since try is more linked to exceptions than options.

lulu-berlin commented 9 years ago

Well, a division has only one possible way to fail, right? What additional information will you gain from the exception? In this case, I don't really see the added value of not using an option type (and calling it tryDivide), or just manually matching the divisor for zero to handle this case.

I'm wondering what kind of math functions may fail upon calculation in different ways, where the exceptions bear interesting information. Could you give an example?

gusty commented 9 years ago

Hmm the division operator, as defined in C#/F# yes, but that's not a true division, it's rather an arbitrary mix of 2 different functions depending on which type operates. That's an interesting discussion (in Python they did the same mistake) but to name the inconsistencies:

The true division also exists for integers and it's closed only on those values where the dividend is multiple of the divisor.

To continue with the inconsistencies just want to mention that the modulo operator acts as integer modulo also for floats as if the division there was the Euclidean instead of the true division (???)

Anyway, it was just a trivial example but to answer your question I was using functions that might fail in different ways, for example a derivative computed by Newton-Raphson can return a value, diverge, not converge after maxIterations or eventually reach the limits of the type precision (0 or infinite).

Still in the case of a function with a single point of failure (as the funny (/) operator or tryHead) when combined with other functions that may fail you will end up always adapting the results in order to produce the right type and inject the error message but let's face it, after doing this several times you will create your own version of the function.

In the end what we will be doing is making an effect (in this case the exception) explicit in the type. F# and .NET Guidelines suggest that you should do this when the Exception is not really exceptional but where's the limit? In my case for sure it was not exceptional, less than 90% of the user calculations produced a value everyday because of numerical reasons. And what about the case discussed here: the empty list, is it exceptional? Do the Guidelines suggest to use partial functions? I'm afraid the answer is yes.

Finally, I don't think the try functions returning options are useless, they make sense when the action is to do something else but not to abort reporting an error. I'm sure after some time when try functions are everywhere we will see more ... (what's the convention?) ... Choice-functions emerging.

mausch commented 9 years ago

I too prefer Choice to Options in my app-level code. I tend to return Choice values when the function can have multiple reasons to fail... which is not the case of a function 'a list -> 'a NonEmptyList option.

gusty commented 9 years ago

Yes, I totally agree, but my point was more in the scenario of a base library than in an application. I forgot to mention that the function I gave as an example is a higher order function, so there is another point of failure which is when the function passed as parameter fails, one of those failures typically was a Divide by 0. So if the inner function returns None, you think is fine because you know the only possible cause but the higher order function doesn't because it depends on which type of function was passed, it might be a division, a sqrt (which fails on negatives) or something else.

lulu-berlin commented 9 years ago

In the comments of the "railway" article, Scott Wlachsin suggests using many different DUs for the different use-cases and transforming them as they travel along. This design (if you think it makes sense) means that there is no one-size-fits-all return type. Even if a library sets a DU describing the possible failures, it would have to be transformed when consumed by the client code into the DUs used by the client in that domain. It seems like going a full circle: wouldn't it make sense to simply let the library functions raise exceptions and then catch and wrap them in the client code into the proper case of the client's DU? Unless, of course, there's only one possible way a function may fail, in which case an option type is standard enough.