louthy / language-ext

C# functional language extensions - a base class library for functional programming
MIT License
6.46k stars 417 forks source link

A chance to collaborate? #9

Closed johtela closed 9 years ago

johtela commented 9 years ago

Hi Paul,

I have been also writing a library that brings many functional concepts to C#. It is called Flop (Functional Library for Object Programs) https://github.com/johtela/Flop. Like yourself I have drawn inspiration from Haskell and F# and implemented some of their core libraries in C#. I set out to write a full functional language in .NET but ended up writing a library instead :-). The library is not using C# 6.0, but compiles with .NET 4.0, so the code is not syntactically as close to functional languages as in language-ext.

There are quite a lot of similarities to language-ext though, like Option, Either and Unit types, but I haven't polished them as far as you have. Instead, I have been focusing on implementing immutable data structures like list (both strict and lazy), set, map, and sequence, which is an adaptation of the Haskell collection type with the same name. It is implemented as a finger tree which is quite versatile data structure.

In addition to collections, there is an implementation of Parsec style parser combinator. The difference to your implementation is that the input type is also parameterized. It works on top of a stream abstraction allowing thus to parse arbitrary data like binary or tokenized inputs. Of course, the monad abstraction is implemented as Linq extension methods, so that parsers can be constructed with Linq expressions.

Lastly I have implemented property based testing library a'la QuickCheck. This library has the same concepts like generators for random data, Arbitrary type which allows composing complex test data from simpler ones, and the testable properties themselves. The properties are also monads in Haskell which means that we can again use the syntactic sugaring that Linq provides to build them.

I have been on and off with my project for over a year now, and would like to make something truly usable out of it. I see that we have the same goal to write clean functional code in C#, so I thought it would make sense to merge the libraries into one comprehensive "Prelude" library. It could be a toolkit that would make it little easier to sneak in functional concepts to mainstream C# programming. There are lot of good ideas to steal from there, and I think the C# language is kind of converging into a functional language anyway. I see it being kind of hybrid language like Scala that has the OO heritage, but tries to root out the bad habits of imperative programming.

I started to write documentation for the library (https://github.com/johtela/Flop/blob/master/Docs/Introduction.md) but I have only come up with couple of chapters. As far as hobbies go, documenting code is not my favorite one :-) Having said that, there are a lot of really interesting papers that have guided my implementation, and I would be happy to write about the inner workings of finger trees, parser combinators, and property based testing. These were really enjoyable and challenging to write, and you can learn a thing or two just by walking through their implementation.

I was also planning to check out C# 6.0 and refactor the API using the new features. I was waiting for the official launch, since I had bad experiences of uninstalling preview versions of Visual Studio in the past. But this is definitively in my todo list.

Let me know, if you are interested in collaborating, or if you have other plans/visions for language-ext.

Regards, Tommi

louthy commented 9 years ago

Hi Tommi,

I'm open to the idea of collaboration for sure, but I'm not particularly up for merging projects. Let me give you a sense of where I'm at with this and hopefully you'll understand why that is.

About a year ago I wrote the csharp-monad library (which you've seen the repo for on this github account). It started off with similar goals to you of bringing a lot of the Haskell and F# loveliness to C#. I had this utopia in mind where I could write everything as LINQ statements and static functions, and make everything lazy like Haskell.

The reality was slightly more awkward to use and I felt not worth the effort the majority of the time. So whenever possible when I am writing new fresh code that can take a radically different approach, I use F#.

So what's this library for? Well, I am currently the CTO of a software company in London, and we have something in excess of 5 million lines of code in our primary project. It's a web-app that's been built upon for nearly 10 years. It's usually very stable, and is pretty well written. However there are classes of issues which crop up time and time again (as mentioned in the README.md).

It's not possible to go in and refactor everything into a perfect LINQ statement with immutable everything, we have to be pragmatic. But I do still want to shut those bugs down, and I suspect plenty of other people do too. And that's what this library is for. That pragmatic middle ground. That's the reason I put very little emphasis on using Option<T> in a LINQ expression, and instead highlighted its use for taming the output of a method. So this library is:

So bearing all that in mind I took a look at your library (excellent work btw!) to see if there was anything that would cross-over. As you say you've focussed on the immutable data structures; but I feel those data structures fall into the 'not needed in this lib' area, especially as it seems the Immutable Collections library is heading straight for the BCL and lazy sequences can be implemented using yield. It's definitely fun to write that stuff (Here's a toy List implementation I did whilst stuck on a train for a few hours, where List<T> is a delegate: https://gist.github.com/louthy/6fee49d3b21f619a8ead), but I feel it falls into the idealist functional programmer's toolkit rather than the pragmatic one I'm trying to build. I think the parser monad falls into that area too.

With regards to your C# QuickCheck implementation. Have you thought about breaking that out into a library on its own? It seems like it could be a very valuable tool that many wouldn't want mixed up with a general framework (or might be missed).

So, I hope you understand my position. This isn't closing the door, as I say I'm open for collaboration. I just want to make sure it's focussed on solving problems in C# rather than chasing a fundamentalist approach. Obviously if there's anything in this lib that you'd like to use in Flop, feel free :)

johtela commented 9 years ago

Hi Paul,

Thanks for the reply. I understand your position perfectly. If you are not aiming at building a comprehensive functional library, but only tackling the specific problems you mentioned, merging the libraries is not a good idea.

However, I bit disagree about the point you made about the practicality of the "fundamentalist" approach in general C# programming. Retrofitting immutable data structures and monads into a legacy OO project does not maybe make much sense. But in a greenfield project the situation is different. Many people have been exposed to a functional programming lately through F#, Haskell, Scala, Clojure, etc. and they see that C# actually has many functional concepts found in these languages. They would like to program C# in a functional way as well.

The problem is that it is not idiomatic to program C# in a functional way. Many APIs of .the NET are hopelessly imperative making it quite painful to incorporate them in functional code. Tuple and Lazy types are good examples; they implement the basic concepts but with a clumsy syntax, and are used little in other places of the .NET framework. Instead of returning tuples, methods in .NET 4.5 still use the hideous out parameter convention. See the TryGetValue() method in the IImmutableDictionary interface, for example.

This is one of the reasons, I am not too happy with the System.Collection.Immutable library. It does a semi-decent job implementing the purely functional data structures, but the abstractions and APIs are quite lacking. IEnumerable too, although prevalent and very useful, is still very much an imperative abstraction. My ideal framework for C# would contain not only the data structures but also the abstractions and idioms for writing them. Clojure does a very good job in the JVM world defining a bunch functional abstractions and hiding the imperative Jave framework.

I guess my ultimate goal would be to write a library that would be the jQuery of C#. That is, a library that would change the way people write C#. JavaScript programming is nowadays very much synonymous to programming with jQuery. The library has effectively patched a slightly flawed imperative language into something that looks and feels much nicer.

Language-ext in many ways does exactly that as well: it makes C# code look and feel a bit more elegant and cleaner. If .NET framework was perfect already, so many people would not be writing these kind of libraries as you and I do :-)

I also disagree about parser combinators not being practical. I have used them in couple of real projects and to me they are much easier to use than writing a custom top-down parser, or using a Yacc style parser generator. Since parser combinators support only LL(1) grammars with no backtracking, they are very fast too. This is also their only weakness; converting a complex grammar into LL(1) is not always straightforward.

I admit that it is easy to go overboard with Linq expressions and sometimes you can write clearer code with plain old statements. But the composabilty of monadic expressions and the fact that you can really build your code like lego blocks make it very hard not to overuse them. You could call this the Haskell syndrome :-)

Keep up the good work!

Regards, Tommi

louthy commented 9 years ago

but the abstractions and APIs are quite lacking. IEnumerable too, although prevalent and very useful, is still very much an imperative abstraction. My ideal framework for C# would contain not only the data structures but also the abstractions and idioms for writing them. Clojure does a very good job in the JVM world defining a bunch functional abstractions and hiding the imperative Java framework.

Definitely interested in some specifics here if you don't mind? I haven't spent any time with Clojure (other than watching Hickey's excellent talks). Transducers look very interesting, but beyond that, what is there about the IEnumerable abstraction/API (with LINQ) that doesn't work for you? The API is mostly renamed versions of accepted functional names (Select = map, Where = filter, Aggregate = fold etc.). Obviously the structures of lists/seqs in general are different, the cons cell isn't the same. But that would mostly lead to stack overflows if you were to follow that approach in C# (mainly talking about user code that consumes head/tail recursively). So, yeah, curious? :)

I also disagree about parser combinators not being practical.

Sorry, if I gave the impression that I think they're impractical. I think parser combinators are awesome, I just don't think they should be a core framework component, that's all. I may reconsider that later, at least for the core monad implementation itself. But I was trying not to get too bogged down in the monad baggage early on. I think although monads have gotten a lot of attention over the past few years, mostly C# programmers aren't crying out for them. Without higher-kinded types they're just not quite as generally powerful as in Haskell (as well as lots of lovely operators for general case processing of monadic values). Parser combinators are probably the best example where monads are obviously a better choice than writing idiomatic C#.

Language-ext in many ways does exactly that as well: it makes C# code look and feel a bit more elegant and cleaner.

I think this is probably a lot to do with my early goals. Reduce common issues and make C# more elegant. In addition to that, the actor/process library I'm currently adding is to understand that applications have state, so why not put the state into little functional bubbles. So actors are essentially message/state fold operations. It may even be possible to treat the message processing loop as an Expression and deploy remotely. I'm hoping to bring some of Alan Kay's original ideas for OO (message passing) together with core functional components.

Finally I'd like to write LanguageExt.FSharp which enables easy interop between C# and F# (and the language-ext types, like Option, that are in F# too). I already have a number of ideas in this area.

Anyway, interested in your thoughts? :)

Paul (PS: re-opening this, cos I'm not sure if you'll get the notification if it remains closed. Feel free to close it again if you like)

johtela commented 9 years ago

Let me try to sum up what I mean by not being happy about the abstractions. First let's take the IEnumerable interface. The good thing about it is that the interface is supported by practically all traversable structures. That fact makes Linq as useful as it is. This is great, and this is the reason why I also support IEnumerable in my collections.

The abstraction actually consists of two interfaces: IEnumerable and IEnumerator. The latter is a stateful pointer or finger to your data structure, and the former is a dispenser for these pointers. This means that you cannot pass IEnumerators outside your loop structure, because that would most probably mess up your traversal. You can only pass IEnumerables, which will create new enumerators at request.

The downside of this implementation is that IEnumerable is not always the most efficient way to traverse a collection. It can be even less efficient when manipulating a collection with Linq operations like Select, Where, SelectMany, etc. These are implemented in Linq by creating a promise or lazy computation, which when evaluated will apply the computation and traverse the underlying collection.

Assume for examle, that you would like to skip first 100 items in a collection and pass the tail to a function that will use it repeatedly. You can certainly do this by passing coll.Skip(100) to the function. But every time the function loops through the collection, the loop will also step through the first 100 items.

You can fix this by copying the collection to an intermediate data structruce (with ToArray(), for example) before passing it to the function, but sometimes you don't know what the function you call does, so there is no way to guess what kind of performance issues there might be.

Functional collections like list and tree are based on a recursive data structure. They rely on structural sharing to allow efficient implementations. Better abstraction for these kind of collections is something like Seq interface in clojure. That contains the canonical list operations: head, tail, and cons. In Flop library this abstraction is implemented by two interfaces, IStream and IStreamBuilder. The first one is a read-only interface that has the getters for head (First) and tail (Rest), and the second one is the interface used to construct new streams.

You can define a lot of operations using these interfaces, and there is no need to wrap any computations to implement this inteface. The guideline is that all the operations should have (amortized) O(1) implementations. If this is not possible, a data structure should not implement this interface.

This brings me to the next important abstraction: the IReducible interface. It abstracts the fold/reduce/aggregate operation that is found in practically all functional languages. It is basically the dual opposite of IEnumerable. You can do everything with IReducible that you can do with IEnumerable, and even more. You can traverse a collection front-to-back (ReduceLeft()) or back-to-front (ReduceRight()). But instead of being lazy like IEnumerable, IReducible is very strict. It always evaluates the whole collection. Instead of being pull-based, it takes a delegate as parameter to which values are pushed.

It is actually much efficient to traverse tree structures with IReducible than with IEnumerable. You don't need to maintain any intermedate data structure (stack or queue) to remember where you are in the tree. The implementation of the function is trivial for immutable data structures, and it is usually the most efficient way to traverse them. I think the latest innovation of Clojure you mentioned, namely transducers, are actually based on the idea of using reduce as the abstraction to manipulate data structures and with network data streams using the same exact implementation. I cannot say that I fully understand the concept, but that is my impression anyway.

The rest of the immutable interfaces found in System.Collections.Immutable like IImmutableList, IImmutableSet, and IImmutableDictionary are just slightly modified versions of their mutable counterparts. There are not suited at all to build higher level operations and abstractions on top of them. I see very little practical use with those.

I should have probably explained all this in the documentation of Flop, but as I said, I am too lazy to write documentation when I can write code instead. Although in a functional setting laziness is a virtue :-) I will try to explain in the documentation why I think these abstractions are important and how to use them in real, practical code. I would also like to incorporate parallellism to these concepts. If you can make the assumption that the code manipulating the collections is side-effect free, you can automatically parallellize some of the operations making them even faster.

Hope this rambling explains my points a bit. Regarding parser combinators, I was thinking of writing an exmple parser that would parse JSON data. I know I would be reinventing the wheel there, but that would be a good example to show how concisely you can build a parser.

Regards, Tommi

louthy commented 9 years ago

I'd like to take this discussion up more, but I'm currently juggling about 4 different projects (bizarrely two of the projects are very related to this discussion... one is using parser combinators to do super-powered string interpolation and another is an data-store indexing solution where I've been implementing a few collection types using AVL trees) so I don't really have time to go as in-depth as I'd like. If you don't mind I'll try and get back to you next week.

Paul