Closed siavash-babaei closed 3 years ago
I wrote the original version but I am no longer actively developing for F#. @tpetricek might be still involved.
We need an active maintainer. Any volunteers?
(Ping me directly on my email if needed - I don't always see notifications here)
Dear @hmansell, since it is now part of FsLab, @tpetricek should be involved although I suppose FsLab itself needs revamping perhaps as a whole.
Incidentally, a good, comprehensive FsLab environment would certainly make F# more readily competitive with the likes of Python and R. Even Julia has pulled ahead in data analytics in terms of capabilities in many senses, which is a pity for F#, the language being very mathematical at its core and brilliantly suited to everything data.
I am not sure if or how much the R and .NET APIs have changed, but I doubt by much. As far as I have seen as a user, R Core has not changed much on the face of it for many years, and .NET 4.0 code can be consumed in .NET 5.0 with minimal changes. Hopefully, updating it shouldn't be a major rework ... just the thought is encouraging!
Our very dear BDFL @dsyme: I would offer my help except I don't have the experience of maintaining repos. In other areas perhaps once it takes off ...
@siavash-babaei, I'm glad to see your enthusiasm. Please take a look at an old issue from 2018 discussing FsLab and data science using F# in general. https://github.com/fslaborg/FsLab/issues/137
A lot of progress has happened since then, especially in Jupyter notebooks through the dotnet/interactive kernel. I think interop with Python is in the pipeline according to some talks from Microsoft. I hope interop with R will come some day too. But that might be too big an ask of the Microsoft team.
About linear regression: a pull request was added to Deedle early this year to support some form of lm in R: https://github.com/fslaborg/Deedle/pull/496
Take a look at some testing samples https://github.com/fslaborg/Deedle/blob/master/tests/Deedle.Math.Tests/LinearRegression.fs
let actualCoeffs =
    LinearRegression.ols ["MSFT";"WMT"] "AES" true stockReturns
    |> LinearRegression.Fit.coefficients
Thanx @zyzhu. I doubt Microsoft would get involved in something like RProvider and I am not sure how simple interop with python would be helpful. I mean, for C/C++/Fortran, it makes sense to provide some simple interop so that you can switch and let that handle intensive bits of code, but python?! Not to mention that all the while RProvider was working just fine for a few years, no such TypeProvider for python really took off. Microsoft has already invested heavily in R gobbling up Revolution Analytics for a hefty price and rebranding it as Microsoft R distribution and adding the ability to directly script in R within SQL Server, before doing the same for python.
Now, through ML.NET, Math.NET, and Accord.NET, you get most of what you need from a Machine Learning perspective and they appear to be actively maintained. The problem with all, including the example included above by @zyzhu, is the awkwardness and verbosity.
Again, assuming that we have a data frame scores containing variables score, age, sex. In R, you would do:
model <- lm(data = scores, score ~ age * sex)
and then, from this model object, you can extract whatever you need, including statistics, coefficients and confidence intervals, error estimates, etc., even diagnostic plots, with some pretty intuitive names.
To me, doing the same thing as above and almost perfect in F# would go like:
let model =
    let data = scores
    let response = [ "score" ]
    let predictors = [ "age"; "sex" ]
    (data, response, predictors)
    |> linearModel ModelType.OLS CrossEffects.Multiplicative
with the model object perhaps being a record type with fields corresponding to the coefficients table, error estimates, basic statistics, etc.
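To make the record idea a bit more concrete, here is a minimal sketch of what such a result type could look like. Everything in it is hypothetical: the names Coefficient, LinearModel and simpleOls do not exist in Deedle, Math.NET or RProvider, the fit handles only a single predictor using plain F# lists, and the numbers are made-up toy data rather than anything from a real data set.

```fsharp
// Hypothetical sketch only: these types and functions are NOT part of any library.

/// One row of the coefficients table.
type Coefficient =
    { Term: string
      Estimate: float
      StdError: float }

/// The "model object": a plain record holding what R's lm() would let you extract.
type LinearModel =
    { Coefficients: Coefficient list
      ResidualStdError: float
      RSquared: float }

/// Ordinary least squares for a single predictor (illustration only).
let simpleOls (name: string) (x: float list) (y: float list) : LinearModel =
    let n = float x.Length
    let meanX = List.average x
    let meanY = List.average y
    let pairs = List.zip x y
    // Sums of squares and cross-products around the means.
    let sxx = x |> List.sumBy (fun xi -> (xi - meanX) ** 2.0)
    let sxy = pairs |> List.sumBy (fun (xi, yi) -> (xi - meanX) * (yi - meanY))
    let slope = sxy / sxx
    let intercept = meanY - slope * meanX
    // Residual and total sums of squares.
    let rss = pairs |> List.sumBy (fun (xi, yi) -> (yi - intercept - slope * xi) ** 2.0)
    let tss = y |> List.sumBy (fun yi -> (yi - meanY) ** 2.0)
    // Residual variance with n - 2 degrees of freedom.
    let sigma2 = rss / (n - 2.0)
    let seIntercept = sqrt (sigma2 * (1.0 / n + meanX * meanX / sxx))
    let seSlope = sqrt (sigma2 / sxx)
    { Coefficients =
        [ { Term = "(Intercept)"; Estimate = intercept; StdError = seIntercept }
          { Term = name; Estimate = slope; StdError = seSlope } ]
      ResidualStdError = sqrt sigma2
      RSquared = 1.0 - rss / tss }

// Toy data, invented for the example.
let model = simpleOls "age" [ 1.0; 2.0; 3.0; 4.0 ] [ 2.1; 3.9; 6.2; 7.8 ]

model.Coefficients
|> List.iter (fun c -> printfn "%s: %.3f (se %.3f)" c.Term c.Estimate c.StdError)
```

With a record like this, model.Coefficients, model.RSquared and so on read about as naturally as the accessors in R; a real implementation would of course need multiple predictors, interaction terms and data frame input.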
Looking at it from a business perspective: F# has been primarily a Windows thing up to now. Even though it is open source, it was not properly supported on Linux, where a lot of the open-source community resides. With .NET 5.0 and F# 5.0, things have changed and .NET is now properly multiplatform, although tooling on Linux could, I suppose, still go some way. So it is almost like a new start, with the opportunity to expand both the language and the user base.
Something worth noting is the economic principle of competitive advantage: basically, how entities from nations to corporations to life itself stick to their strengths to survive and grow. Take WebSharper or SAFE Stack: absolutely necessary for a modern language, but have they really made a dent in the current market? I don't think even TypeScript is making any significant headway in attracting JavaScript users or new ones. In my opinion, any product requires a few killer features that make it indispensable, and for F# that could easily be the entire data analytics and data science workload, the same thing that greatly helped propel Python to the front. That user base, being more mathematically inclined and comfortable with the syntax (I just adore it, though I don't know why it makes lots of people uncomfortable), with ideas of immutability, and with the core of the language being input -> function -> output, would be much better adopters than, say, developers active in GUI or web.
There are other areas, I am sure: for example, business applications that fit nicely with Domain-Driven Design. But data science workloads (incidentally, a perfect match for DDD) are certainly worth the investment, especially as they seem to be growing exponentially in both volume and utilisation. If you think about it, one of the most active open-source big data projects, Spark, is only 7 years old. The community seems to be quite accepting of new tech that makes their life easier.
There are many questions being discussed here. Let's just deal with the question of FsLab and its pieces.
Here are my opinions:
The whole idea of a curated, unified collection like FsLab has turned out to be suspect as it doesn't really allow for change, evolution and deprecation unless very actively curated.
The curation stopped because FsLab as a collection was based on .NET Framework, and some parts of the collection suffered badly in the transition to .NET Core. It only took one part to be still stuck on Mono or .NET Framework to render the whole thing stuck. That's what happened.
With Mono out of the loop, things are easier once we establish a reasonable landing point.
MSFT is active in the parts of FsLab we directly care about: XPlot, fsdocs (which now generates .NET Interactive notebooks), .NET Interactive, F# literate scripting. It is also active in many related technologies. Other companies also contribute.
Reorienting to join forces with SciSharp, .net interactive and similar seems much more practical.
FsLab certainly needs to be taken down and/or revamped on .NET Core only and/or wound up as a "one-stop shop technology". That will create space for better approaches I think. I'm open to suggestions but we need to rethink things.
Note I'm not interested in discussing this from a "future of F#" perspective (this has nothing to do with F# and web programming, for example) but rather just practical steps to get things cleaned up on a good, sustainable, coherent basis going forward.
I added some notes and thoughts that seemed more appropriate to FsLab as a whole in https://github.com/fslaborg/FsLab/issues/137. I hope they are helpful, certainly don't mean to be criticising or anything ....
Cool let's discuss in https://github.com/fslaborg/FsLab/issues/137
Guidance for Newbies:
Suppose a person with decent working knowledge of both R and F# wants to restart this RProvider project. What steps should be taken, and what should be learnt, before attempting to update/fix it so it works with, say, the latest version of Microsoft R Open as a stable LTS version? I checked an intro to type provider design on MSDN, but the examples didn't make much sense regarding interop with a different language.
I am wondering what has fundamentally changed since R 3.4 such that RProvider no longer works after that version. Is it a matter of updating the libraries and packages used from .NET 4 to .NET 6, or has something in the R API completely changed? Given that R is an almost 40-year-old language that has not changed much at its core, at least as far as users are concerned …
@siavash-babaei I was the original author but haven't kept up with developments in the .NET community the last few years. Hopefully the following will be helpful:
I'm going to close this issue, as we have just released the v2.0.0-beta NuGet package. Hopefully this should address the issues raised in this thread. Also see #218 for discussion about project maintenance and contribution guidelines. Thanks!
Hi,
With many thanx to the authors and maintainers for such a brilliant feature. It seems this TypeProvider has not really worked since R 3.5.
Plenty Appreciated ..., Cheerio