Open sjwhitworth opened 10 years ago
@e-dard @Sentimentron
At a minimum, I'd expect
I've got discretisation and random forests working, and I'm working on ID3 now.
I think basic linear models are required: logistic regression, linear regression. SVM integration would be great (w/ libsvm). Cross validation is also essential.
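For anyone wondering what the cross-validation piece might boil down to, here is a minimal sketch of k-fold index splitting in plain Go. It is not GoLearn's API: `kFoldIndices` and `trainTestSplit` are hypothetical names made up for illustration, and the sketch assumes the rows have already been shuffled, since the folds are contiguous.

```go
package main

import "fmt"

// kFoldIndices splits the row indices 0..n-1 into k contiguous folds whose
// sizes differ by at most one. It returns nil when k < 1 or k > n.
// Hypothetical helper for illustration, not part of GoLearn.
func kFoldIndices(n, k int) [][]int {
	if k < 1 || k > n {
		return nil
	}
	folds := make([][]int, k)
	start := 0
	for i := 0; i < k; i++ {
		size := n / k
		if i < n%k {
			size++ // spread the remainder over the first folds
		}
		fold := make([]int, size)
		for j := range fold {
			fold[j] = start + j
		}
		folds[i] = fold
		start += size
	}
	return folds
}

// trainTestSplit holds out one fold as the test set and concatenates the
// remaining folds into the training set.
func trainTestSplit(folds [][]int, held int) (train, test []int) {
	for i, f := range folds {
		if i == held {
			test = append(test, f...)
		} else {
			train = append(train, f...)
		}
	}
	return train, test
}

func main() {
	folds := kFoldIndices(10, 3)
	for i := range folds {
		train, test := trainTestSplit(folds, i)
		fmt.Println("fold", i, "train", train, "test", test)
	}
}
```

A model would then be fitted on `train` and scored on `test` once per fold, with the scores averaged at the end.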
@Sentimentron I'm not sure in what case we will need to use discretization?
ID3, as an example, only works on categorical attributes (C4.5 relaxes this restriction but it's more complex to implement). Similarly, you have to use Gaussian Naive Bayes if you want to handle continuous attributes (its underlying assumption - that continuous attributes are normally distributed - is not always true).
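To make the discretisation point concrete, here is a rough sketch of equal-width binning, one of the simplest ways to turn a continuous attribute into a categorical one that ID3 can split on. The function name `equalWidthBins` is made up for this example, and this is not necessarily how the discretisation in the library works.

```go
package main

import "fmt"

// equalWidthBins maps each value to a bin index in [0, bins-1] by dividing
// the observed range [min, max] into `bins` intervals of equal width.
// If the input is empty, bins < 1, or all values are equal, every index is 0.
// Hypothetical helper for illustration only.
func equalWidthBins(values []float64, bins int) []int {
	out := make([]int, len(values))
	if len(values) == 0 || bins < 1 {
		return out
	}
	lo, hi := values[0], values[0]
	for _, v := range values {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	if hi == lo {
		return out // no spread, so everything falls into bin 0
	}
	width := (hi - lo) / float64(bins)
	for i, v := range values {
		b := int((v - lo) / width)
		if b >= bins {
			b = bins - 1 // the maximum value belongs to the last bin
		}
		out[i] = b
	}
	return out
}

func main() {
	values := []float64{0, 1, 2.5, 5, 7.5, 10}
	fmt.Println(equalWidthBins(values, 4))
}
```

Equal-width bins are sensitive to outliers, so equal-frequency or entropy-based binning may be a better fit depending on the data.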
I think, rather than focussing on the features the library needs to reach a specific bar, it's healthier to simply rank the features we want in the order we intend to tackle them.
I have an old naive Bayes implementation in Python I could port over as a first step. Could also look at implementing GNB if people think it's important after that.
One class of algorithms that is missing, and which I have a few Go implementations of, is multi-armed bandits, a very useful reinforcement learning technique. I'd be happy to port these into the library.
@Sentimentron: Random forests would be great. I think that someone had already started to implement Naive Bayes.
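For readers who haven't met bandits before, the sketch below shows epsilon-greedy, probably the simplest strategy in that family. It is only an illustration under my own assumptions: the `EpsilonGreedy` type and its methods are hypothetical and are not the implementations offered above.

```go
package main

import (
	"fmt"
	"math/rand"
)

// EpsilonGreedy explores a random arm with probability Epsilon and otherwise
// exploits the arm with the highest estimated mean reward.
// Hypothetical type for illustration only.
type EpsilonGreedy struct {
	Epsilon float64
	Counts  []int     // number of pulls per arm
	Values  []float64 // running mean reward per arm
	rng     *rand.Rand
}

// NewEpsilonGreedy creates a bandit with the given number of arms.
func NewEpsilonGreedy(arms int, epsilon float64, seed int64) *EpsilonGreedy {
	return &EpsilonGreedy{
		Epsilon: epsilon,
		Counts:  make([]int, arms),
		Values:  make([]float64, arms),
		rng:     rand.New(rand.NewSource(seed)),
	}
}

// SelectArm returns the index of the arm to pull next.
func (b *EpsilonGreedy) SelectArm() int {
	if b.rng.Float64() < b.Epsilon {
		return b.rng.Intn(len(b.Values)) // explore
	}
	best := 0
	for i, v := range b.Values {
		if v > b.Values[best] {
			best = i
		}
	}
	return best // exploit
}

// Update folds an observed reward into the running mean for the arm.
func (b *EpsilonGreedy) Update(arm int, reward float64) {
	b.Counts[arm]++
	n := float64(b.Counts[arm])
	b.Values[arm] += (reward - b.Values[arm]) / n
}

func main() {
	// Simulated Bernoulli arms; the payout probabilities are made up.
	probs := []float64{0.2, 0.5, 0.8}
	bandit := NewEpsilonGreedy(len(probs), 0.1, 42)
	sim := rand.New(rand.NewSource(7))
	for i := 0; i < 1000; i++ {
		arm := bandit.SelectArm()
		reward := 0.0
		if sim.Float64() < probs[arm] {
			reward = 1.0
		}
		bandit.Update(arm, reward)
	}
	fmt.Println("pulls per arm:", bandit.Counts)
	fmt.Println("estimated means:", bandit.Values)
}
```

The incremental mean in `Update` avoids storing the reward history, which keeps the state small enough to persist between requests.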
@e-dard: Agreed. I just think it's useful to have some idea of 'minimal stable functionality' before we start promoting it more widely.
Is someone working on naive Bayes? I didn't see anything explicit in the issues list. I was working on a port of my Python implementation.
This is what I've seen so far, but it seems pretty nascent. https://github.com/tncardoso/golearn/tree/feature/naive/naive
Maybe it would be good to sync up with him.
Any more thoughts? I think:
would be a great first start.
So we now have:
Just leaving
Is the end of June a good target?
That sounds good to me. Logistic regression should be ready to merge after @npbool makes some changes. That only leaves linear regression.
I think we've actually merged everything in that list.
Reckon we're ready to go for a first proper release? Brilliant work @Sentimentron + all.
Are we going to tag before or after #62?
I don't mind. It all looked good to me.
Hi everyone. I'd like to formalise what features we want for a V.01 release. By this I mean the first version of GoLearn that is nearly ready for production use externally. We'll learn much more when it's in the hands of users. Docs need to be improved substantially, and we need a few more implementations of algorithms.
What does everyone think?
cc: @ifesdjeen @npbool @macmania @lazywei @marcoseravalli