Closed igorkasyanchuk closed 7 years ago
Take a look at the intro blog post; it should cover the basics: https://www.igvita.com/2007/04/16/decision-tree-learning-in-ruby/
In terms of training, the more data the better. When you test your model, make sure to avoid overfitting; look into using cross-validation: https://en.wikipedia.org/wiki/Cross-validation_(statistics)
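As a rough illustration of the cross-validation idea, here is a minimal k-fold sketch in plain Ruby. It does not use the gem: `cross_validate` and `MAJORITY` are hypothetical names, and the majority-label baseline is only a stand-in for whatever model you actually train. The rows are the sample records from the question in this thread, with the status as the last element.

```ruby
# Sample rows from the question: [client, worker, technology, status].
ROWS = [
  ['Client A', 'John',  'Ruby', 'Won'],
  ['Client A', 'John',  'Ruby', 'Won'],
  ['Client B', 'Bob',   'Java', 'Won'],
  ['Client B', 'John',  'Ruby', 'Lost'],
  ['Client C', 'John',  'HTML', 'Lost'],
  ['Client C', 'Alice', 'HTML', 'Won'],
  ['Client C', 'Bob',   'HTML', 'Lost']
].freeze

# Hypothetical helper: shuffle the rows, split them into roughly k folds,
# train on all folds but one, and score on the held-out fold.
# `train_fn` receives training rows and must return a predictor lambda
# that maps a feature array to a label.
def cross_validate(rows, k, train_fn, seed: 42)
  shuffled = rows.shuffle(random: Random.new(seed))
  folds = shuffled.each_slice((shuffled.size / k.to_f).ceil).to_a

  accuracies = folds.each_index.map do |i|
    test  = folds[i]
    train = (folds[0...i] + folds[(i + 1)..-1]).flatten(1)
    predictor = train_fn.call(train)
    correct = test.count { |row| predictor.call(row[0...-1]) == row.last }
    correct.to_f / test.size
  end

  # Mean accuracy over the held-out folds.
  accuracies.sum / accuracies.size
end

# Stand-in model (an assumption, not the gem): always predict the most
# common label seen in the training rows.
MAJORITY = lambda do |train|
  label = train.map(&:last).group_by { |l| l }.max_by { |_, v| v.size }.first
  ->(_features) { label }
end

puts cross_validate(ROWS, 3, MAJORITY)
```

If the held-out accuracy is much lower than the accuracy on the training rows, that is a sign the model is overfitting. With only a handful of records the estimate will be very noisy.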
Hello, this is not an issue; it's more of a question about real-world usage. For example, I would like to use this gem in my project. My training data basically looks like this:
Header: Client, Worker, Technology, Status

Data with past Requests:
Client A, John, Ruby, Won
Client A, John, Ruby, Won
Client B, Bob, Java, Won
Client B, John, Ruby, Lost
Client C, John, HTML, Lost
Client C, Alice, HTML, Won
Client C, Bob, HTML, Lost
....
Now, when a new Request comes in, I want to advise who the best worker for it is. For example, if I assign "Bob" to a new Client, what is the chance that the status will be "Lost"?
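To make the question concrete, here is a naive frequency baseline in plain Ruby. It is not a decision tree and not the gem's API: `lost_rate` is a hypothetical helper that only counts how often a worker's past Requests ended as "Lost" in the sample rows above. A trained model would also have to take Client and Technology into account.

```ruby
# Sample rows from above: [client, worker, technology, status].
ROWS = [
  ['Client A', 'John',  'Ruby', 'Won'],
  ['Client A', 'John',  'Ruby', 'Won'],
  ['Client B', 'Bob',   'Java', 'Won'],
  ['Client B', 'John',  'Ruby', 'Lost'],
  ['Client C', 'John',  'HTML', 'Lost'],
  ['Client C', 'Alice', 'HTML', 'Won'],
  ['Client C', 'Bob',   'HTML', 'Lost']
].freeze

# Hypothetical helper: share of a worker's past Requests that were lost.
# Returns nil when the worker has no history at all.
def lost_rate(rows, worker)
  statuses = rows.select { |row| row[1] == worker }.map(&:last)
  return nil if statuses.empty?

  statuses.count('Lost').to_f / statuses.size
end

puts lost_rate(ROWS, 'Bob') # => 0.5
```

A baseline like this is mostly useful as a sanity check: a trained model should do at least as well as it on held-out Requests.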
I hope you got the idea.
How many records do we need to have? What should we do if a Request has many technologies? Duplicate rows (with the same client and status)?
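One possible way to handle a Request with many technologies, sketched in plain Ruby: instead of duplicating rows, give each technology its own yes/no column. This is only an assumption about how the data could be laid out. `expand_technologies` is a hypothetical helper, and the "Client D" Request is made up purely for illustration.

```ruby
# Hypothetical helper: turn a list of technologies per Request into one
# 'yes'/'no' column per known technology, keeping the status last.
def expand_technologies(requests, all_techs)
  requests.map do |r|
    flags = all_techs.map { |t| r[:technologies].include?(t) ? 'yes' : 'no' }
    [r[:client], r[:worker], *flags, r[:status]]
  end
end

# Column order for the technology flags.
TECHS = %w[Ruby Java HTML].freeze

# Made-up Request with two technologies (illustration only).
requests = [
  { client: 'Client D', worker: 'Bob', technologies: %w[Ruby HTML], status: 'Won' }
]

p expand_technologies(requests, TECHS)
```

The resulting header would then be Client, Worker, Ruby, Java, HTML, Status, with one row per Request.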
Thanks