accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET
http://accord-framework.net
GNU Lesser General Public License v2.1

Bug Report in Random Forest Algorithm #1002

Open ThomasIE opened 6 years ago

ThomasIE commented 6 years ago

Issue description

Dear Cesar Souza,

Hello, I am a Ph.D. student, and I really enjoy using your library in my machine learning research.

Recently, I ran into a problem when setting the NumberOfTrees parameter.

When I try to limit the number of trees in a random forest, the trained forest always contains as many trees as there are attributes, even though I set NumberOfTrees to 100.

(I have attached a screenshot of the bug.)

Is it a bug or misuse? Thanks in advance.

nkildegaard commented 6 years ago

I am having the exact same issue. I am running on a data set with 106 features and I am setting "NumberOfTrees" to 30. I end up having a trained forest with 106 trees.

nkildegaard commented 6 years ago

After looking at the source code, I was able to fix this by changing the two lines in RandomForestLearning that look like this:

    this.forest = new RandomForest(x[0].Length, this.attributes, y.Max() + 1);

into this:

    this.forest = new RandomForest(NumberOfTrees, this.attributes, y.Max() + 1);
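To illustrate why the first argument matters, here is a minimal, self-contained sketch. ToyForest and ToyLearner are hypothetical stand-ins, not Accord.NET types. The sketch assumes only that the first constructor argument sets the number of trees, and it leaves out the attribute list for brevity.

```csharp
using System.Linq;

// Hypothetical stand-in for a forest container: the first constructor
// argument is the number of trees, the second the number of classes.
public sealed class ToyForest
{
    public int[] Trees { get; }   // placeholders for real decision trees
    public int Classes { get; }

    public ToyForest(int trees, int classes)
    {
        Trees = new int[trees];
        Classes = classes;
    }
}

// Hypothetical stand-in for the learner, showing both variants of the call.
public static class ToyLearner
{
    // Buggy variant: passes the number of input columns (x[0].Length)
    // in the position where the tree count is expected.
    public static ToyForest LearnBuggy(double[][] x, int[] y, int numberOfTrees)
    {
        return new ToyForest(x[0].Length, y.Max() + 1);
    }

    // Fixed variant: passes the requested tree count instead.
    public static ToyForest LearnFixed(double[][] x, int[] y, int numberOfTrees)
    {
        return new ToyForest(numberOfTrees, y.Max() + 1);
    }
}
```

With 106 input columns and numberOfTrees set to 30, LearnBuggy produces 106 placeholder trees while LearnFixed produces 30, which matches the behavior reported above.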

ThomasIE commented 6 years ago

Could you explain the difference between the two lines? As far as I understand it, is the following line correct?

    var learner = new RandomForestLearning(NumberOfTrees, this.attributes, y.Max() + 1);

nkildegaard commented 6 years ago

You were doing everything correctly in your original post. The problem is a bug in Accord's implementation of the RandomForestLearning class, and it needs to be fixed in Accord's source code. Since I use Accord through NuGet rather than building from source, I worked around it in a different way. I added a new class to my project called "RandomForestLearningFixed" and copied everything from https://github.com/accord-net/framework/blob/development/Sources/Accord.MachineLearning/DecisionTrees/RandomForestLearning.cs into it. I then replaced the faulty lines as described in my previous comment, and I now use RandomForestLearningFixed in my code instead of RandomForestLearning.

cesarsouza commented 6 years ago

Hi @ThomasIE, @nkildegaard,

Sorry for not being able to reply earlier. The order of the arguments for the RandomForest class had changed, and that particular line didn't get updated. Thanks a lot for reporting the issue, and thanks @nkildegaard for explaining how to work around it. I will apply the fix in the next pre-release.

Thanks again, Cesar