Prashant-Jonny / accord

Automatically exported from code.google.com/p/accord

C4.5 Learning algorithm produces a partition with only one element and thus breaks the recursive call #35

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

1. Create a decision tree with 6 continuous variables and 1 discrete variable
2. Learn the tree
3. Exception at C45Learning.cs:274

The reason is that the maxGainPartition array contains only one value, but 
the algorithm expects two, which causes an IndexOutOfBounds exception.

It seems all partitions are assumed to be split in two but for some strange 
reason my data produces one partition with only one subset. 

What version of the product are you using? On what operating system?

2.8.1, Windows 7, 64-bit.

Attached is my training data in a flat format. First column is irrelevant, last 
column is output (yes/no). All except the penultimate column are continuous 
decision variables.

Original issue reported on code.google.com by anhek...@gmail.com on 11 Feb 2013 at 12:38

GoogleCodeExporter commented 8 years ago
Thanks. I will investigate! By the way, I couldn't find the data you attached. 
Can you please attach it again?

Regards,
Cesar

Original comment by cesarso...@gmail.com on 11 Feb 2013 at 1:48

GoogleCodeExporter commented 8 years ago
Here it is in a comment. I was wondering why it did not show up as well.

Original comment by anhek...@gmail.com on 11 Feb 2013 at 1:48

Attachments:

GoogleCodeExporter commented 8 years ago
Hi anhekalm!

The 'strange reason' is that one of the columns is constant. If you check the 
data, the column containing 0.323624595469256 has that same value in every 
row. That is why the algorithm can't partition this column. I'll add a more 
descriptive error message or handle this case better.
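
To illustrate why a constant column cannot be split, here is a minimal sketch in Python. It is not the Accord.NET code: the function names are hypothetical, and it assumes the common approach of taking candidate thresholds at the midpoints between consecutive distinct values. A constant column yields no candidates, and any threshold leaves every sample in a single subset.

```python
def candidate_thresholds(values):
    """Midpoints between consecutive distinct sorted values (assumed strategy)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def split(values, threshold):
    """Partition sample indices by threshold, keeping only non-empty subsets."""
    left = [i for i, v in enumerate(values) if v <= threshold]
    right = [i for i, v in enumerate(values) if v > threshold]
    return [part for part in (left, right) if part]


# The constant value comes from the thread; the varied column is made up.
constant = [0.323624595469256] * 4
varied = [1.0, 2.0, 2.0, 4.0]

print(candidate_thresholds(constant))     # no candidate thresholds
print(len(split(constant, constant[0])))  # a single subset
print(candidate_thresholds(varied))       # two candidate thresholds
```

Code that indexes the second subset unconditionally would fail on the constant column, which matches the exception described in the report.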

Please see if it works after either removing this column or introducing more 
variability in this variable.
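
As a rough sketch of the workaround, constant columns can be detected and dropped before training. The Python below is only an illustration with hypothetical helper names and made-up rows (apart from the constant value quoted above); it is not part of Accord.NET.

```python
def constant_columns(rows):
    """Return indices of columns whose value is identical in every row."""
    if not rows:
        return []
    width = len(rows[0])
    return [j for j in range(width) if len({row[j] for row in rows}) == 1]


def drop_columns(rows, indices):
    """Return a copy of rows without the given column indices."""
    skip = set(indices)
    return [[v for j, v in enumerate(row) if j not in skip] for row in rows]


# Made-up rows: column 0 is constant, the others vary.
rows = [
    [0.323624595469256, 1.0, 0],
    [0.323624595469256, 2.5, 1],
    [0.323624595469256, 4.0, 0],
]

constant = constant_columns(rows)
cleaned = drop_columns(rows, constant)
print(constant)
print(cleaned)
```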

Regards,
Cesar

Original comment by cesarso...@gmail.com on 11 Feb 2013 at 7:13

GoogleCodeExporter commented 8 years ago
Oh, bollocks! Well, thanks for pointing that out.

Original comment by anhek...@gmail.com on 11 Feb 2013 at 7:15

GoogleCodeExporter commented 8 years ago
This issue was closed by revision r477.

Original comment by cesarso...@gmail.com on 23 Feb 2013 at 4:49