kohleth closed this issue 6 years ago
I ran the above code and can reproduce the issue. I think the problem lies in how C5.0 uses the weights, not in the party conversion. For example, take the previous code:
> library(C50)
> library(partykit)
> m1=C5.0(Species~.,data=iris,weights=1:nrow(iris))
> summary(m1)
...
(weights) > 100: virginica (83.1)
(weights) <= 100:
:...Petal.Length <= 1.9: setosa (16.9)
    Petal.Length > 1.9: versicolor (50)
...
Notice that the weights are being used as a predictor. In fact, C5.0 itself returns an empty factor(0) prediction:
> predict(m1,iris[1,])
factor(0)
Levels: setosa versicolor virginica
Now, to see that the conversion itself appears to work, choose the weights so that they will not be picked up as a predictor:
> library(C50)
> library(partykit)
>
> m2=C5.0(Species~.,data=iris,weights=c(2,rep(1,(nrow(iris)-1))))
> summary(m2)
...
Petal.Length <= 1.9: setosa (50.7)
Petal.Length > 1.9:
:...Petal.Width > 1.7: virginica (45.7/1)
    Petal.Width <= 1.7:
    :...Petal.Length <= 4.9: versicolor (47.7/1)
        Petal.Length > 4.9: virginica (6/2)
...
> m2p=as.party(m2)
> predict(m2p,iris[1,])
1
setosa
Levels: setosa versicolor virginica
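To check more than a single row, the comparison can be extended to the whole data set. The following is only a minimal sketch, assuming C50 and partykit are installed and that the weights are not picked up as a predictor; the names `w`, `p_c50`, and `p_party` are illustrative and not from the original code.

```r
library(C50)
library(partykit)

# Same weights as above: the first observation counts double, the rest once.
w <- c(2, rep(1, nrow(iris) - 1))
m2 <- C5.0(Species ~ ., data = iris, weights = w)
m2p <- as.party(m2)

# Predictions from the original C5.0 model and from the converted party object.
p_c50 <- predict(m2, iris)
p_party <- predict(m2p, newdata = iris)

# The converted model should return a factor of classes, not numeric values.
print(is.factor(p_party))

# Cross-tabulate the two sets of predictions; off-diagonal counts would
# indicate rows where the conversion disagrees with C5.0.
print(table(C5.0 = as.character(p_c50), party = as.character(p_party)))
```

If the conversion is working, the first line printed should be TRUE and the table should have counts only on the diagonal.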
Hi,
I don't know if this has to do with the version of C50, but I am using the version from GitHub (0.1.0-25) and my fitted model does not use the weights as a predictor (see the fix for issue #6):
> m1=C5.0(Species~.,data=iris,weights=1:nrow(iris))
> summary(m1)
...
Petal.Length <= 4.7:
:...Petal.Length <= 1.9: setosa (16.9)
:   Petal.Length > 1.9: versicolor (45.6/1.4)
Petal.Length > 4.7:
:...Petal.Width > 1.7: virginica (75.8/0.9)
    Petal.Width <= 1.7:
    :...Petal.Length <= 4.9: versicolor (2.7)
        Petal.Length > 4.9: virginica (9/2.1)
Of course, this doesn't rule out what you are saying, namely that it has to do with how C50 handles weights.
Yes, this should be fixed in the github version. I'm having some issues with a CRAN release (arcane C issues) but it should be coming soon.
Yes, the issue reported by mvculp does not show up in the GitHub version, but the initial issue I reported is still there.
OK. So, my understanding is that there was an issue with how C5.0 handled the weights, and that it was fixed recently, but the fix has not yet reached the CRAN release. That fix in turn caused a downstream issue with the party conversion.
I installed the latest version from GitHub and reproduced the reported issue (specifically, the weights become the response in the new version). I updated as.party.C5.0 to fix what I believe is the cause.
Is this resolved? I just re-ran the code with the current GitHub version and don't see an issue.
Yes. I think so. I reran the code at the beginning and I get the correct answer.
When weights are used in the C5.0 model, and the model is then converted to a party object, the conversion does not seem to work.
You can see that in this case, the party object predicts a numeric value instead of a class.