Open A-Pai opened 3 years ago
@A-Pai most of what you’re asking is explained in the associated vignette, which you can find on the CRAN landing page here.
@A-Pai most of what you’re asking is explained in the associated vignette, which you can find on the CRAN landing page here.
Thank you for your guidance,but is it there? I can not find it,Or please give me a screenshot
@A-Pai Agreement and adjusted agreement are used in the (default) computation of variable importance in rpart. Take a look at section 3.4.
primary splits look at each of the listed variables and describes how well they do at improving the node purity. surrogate splits are for when there are missing values. The "agree" measure looks at how well the surrogate splits would give the same split as the first primary split that is listed, then uses that surrogate instead of the primary variable for the missing values. So in this example, height < 3.5 is used instead of weight < 5.5 for the 1 missing value. Look at a table with (height < 3.5) vs (weight < 5.5).
From: lg21c @.> Sent: Friday, October 15, 2021 9:59 PM To: bethatkinson/rpart @.> Cc: Subscribed @.***> Subject: [EXTERNAL] [bethatkinson/rpart] what is the Meaning of Surrogate Split? (#34)
tmp_df <- data.frame( Y = as.factor(c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)), weight = 10:1, height = c(10:7, 5, 6, 4:1) )
tmp_df$weight[3] <- NA tm <- rpart(Y ~ weight + height, data = tmp_df, control = rpart.control( minsplit = 1, minbucket = 1, cp = 0, maxdepth = 1 ) ) summary(tm)
[image]https://user-images.githubusercontent.com/5112397/137571069-05345d10-eb24-447a-9c16-6dd0c21c20ec.png
1、what is the meaning of “Primary splits” and “Surrogate splits”? 2、
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/bethatkinson/rpart/issues/34, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACWQG5YFOKBTMMFEOMED3EDUHDTCNANCNFSM5GDEFMOQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
primary splits look at each of the listed variables and describes how well they do at improving the node purity. surrogate splits are for when there are missing values. The "agree" measure looks at how well the surrogate splits would give the same split as the first primary split that is listed, then uses that surrogate instead of the primary variable for the missing values. So in this example, height < 3.5 is used instead of weight < 5.5 for the 1 missing value. Look at a table with (height < 3.5) vs (weight < 5.5). … ____ From: lg21c @.> Sent: Friday, October 15, 2021 9:59 PM To: bethatkinson/rpart @.> Cc: Subscribed @.***> Subject: [EXTERNAL] [bethatkinson/rpart] what is the Meaning of Surrogate Split? (#34) tmp_df <- data.frame( Y = as.factor(c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)), weight = 10:1, height = c(10:7, 5, 6, 4:1) ) tmp_df$weight[3] <- NA tm <- rpart(Y ~ weight + height, data = tmp_df, control = rpart.control( minsplit = 1, minbucket = 1, cp = 0, maxdepth = 1 ) ) summary(tm) [image]https://user-images.githubusercontent.com/5112397/137571069-05345d10-eb24-447a-9c16-6dd0c21c20ec.png 1、what is the meaning of “Primary splits” and “Surrogate splits”? 2、 — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub<#34>, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACWQG5YFOKBTMMFEOMED3EDUHDTCNANCNFSM5GDEFMOQ. Triage notifications on the go with GitHub Mobile for iOShttps://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Androidhttps://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
----what is the meaning of “to the left”?
the tree growed by the code:
tmp_df <- data.frame( Y = as.factor(c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)), weight = 10:1, height = c(10:7, 5, 6, 4:1) )
tmp_df$weight[3] <- NA tm <- rpart(Y ~ weight + height, data = tmp_df, control = rpart.control( minsplit = 1, minbucket = 1, cp = 0, maxdepth = 1 ) ) summary(tm)
1、what is the meaning of “Primary splits” and “Surrogate splits”? 2、Where does the “Surrogate splits height < 4.5” come from? 3、what is the meaning of “to the left”? 4、what is the meaning of “agree” and “adj”?